How to Build a Google Meet Bot from Scratch

Google does not provide a generally available public API that lets a bot join a meeting as a participant, capture audio, or pull a live transcript.

That means the only way to build a Google Meet bot yourself is to automate a real browser. The bot opens Google Meet just like a human would, clicks through the join flow, enables captions, and reads what appears on screen.

This guide walks you through building that bot step by step using Node.js and Puppeteer. 

By the end you will have a working bot that joins a Google Meet, enables live captions, and saves a transcript to a file.

Quick Recap

  1. Create a dedicated Google account for the bot
  2. Set up a Node.js project and install Puppeteer
  3. Automate Google login using page.type() and page.click(), or load saved cookies
  4. Navigate to the Meet URL and click the Join button using text-based button matching
  5. Enable captions through the More Options menu
  6. Poll the captions DOM container every 1.5 seconds and write new lines to a file
  7. Save session cookies to avoid repeated login flows in production

What You Need Before You Start

  • Node.js version 18 or above installed on your machine
  • A dedicated Google account for the bot (do not use your personal account)
  • A Google Meet link to test with
  • Basic familiarity with JavaScript

Use a fresh Google account created specifically for this bot. Running repeated automated logins on a personal account risks triggering Google's bot detection and getting the account flagged.
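The Node.js requirement in the list above can be checked at startup so the bot fails with a clear message instead of an obscure syntax error. The following is a minimal sketch; `assertNodeVersion` is a hypothetical helper name, not part of Node.js or Puppeteer.

```javascript
// Minimal sketch: fail fast when the Node.js runtime is older than required.
// `assertNodeVersion` is a hypothetical helper, not a library function.
function assertNodeVersion(versionString, minMajor = 18) {
  // process.versions.node looks like "18.19.0"; the first field is the major version.
  const major = Number.parseInt(String(versionString).split('.')[0], 10);
  if (Number.isNaN(major) || major < minMajor) {
    throw new Error(`Node.js ${minMajor}+ is required, found ${versionString}`);
  }
  return major;
}
```

Calling `assertNodeVersion(process.versions.node)` as the first line of bot.js would be one way to use it.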

How to Build a Google Meet Bot

Step 1: Set Up Your Project

Create a new folder for the project and initialise it.

bash

mkdir meet-bot
cd meet-bot
npm init -y

Install Puppeteer. This downloads the library along with a compatible browser build (Chrome for Testing in recent Puppeteer releases, Chromium in older ones).

bash

npm install puppeteer

Create your main file.

bash

touch bot.js

Step 2: Launch a Browser and Log Into Google

The bot needs to be signed into a Google account before it can join a Meet. Puppeteer launches a real Chromium browser window and automates the login flow.

javascript

const puppeteer = require('puppeteer');

const GOOGLE_EMAIL    = 'your-bot-account@gmail.com';
const GOOGLE_PASSWORD = 'your-bot-password';
const MEET_URL        = 'https://meet.google.com/abc-defg-hij';

async function loginToGoogle(page) {
  await page.goto('https://accounts.google.com/signin', {
    waitUntil: 'networkidle2'
  });

  // Enter email
  await page.waitForSelector('input[type="email"]');
  await page.type('input[type="email"]', GOOGLE_EMAIL, { delay: 50 });
  await page.click('#identifierNext');

  // Enter password
  await page.waitForSelector('input[type="password"]', { visible: true });
  await page.type('input[type="password"]', GOOGLE_PASSWORD, { delay: 50 });

  // Start waiting for the navigation before clicking so a fast redirect is not missed
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click('#passwordNext')
  ]);
  console.log('Logged in successfully.');
}

(async () => {
  const browser = await puppeteer.launch({
    headless: false,   // Set to true once everything works
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--use-fake-ui-for-media-stream',   // Auto-grant mic/camera permissions
      '--use-fake-device-for-media-stream'
    ]
  });

  const page = await browser.newPage();
  await loginToGoogle(page);
})();

Run it once with headless: false so you can watch what happens and debug any issues with the login flow. Google sometimes adds extra verification steps for new accounts or unfamiliar login locations. Handle those manually the first time, then automate once the session is stable.
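Hardcoding the bot's password in bot.js makes it easy to commit by accident. One option is to read the credentials from environment variables. The sketch below assumes two variable names, `MEET_BOT_EMAIL` and `MEET_BOT_PASSWORD`, and a helper called `loadCredentials`. All three are hypothetical and can be renamed freely.

```javascript
// Sketch: read the bot credentials from the environment instead of the source file.
// MEET_BOT_EMAIL, MEET_BOT_PASSWORD and loadCredentials are hypothetical names.
function loadCredentials(env = process.env) {
  const email = env.MEET_BOT_EMAIL;
  const password = env.MEET_BOT_PASSWORD;

  // Collect every missing variable so the error names all of them at once.
  const missing = [];
  if (!email) missing.push('MEET_BOT_EMAIL');
  if (!password) missing.push('MEET_BOT_PASSWORD');
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return { email, password };
}
```

The returned `email` and `password` could then stand in for the `GOOGLE_EMAIL` and `GOOGLE_PASSWORD` constants, with the bot started as `MEET_BOT_EMAIL=... MEET_BOT_PASSWORD=... node bot.js`.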

Step 3: Navigate to the Meeting and Join

After login, go to the Meet URL and click through the pre-join screen.

javascript

async function joinMeeting(page, meetUrl) {
  await page.goto(meetUrl, { waitUntil: 'networkidle2' });

  // Dismiss any cookie or notification popups if they appear
  try {
    await page.waitForSelector('button[data-mdc-dialog-action="accept"]',
      { timeout: 3000 });
    await page.click('button[data-mdc-dialog-action="accept"]');
  } catch (e) {
    // No popup, continue
  }

  // Turn off microphone on the pre-join screen
  try {
    const micButton = await page.$('[data-is-muted="false"][data-tooltip*="microphone"]');
    if (micButton) await micButton.click();
  } catch (e) {
    // Mic toggle not found, continue
  }

  // Click the Join button
  // Google Meet uses different button text: "Join now", "Ask to join", "Join"
  await page.waitForSelector('button[data-promo-anchor-id="join-button"], ' +
    'button[jsname="Qx7uuf"], button[data-call-pane-focus-key]',
    { timeout: 15000 });

  const buttons = await page.$$('button');
  let clicked = false;
  for (const button of buttons) {
    const text = await page.evaluate(el => el.innerText, button);
    if (text.includes('Join') || text.includes('Ask to join')) {
      await button.click();
      clicked = true;
      break;
    }
  }

  // Fail loudly instead of reporting success when no button matched
  if (!clicked) throw new Error('Join button not found.');

  console.log('Joined the meeting.');
  // page.waitForTimeout() was removed in recent Puppeteer releases, so use a plain timer
  await new Promise(resolve => setTimeout(resolve, 3000));
}

A quick note on selectors: Google updates the Meet UI regularly. The selectors above work as of early 2025 but may change. If the bot fails to click the join button, open DevTools in a real Chrome window on the same page and inspect the element to find the current selector.
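Text matching with `includes('Join')` is forgiving, which also means it can match an unrelated button whose label merely contains the word, such as a hypothetical "Join and use a phone for audio". A stricter variant compares the whole label. The sketch below is one way to do that; the label list comes from the comment in `joinMeeting`, while the regular expression and the name `findJoinLabelIndex` are assumptions.

```javascript
// Sketch of stricter text matching for the join button.
// The accepted labels mirror the comment in joinMeeting; the pattern and
// the helper name findJoinLabelIndex are assumptions, not Meet internals.
const JOIN_LABEL = /^(join now|ask to join|join)$/i;

function findJoinLabelIndex(labels) {
  // Normalise whitespace because innerText can contain newlines or padding.
  return labels.findIndex(label =>
    JOIN_LABEL.test(String(label).replace(/\s+/g, ' ').trim()));
}
```

Inside `joinMeeting`, the button texts could be collected into an array first, and the button at the returned index clicked when the result is not -1.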

Step 4: Enable Live Captions

Captions are how the bot reads what people say. Enable them through the Meet toolbar.

javascript

async function enableCaptions(page) {
  // Click the "More options" button (three dots) in the bottom toolbar
  await page.waitForSelector('[data-tooltip="More options"]', { timeout: 10000 });
  await page.click('[data-tooltip="More options"]');

  // Give the menu a moment to render
  // (page.waitForTimeout() was removed in recent Puppeteer releases)
  await new Promise(resolve => setTimeout(resolve, 1000));

  // Click "Turn on captions" from the menu
  const menuItems = await page.$$('[role="menuitem"]');
  for (const item of menuItems) {
    const text = await page.evaluate(el => el.innerText, item);
    if (text.toLowerCase().includes('caption')) {
      await item.click();
      console.log('Captions enabled.');
      return;
    }
  }

  console.log('Could not find captions option in menu.');
}
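If the menu layout changes and the captions item cannot be found, a keyboard shortcut is a possible fallback. The sketch below assumes that pressing `c` toggles captions in the Meet UI, which is worth verifying in your own session before relying on it. `toggleCaptionsWithShortcut` is a hypothetical helper name.

```javascript
// Sketch only: toggleCaptionsWithShortcut is a hypothetical helper.
// Assumption: Meet's "c" keyboard shortcut toggles captions and the page,
// not a text field such as the chat box, currently has keyboard focus.
async function toggleCaptionsWithShortcut(page) {
  // page.keyboard.press() dispatches a keydown followed by a keyup for the key.
  await page.keyboard.press('c');
}
```

Because the shortcut is a toggle, calling it while captions are already on would turn them off, so it fits best as a fallback when the menu approach reports that no captions option was found.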

Step 5: Scrape Captions and Save the Transcript

Once captions are on, they appear inside a specific DOM container. The bot watches that container and collects the text as it updates.

javascript

const fs = require('fs');

async function captureTranscript(page, durationMs = 60000) {
  const transcriptLines = [];
  const outputPath = 'transcript.txt';

  console.log(`Capturing transcript for ${durationMs / 1000} seconds...`);

  const intervalId = setInterval(async () => {
    try {
      // The captions container selector -- inspect live if this breaks
      const captionText = await page.evaluate(() => {
        const container = document.querySelector('[jsname="tgaKEf"]');
        return container ? container.innerText.trim() : '';
      });

      if (captionText && captionText !== transcriptLines[transcriptLines.length - 1]) {
        transcriptLines.push(captionText);
        fs.appendFileSync(outputPath, captionText + '\n');
        console.log('Captured:', captionText);
      }
    } catch (e) {
      // Page may be navigating or captions not visible yet
    }
  }, 1500);  // Poll every 1.5 seconds

  // Stop capturing after the specified duration
  await new Promise(resolve => setTimeout(resolve, durationMs));
  clearInterval(intervalId);

  console.log(`Transcript saved to ${outputPath}`);
  return transcriptLines;
}

The caption container selector [jsname="tgaKEf"] is the one used in Meet's DOM currently. The jsname attribute tends to be more stable than class names, but always verify it by inspecting the live page if something breaks after a Meet update.
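Polling compares each snapshot only with the previous line, so a caption that grows word by word is likely to be stored several times as overlapping partial lines. One way to reduce that is to replace the last line whenever the new snapshot simply extends it. The sketch below is a hypothetical helper, `mergeCaption`, and it assumes each snapshot either extends the previous text or starts a new utterance. Real caption updates may also rewrite earlier words, which this does not handle.

```javascript
// Sketch: merge polled caption snapshots so a growing caption is stored once.
// mergeCaption is a hypothetical helper; it mutates and returns `lines`.
function mergeCaption(lines, snapshot) {
  const text = String(snapshot).trim();
  if (!text) return lines;                       // nothing on screen yet

  const last = lines[lines.length - 1];
  if (last === text) return lines;               // unchanged since the last poll

  if (last !== undefined && text.startsWith(last)) {
    lines[lines.length - 1] = text;              // caption grew: replace in place
  } else {
    lines.push(text);                            // new utterance
  }
  return lines;
}
```

Since earlier lines can be replaced, this approach pairs better with writing the whole array once at the end, for example with `fs.writeFileSync(outputPath, lines.join('\n'))`, than with appending on every poll.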

Step 6: Put It All Together

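Wire everything into a single run function. This block replaces the constants and the self-invoking function from Step 2. Keep loginToGoogle, joinMeeting, enableCaptions, and captureTranscript in bot.js, and make sure each module is required only once, since declaring the same const twice is a syntax error.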

javascript

const puppeteer = require('puppeteer');
const fs        = require('fs');

const GOOGLE_EMAIL    = 'your-bot-account@gmail.com';
const GOOGLE_PASSWORD = 'your-bot-password';
const MEET_URL        = 'https://meet.google.com/abc-defg-hij';
const RECORD_DURATION = 60 * 1000;  // 60 seconds

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--use-fake-ui-for-media-stream',
      '--use-fake-device-for-media-stream'
    ]
  });

  const page = await browser.newPage();

  await loginToGoogle(page);
  await joinMeeting(page, MEET_URL);
  await enableCaptions(page);
  await captureTranscript(page, RECORD_DURATION);

  await browser.close();
  console.log('Bot finished.');
})();

Run the bot with:

bash

node bot.js

You should see the browser open, log into Google, navigate to the Meet, join the call, enable captions, and start printing captured text to your terminal while writing it to transcript.txt.
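If any step in that sequence throws, the script exits the async function without reaching `browser.close()`, which can leave a browser process running. A `try`/`finally` wrapper is one way to guard against that. In the sketch below, `withBrowser` is a hypothetical helper and `launch` is any function that resolves to an object with a `close()` method, such as `() => puppeteer.launch(options)`.

```javascript
// Sketch: always close the browser, whether the task succeeds or throws.
// withBrowser is a hypothetical wrapper; `launch` is assumed to resolve to
// an object exposing close(), e.g. () => puppeteer.launch(options).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();   // runs on success and on error
  }
}
```

The body of the run function could then move into the `task` callback, which receives the browser and calls `browser.newPage()` followed by the four steps.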

Things That Break and How to Handle Them

Google blocks the login. This happens when Google detects an automated login pattern. Use a real session cookie instead of typing credentials each time. Save cookies after the first manual login and reload them on subsequent runs.

javascript

// Save cookies after login
const cookies = await page.cookies();
fs.writeFileSync('cookies.json', JSON.stringify(cookies));

// Load cookies on next run instead of logging in again
const savedCookies = JSON.parse(fs.readFileSync('cookies.json'));
await page.setCookie(...savedCookies);
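On the first run there is no cookies.json yet, and on later runs some saved cookies may have expired. The sketch below shows one way to handle both cases. `loadSavedCookies` is a hypothetical helper, and it assumes the cookie objects carry an `expires` field in seconds since the Unix epoch, with -1 marking session cookies, which is how Puppeteer reports them as far as I am aware.

```javascript
const fs = require('fs');

// Sketch: return saved cookies that have not expired, or null when the file
// is missing so the caller can fall back to a fresh login.
// loadSavedCookies is a hypothetical helper name.
function loadSavedCookies(path, nowSeconds = Date.now() / 1000) {
  if (!fs.existsSync(path)) return null;

  const cookies = JSON.parse(fs.readFileSync(path, 'utf8'));

  // Assumption: `expires` is in seconds since the Unix epoch and session
  // cookies use -1 (or omit the field); those are kept as-is.
  return cookies.filter(cookie =>
    cookie.expires === undefined ||
    cookie.expires === -1 ||
    cookie.expires > nowSeconds);
}
```

A caller could run the login flow when the result is `null` or empty, and otherwise pass the array to `page.setCookie(...cookies)` before opening the Meet URL.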

The join button selector stops working. Google updates Meet's frontend regularly. When this happens, open a Chrome window, go to the Meet pre-join screen, right-click the Join button, and click Inspect. Find the current attribute or text you can target.
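To soften the impact of a single selector breaking, the bot can try a list of candidate selectors in order and use the first one that matches. The following is a minimal sketch; `findFirst` is a hypothetical helper, and it relies only on `page.$()`, which resolves to `null` when nothing matches.

```javascript
// Sketch: try candidate selectors in order and return the first match.
// findFirst is a hypothetical helper; page.$() resolves to null on no match.
async function findFirst(page, selectors) {
  for (const selector of selectors) {
    const handle = await page.$(selector);
    if (handle) return { selector, handle };
  }
  return null;   // none of the candidates matched
}
```

For the join button, the candidate list could start with the selectors already used in `joinMeeting`, such as `button[data-promo-anchor-id="join-button"]` and `button[jsname="Qx7uuf"]`, with newly discovered ones added to the front as the UI changes. Logging the `selector` that matched shows when an older one has stopped working.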

Captions drop words. Google's live captions are revised on screen as speech is recognised, and long or fast phrases can be shortened or missed, so text scraped from the DOM is not a verbatim record. If you need high-accuracy transcription, route the audio to a proper ASR engine like Whisper or Deepgram instead of relying on DOM captions.

Conclusion

This bot gives you a working proof of concept. It joins a meeting, reads captions, and saves a transcript. That is enough for internal tools, lightweight summarisation pipelines, and meeting note automation.

What it cannot do easily: it does not capture raw audio, it does not reliably produce speaker-attributed transcripts, and it needs maintenance whenever Google updates the Meet UI. If any of those things matter for your use case, you will spend significant engineering time maintaining selectors and handling edge cases.

For teams who want all of that out of the box without the maintenance overhead, Meetstream.ai is worth a look. It handles bot deployment, real-time transcription with speaker labels, and recording storage through a clean API, so you can focus on building what you actually want rather than keeping up with Google Meet's frontend changes.
