If you want a bot that joins a Zoom meeting and captures what people say, the first thing you will learn is that Zoom does not offer a simple API for it.
Zoom does have a Meeting SDK, but bots built on it require a feature review that can take four to six weeks before the bot can join external meetings.
The SDK also does not give you the transcript directly. You would need to capture raw audio and route it to a separate transcription service, which adds significant complexity.
The faster and more practical path is browser automation. You script a real Chromium browser to navigate to the Zoom web client, join the meeting as a participant, enable closed captions, and scrape the caption text from the page.
Zoom's closed captions use Zoom's own transcription engine and are completely free. This guide walks through building exactly that, using Node.js and Playwright.
Quick Recap
- Extract the meeting ID and password from the Zoom URL
- Build the web client join URL using zoom.us/wc/{meetingID}/join?pwd={password}
- Launch Chromium with fake media device flags to satisfy Zoom's media checks
- Fill in the display name on the pre-join screen and click Join
- Enable captions through the toolbar using the Live Transcript button
- Poll the captions container every two seconds and write new lines to a file with deduplication
- Wrap each bot in a Docker container for isolated concurrent runs
- Handle waiting rooms, selector changes, and caption accuracy as ongoing maintenance tasks
What You Need Before Building a Zoom Meeting Bot
- Node.js version 18 or above
- A Zoom meeting link you can test with
- Basic familiarity with JavaScript and async/await
- Docker installed (optional but recommended for running multiple bots cleanly)
How to Build a Zoom Meeting Bot
Step 1: Set Up Your Project
Create a project folder and initialise it.
```bash
mkdir zoom-bot
cd zoom-bot
npm init -y
```
Install Playwright and download the Chromium browser it will control.
```bash
npm install playwright
npx playwright install chromium
```
Create your main bot file and a simple Express server file.
```bash
touch bot.js server.js
npm install express
```
Step 2: Extract the Meeting ID and Password from the Zoom URL
Before the bot can navigate to the Zoom web client, it needs the meeting ID and the meeting password separately. A standard Zoom meeting link looks like this:
https://zoom.us/j/91234567890?pwd=abc123XYZ
You extract both values from that URL.
```javascript
function parseMeetingUrl(zoomUrl) {
  const url = new URL(zoomUrl);
  const meetingId = url.pathname.split('/j/')[1];
  const password = url.searchParams.get('pwd') || '';

  if (!meetingId) {
    throw new Error('Could not parse meeting ID from URL.');
  }

  return { meetingId, password };
}
```

Once you have the meeting ID and password, build the web client join URL like this:

```javascript
function buildWebClientUrl(meetingId, password) {
  return `https://zoom.us/wc/${meetingId}/join?pwd=${password}`;
}
```
This URL takes the bot straight to the browser-based join flow without triggering the "Open in Zoom app" dialog.
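Real meeting links often come from a regional subdomain such as us02web.zoom.us, or carry a trailing slash after the ID, which the split-based parser above would pass through unchanged. A slightly more defensive variant might look like the sketch below. The names parseZoomLink and buildJoinUrl are hypothetical, and reusing the link's own hostname for the web client is an assumption you should verify against your own meetings.

```javascript
// Hypothetical, more defensive variant of parseMeetingUrl.
// Assumption: the meeting ID is the run of digits right after "/j/".
function parseZoomLink(zoomUrl) {
  const url = new URL(zoomUrl);
  const match = url.pathname.match(/\/j\/(\d+)/);
  if (!match) {
    throw new Error(`Could not parse meeting ID from URL: ${zoomUrl}`);
  }
  return {
    host: url.hostname,
    meetingId: match[1],
    password: url.searchParams.get('pwd') || ''
  };
}

// Builds the web client URL and lets the URL API handle encoding.
// Assumption: the web client is served from the same host as the link.
function buildJoinUrl({ host, meetingId, password }) {
  const joinUrl = new URL(`https://${host}/wc/${meetingId}/join`);
  if (password) {
    joinUrl.searchParams.set('pwd', password);
  }
  return joinUrl.toString();
}
```

Calling buildJoinUrl(parseZoomLink(link)) drops the pwd parameter entirely when the link has no password, instead of producing an empty ?pwd= query.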
Step 3: Launch the Browser and Navigate to the Meeting
Start a Chromium browser with flags that fake microphone and camera access. Without these, Zoom will block entry because it cannot detect media devices.
```javascript
const { chromium } = require('playwright');

async function launchBot(zoomUrl) {
  const { meetingId, password } = parseMeetingUrl(zoomUrl);
  const joinUrl = buildWebClientUrl(meetingId, password);

  const browser = await chromium.launch({
    headless: true, // Set to false while debugging
    args: [
      '--use-fake-ui-for-media-stream',
      '--use-fake-device-for-media-stream',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-gpu'
    ]
  });

  const context = await browser.newContext({
    permissions: ['microphone', 'camera']
  });

  const page = await context.newPage();

  await page.goto(joinUrl, { waitUntil: 'networkidle' });
  console.log(`Navigated to: ${joinUrl}`);

  return { browser, page };
}
```
Step 4: Enter the Display Name and Join
Zoom's web client shows a pre-join screen where a name is required. The bot fills this in and clicks the Join button.
```javascript
async function joinMeeting(page, displayName = 'Meeting Bot') {
  // Wait for the name input field
  await page.waitForSelector('input[placeholder="Your name"]', { timeout: 15000 });

  // Clear any pre-filled text and type the bot's display name
  await page.fill('input[placeholder="Your name"]', displayName);

  // Mute microphone before joining so the bot does not transmit audio
  try {
    const micBtn = page.locator('button[aria-label*="Mute"]').first();
    const isMuted = await micBtn.getAttribute('aria-pressed');
    if (isMuted === 'false') {
      await micBtn.click();
    }
  } catch (e) {
    // Mic button not found or already muted; continue
  }

  // Click the Join button
  await page.click('button.preview-join-button');

  console.log('Join button clicked. Waiting to enter the meeting...');

  // Wait for the in-meeting participant list or toolbar to confirm entry
  await page.waitForSelector('.meeting-info-container, .footer-button-base__button', { timeout: 30000 });

  console.log('Bot has entered the meeting.');
}
```
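If the meeting has a waiting room enabled, the bot is held there until the host clicks Admit. Keep in mind that joinMeeting waits only 30 seconds for the in-meeting UI, so if the host takes longer than that, the final waitForSelector call throws a timeout error. Raise that timeout for meetings that use a waiting room.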
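One way to make the waiting room case explicit is to separate "not admitted in time" from genuine failures. The helper below is a minimal sketch: waitForAdmission is a hypothetical name, the ten-minute default is an arbitrary assumption, and the selectors are simply the ones already used in joinMeeting.

```javascript
// Hypothetical helper: resolves to true once the in-meeting UI appears,
// or false if the host has not admitted the bot within timeoutMs.
async function waitForAdmission(page, timeoutMs = 10 * 60 * 1000) {
  const inMeeting = '.meeting-info-container, .footer-button-base__button';
  try {
    await page.waitForSelector(inMeeting, { timeout: timeoutMs });
    return true;
  } catch (err) {
    // Playwright reports an expired wait as an error named TimeoutError.
    if (err && err.name === 'TimeoutError') {
      return false;
    }
    // Anything else (closed page, crashed browser) is a real failure.
    throw err;
  }
}
```

joinMeeting could call this in place of its last waitForSelector and close the browser cleanly when it returns false, rather than letting the timeout surface as an unhandled error.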
Step 5: Enable Closed Captions
Zoom includes free closed captions powered by its own transcription models. The bot enables them through the meeting toolbar.
```javascript
async function enableCaptions(page) {
  // Look for the "Live Transcript" or "Captions" button in the toolbar
  // Zoom uses different button labels depending on the account settings
  try {
    await page.click('button[aria-label="Live Transcript"]', { timeout: 8000 });
  } catch (e) {
    // Try the "CC" button as a fallback
    await page.click('button[aria-label*="CC"], button[aria-label*="Caption"]', { timeout: 8000 });
  }

  await page.waitForTimeout(1000);

  // Click "Enable Auto-Transcription" from the sub-menu that appears
  try {
    await page.click('button:has-text("Enable Auto-Transcription")', { timeout: 5000 });
    console.log('Auto-transcription enabled.');
  } catch (e) {
    // Captions may already be active or button text differs
    console.log('Captions may already be active.');
  }

  // Wait for the caption panel to appear at the bottom of the meeting
  await page.waitForSelector('.captions-box', { timeout: 10000 });
}
```
One important note: the host's Zoom account must have closed captions enabled in account settings. If the button does not appear, the account admin may need to turn on Live Transcription in the Zoom web portal under Account Settings.
Step 6: Scrape and Save the Transcript
Poll the captions container every two seconds. New text is written to a file and deduplicated so you do not save the same line twice.
```javascript
const fs = require('fs');

async function captureTranscript(page, outputFile = 'transcript.txt', durationMs = 3600000) {
  const seen = new Set();
  const startTime = Date.now();

  console.log('Capturing transcript...');

  while (Date.now() - startTime < durationMs) {
    try {
      // Grab the text of every caption span currently on screen
      const captions = await page.$$eval(
        '.captions-box span, .caption-line span',
        spans => spans
          .map(s => s.innerText.trim())
          .filter(t => t.length > 2)
      );

      for (const line of captions) {
        if (!seen.has(line)) {
          seen.add(line);
          const entry = `[${new Date().toISOString()}] ${line}`;
          console.log(entry);
          fs.appendFileSync(outputFile, entry + '\n');
        }
      }
    } catch (e) {
      // The page may be transitioning; continue polling
    }

    await page.waitForTimeout(2000);
  }

  console.log(`Transcript saved to ${outputFile}`);
}
```
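Exact-match deduplication has a blind spot: if the caption element grows as the speaker talks, each partial version is a new string and gets saved as its own line. Whether Zoom's web client renders captions that way is worth checking in DevTools. Assuming it does, a post-processing pass along these lines could fold partial lines into their final form. collapsePartials is a hypothetical helper, not part of the bot above, and it expects caption text with the timestamp prefix already stripped.

```javascript
// Hypothetical post-processing step: collapse a sequence of captured
// caption strings so that a line which merely extends the previous one
// replaces it instead of being stored twice.
function collapsePartials(lines) {
  const result = [];
  for (const line of lines) {
    const last = result[result.length - 1];
    if (last !== undefined && line.startsWith(last)) {
      // The new line extends (or equals) the previous one: keep the longer.
      result[result.length - 1] = line;
    } else if (last !== undefined && last.startsWith(line)) {
      // The new line is a shorter prefix of what is already stored: skip it.
      continue;
    } else {
      result.push(line);
    }
  }
  return result;
}
```

The trade-off is that two genuinely separate utterances are merged when the second happens to begin with the full text of the first, so this suits cleanup of a finished transcript better than live filtering.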
Step 7: Wire It All Together
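```javascript
// bot.js: these requires go once at the top of the file,
// above the functions defined in Steps 2 to 6.
const { chromium } = require('playwright');
const fs = require('fs');

const ZOOM_URL = 'https://zoom.us/j/91234567890?pwd=yourpassword';
const BOT_NAME = 'Meeting Bot';
const OUTPUT_FILE = 'transcript.txt';

(async () => {
  const { browser, page } = await launchBot(ZOOM_URL);

  try {
    await joinMeeting(page, BOT_NAME);
    await enableCaptions(page);
    await captureTranscript(page, OUTPUT_FILE);
  } finally {
    // Close the browser even if one of the steps above throws
    await browser.close();
  }

  console.log('Bot session complete.');
})().catch(err => {
  console.error('Bot failed:', err);
  process.exit(1);
});
```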
Run it from the project folder with `node bot.js`.
Running Multiple Bots with Docker
If you need one bot per meeting, wrap each bot in its own Docker container so they run in isolation without sharing browser state or file handles. Create a simple Dockerfile:
```dockerfile
FROM mcr.microsoft.com/playwright:v1.44.0-jammy

WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .

CMD ["node", "bot.js"]
```
Your Express server can then spin up a new container for each meeting URL it receives, and each container writes its transcript to a volume mounted on the host.
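The container launch itself can be sketched with Node's built-in child_process module. Everything specific here is an assumption: the image tag zoom-bot (built with docker build -t zoom-bot .), the helper names, and a bot.js that reads ZOOM_URL and OUTPUT_FILE from the environment instead of the hard-coded constants shown in Step 7.

```javascript
const { spawn } = require('node:child_process');
const path = require('node:path');

// Hypothetical helper: builds the argument list for `docker run`.
// Assumes bot.js reads ZOOM_URL and OUTPUT_FILE from process.env.
function buildDockerArgs(meetingUrl, botId, options = {}) {
  const image = options.image || 'zoom-bot';
  // One host folder per bot, mounted into the container at /data
  const hostDir = path.resolve(options.outputDir || 'transcripts', botId);
  return [
    'run', '--rm', '--detach',
    '--name', `zoom-bot-${botId}`,
    '--env', `ZOOM_URL=${meetingUrl}`,
    '--env', 'OUTPUT_FILE=/data/transcript.txt',
    '--volume', `${hostDir}:/data`,
    image
  ];
}

// Starts one isolated container for one meeting.
// Passing an argument array to spawn avoids shell interpolation of the URL.
function startBotContainer(meetingUrl, botId, options) {
  return spawn('docker', buildDockerArgs(meetingUrl, botId, options), {
    stdio: 'inherit'
  });
}
```

An Express route handler in server.js could call startBotContainer with the meeting URL from the request body and a generated ID, then respond with that ID so the caller knows which transcript folder to read.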
Common Problems and How to Fix Them
The Join button selector fails. Zoom updates its web client regularly. If button.preview-join-button stops working, open Zoom in a regular Chrome window, inspect the join button, and find the current class or aria-label. Role-based selectors like page.getByRole('button', { name: 'Join' }) are more stable than class names.
Captions do not appear. The most common reason is that the Zoom account hosting the meeting has not enabled Live Transcription in settings. The host needs to turn this on at zoom.us under Account Settings before the bot can use it.
The bot lands in the waiting room and never enters. Either admit the bot manually from the host controls, or create test meetings with the waiting room turned off by setting "Security" options when scheduling.
Caption text is incomplete. Zoom's caption engine paraphrases and sometimes drops words during fast speech. For verbatim accuracy, consider routing the meeting audio to a dedicated ASR service like Whisper or Deepgram instead of relying on DOM-scraped captions.
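Since selector breakage is the most frequent of the problems above, it can help to keep an ordered list of candidate selectors per button and try each in turn. The helper below is a minimal sketch with a hypothetical name, clickFirstAvailable. It relies only on page.click with a timeout option, which the bot already uses.

```javascript
// Hypothetical helper: tries each selector in order and returns the one
// that was clicked. Throws if none of them could be clicked in time.
async function clickFirstAvailable(page, selectors, timeoutMs = 5000) {
  for (const selector of selectors) {
    try {
      await page.click(selector, { timeout: timeoutMs });
      return selector;
    } catch (err) {
      // This candidate was not clickable within the timeout; try the next.
    }
  }
  throw new Error(`None of the selectors matched: ${selectors.join(' | ')}`);
}

// Example call inside joinMeeting (the second selector is a guess to verify):
// await clickFirstAvailable(page, [
//   'button.preview-join-button',
//   'button:has-text("Join")'
// ]);
```

Logging the returned selector shows when the bot has started relying on a fallback, which is an early signal that the primary selector needs updating.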
Conclusion
This bot works well as a prototype and for internal tooling. When you move to production and need to run dozens of bots concurrently, the workload shifts from writing code to managing infrastructure: scheduling containers, monitoring crashed bots, handling retries, and chasing Zoom web client updates that break your selectors.
If you want all of that handled for you, Meetstream.ai provides a clean API that sends bots to Zoom meetings, returns real-time speaker-attributed transcripts, and stores recordings without you maintaining any browser infrastructure yourself.