
How to Convert Speech to Text Online for Free (No App, No Upload)

By Rui Barreira · Last updated: 13 June 2026

You can convert speech to text for free with brevio Speech to Text: click Start Recording, speak, and watch your words appear live in your browser. There is no app to install, no audio file to upload, and no API key. It works in Chrome, Edge, and Safari.

The Web Speech API lets a web page use whatever speech recognition engine the browser provides, so brevio does not need recognition servers of its own. Which engine you get depends on the browser rather than the operating system: Chrome sends audio to Google's speech service on every platform, Edge may route it through Microsoft's speech service, and Safari on macOS and iOS uses Apple's Siri dictation engine. The practical implication: audio never passes through brevio's servers and brevio stores no recording, but in Chrome and Edge the audio does leave your device for recognition (see the privacy section below).

How the Web Speech API Works

Through the Web Speech API's SpeechRecognition interface, the browser streams audio from the device microphone to the speech recognition engine. The engine returns two kinds of result: interim (isFinal: false) and final (isFinal: true). Interim results show the engine's current best guess in real time and may still change. Final results are confirmed transcriptions that are appended to the growing transcript.
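The interim/final split described above can be sketched as a small reducer over `result` events. This is a minimal illustration, not brevio's actual code: `applyResultEvent`, `TranscriptState`, and the `*Like` types are hypothetical names, declared locally so the sketch does not depend on DOM typings. Only `resultIndex`, `results`, `isFinal`, and `transcript` come from the Web Speech API itself.

```typescript
// Minimal structural types covering only the fields this sketch reads.
interface AlternativeLike { transcript: string }
interface ResultLike extends ArrayLike<AlternativeLike> { isFinal: boolean }
interface ResultEventLike {
  resultIndex: number;              // lowest index that changed in this event
  results: ArrayLike<ResultLike>;   // all results of the current session
}

// Hypothetical state shape: confirmed text plus the current live guess.
interface TranscriptState { finalText: string; interimText: string }

// Fold one `result` event into the running transcript.
function applyResultEvent(state: TranscriptState, event: ResultEventLike): TranscriptState {
  let finalText = state.finalText;
  let interimText = "";             // interim text is rebuilt on every event
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const best = result[0].transcript;   // first alternative is the top guess
    if (result.isFinal) {
      finalText = (finalText + " " + best.trim()).trim();
    } else {
      interimText += best;
    }
  }
  return { finalText, interimText: interimText.trim() };
}
```

In a page, this would typically be called from the recogniser's `onresult` handler, rendering `finalText` as settled text and `interimText` in a lighter style.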

With continuous: true and interimResults: true, the transcript grows naturally as you speak. Pauses between phrases trigger final results. The recogniser will automatically stop if you pause too long (typically 5–10 seconds) — clicking Start Recording again resumes without clearing the existing transcript.
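A sketch of the configuration and resume behaviour described above, under the assumption that the transcript is kept outside the recogniser so a fresh `start()` after an automatic stop does not clear it. `DictationSession` and `RecognizerLike` are hypothetical names; `continuous`, `interimResults`, `onend`, and `start()` are the real SpeechRecognition members being modelled.

```typescript
// Structural type for the recogniser members this sketch touches.
interface RecognizerLike {
  continuous: boolean;
  interimResults: boolean;
  onend: (() => void) | null;
  start(): void;
}

// Hypothetical wrapper: owns the transcript and tracks whether recognition is running.
class DictationSession {
  transcript = "";
  listening = false;
  private recognizer: RecognizerLike;

  constructor(recognizer: RecognizerLike) {
    this.recognizer = recognizer;
    recognizer.continuous = true;       // keep listening across phrases
    recognizer.interimResults = true;   // deliver live partial guesses
    // Fires when the engine stops, including after a long silence.
    recognizer.onend = () => { this.listening = false; };
  }

  // Wired to the Start Recording button; safe to click repeatedly.
  start(): void {
    if (this.listening) return;         // start() throws if already running
    this.listening = true;
    this.recognizer.start();
  }

  // Append a confirmed phrase; earlier text is never discarded.
  addFinal(text: string): void {
    this.transcript = (this.transcript + " " + text.trim()).trim();
  }
}
```

Because the transcript lives in the session object rather than in the recogniser, each restart simply continues appending to it.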

Browser Support

Browser	Web Speech API	Notes
Chrome (desktop)	Full support	Uses Google speech API on all platforms
Chrome (Android)	Full support	Uses Google speech API
Edge	Full support	May route audio through Microsoft's speech service
Safari (macOS)	Supported (webkit prefix)	Uses Siri/macOS speech engine
Safari (iOS)	Supported	Uses Siri; requires user permission
Firefox	Partial / off by default	Flag-gated in recent versions; not reliable
Samsung Internet	Limited support	Varies by version
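Given the differences in the table above, a page would normally feature-detect before showing a record button. The sketch below checks for both the standard and the webkit-prefixed constructor; `findSpeechRecognition` is a hypothetical helper name, and the `scope` parameter exists only so the lookup can be exercised outside a browser.

```typescript
type RecognizerConstructor = new () => unknown;

// Look up the recogniser constructor on a window-like object.
// Defaults to globalThis, which is `window` in a browser.
function findSpeechRecognition(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): RecognizerConstructor | null {
  // Prefer the standard name, then fall back to the webkit-prefixed one.
  const candidate = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof candidate === "function" ? (candidate as RecognizerConstructor) : null;
}

// Usage sketch: hide the record button when the API is missing.
const Recognizer = findSpeechRecognition();
const speechSupported = Recognizer !== null;
```

A `null` result is the cue to show a "not supported in this browser" message instead of failing when the button is clicked.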

Accuracy Expectations

Accuracy depends heavily on microphone quality, background noise, accent, and language. As a rough guide, under ideal conditions (quiet room, good microphone, standard accent), English recognition accuracy is typically 95–98%, and non-native accents can reduce it to around 85–92%. Non-English languages vary significantly: Spanish, French, and Japanese generally perform well, while Mandarin Chinese is more sensitive to clear tone pronunciation.
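The article does not say how these percentages are measured. A common convention, assumed here, is word accuracy = 1 - WER, where the word error rate (WER) is the word-level edit distance between what was said and what was transcribed, divided by the number of words said. The sketch below computes it; `tokenize` and `wordErrorRate` are illustrative helper names.

```typescript
// Lowercase, strip basic punctuation, and split on whitespace.
function tokenize(text: string): string[] {
  return text.toLowerCase().replace(/[.,!?;:]/g, "").split(/\s+/).filter(Boolean);
}

// Word error rate: (substitutions + deletions + insertions) / reference length,
// computed with a standard Levenshtein distance over words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // prev[j] = distance between the first i-1 reference words and first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // reference word missing from the transcript
        curr[j - 1] + 1,    // extra word in the transcript
        prev[j - 1] + cost  // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

Under this definition, one wrong word in a twenty-word passage gives a WER of 0.05, which corresponds to 95% word accuracy.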

The recogniser handles continuous speech better than isolated words. Speaking in complete sentences with a natural cadence produces better results than speaking slowly, word by word. Punctuation is generally not inserted automatically, so expect to add commas and periods to the transcript by hand.

Language Support

brevio Speech to Text supports 8 languages: English (US), English (UK), Spanish, French, German, Portuguese, Chinese (Mandarin, simplified), and Japanese. The underlying Web Speech API supports many more languages — these 8 were selected for broad coverage and reliable engine support across Chrome, Edge, and Safari.
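With the Web Speech API, the recognition language is selected by setting the recogniser's `lang` property to a BCP 47 tag. The mapping below is a sketch for the eight languages listed above. The region subtags (for example `es-ES` rather than `es-MX`, or `pt-PT` rather than `pt-BR`) are assumptions, since the article does not state which variants brevio uses, and `LANGUAGE_TAGS` and `applyLanguage` are hypothetical names.

```typescript
// Display name -> BCP 47 tag. Region subtags are assumed, not confirmed.
const LANGUAGE_TAGS = {
  "English (US)": "en-US",
  "English (UK)": "en-GB",
  "Spanish": "es-ES",
  "French": "fr-FR",
  "German": "de-DE",
  "Portuguese": "pt-PT",
  "Chinese (Mandarin, simplified)": "zh-CN",
  "Japanese": "ja-JP",
} as const;

type LanguageName = keyof typeof LANGUAGE_TAGS;

// Set the recognition language; takes effect on the next start() call.
function applyLanguage(recognizer: { lang: string }, name: LanguageName): string {
  recognizer.lang = LANGUAGE_TAGS[name];
  return recognizer.lang;
}
```

Typing the key as `LanguageName` means a misspelled language name is caught at compile time rather than silently falling back to the browser default.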

Privacy: What Happens to Your Audio

This is the most common question for voice tools. With the Web Speech API on Chrome, audio is streamed to Google's servers for speech recognition — it is not processed locally. The audio stream is not stored or used for training according to Google's API terms. On Edge, audio may go through Microsoft Azure Speech. On Safari, audio is processed by the macOS/iOS Siri engine, which has different privacy characteristics.

For fully local speech recognition (no audio leaves the device), alternatives include: OpenAI Whisper (self-hosted via whisper.cpp or whisperd), the new Chrome AI API (when available), or native OS dictation features. brevio uses the Web Speech API because it's the only option that doesn't require server infrastructure on our end — but users with strict audio privacy requirements should use a local Whisper setup instead.

Comparison: Speech-to-Text Tools

Tool	Upload?	Account?	Languages	Cost
brevio Speech to Text	No file upload (browser engine via Web Speech API)	No	8	Free
Otter.ai	Yes — server upload	Yes (required)	English only	Free (600 min/mo), from $16.99/mo
Google Docs Voice Typing	No (Web Speech API)	Yes (Google account)	90+	Free with Google account
Whisper (self-hosted)	No — local GPU/CPU	No	99+	Free, open source

Frequently Asked Questions

Why does the recording stop automatically?

The Web Speech API stops recognition after a silence threshold (typically 5–10 seconds without speech). This is a browser-imposed limit on the API. Click Start Recording again to resume — your existing transcript is preserved. For continuous long recordings, you may need to click Start Recording several times.

Does it add punctuation automatically?

Usually not. The Web Speech API generally returns words without punctuation. Some browser engines add basic punctuation for certain languages, but this is not guaranteed, so plan to add commas, periods, and paragraph breaks to the transcript yourself.

Can I export the transcript to a file?

Copy the transcript using the Copy button and paste it into any text editor, document, or email. To keep it as a file, paste it into Notepad or TextEdit and save it from there; browsers have no built-in way to save clipboard text straight to a .txt file.

My microphone is not being detected — what should I do?

Check that you granted microphone permission when the browser prompted you. If you denied it, open your browser's site settings (in Chrome: Settings → Site Settings → Microphone) and reset the permission for this site. On macOS, open System Settings → Privacy & Security → Microphone (System Preferences → Security & Privacy → Privacy → Microphone on macOS 12 and earlier) and make sure your browser is listed and enabled.
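On the page side, microphone problems surface through the recogniser's `error` event, whose `error` field carries a short code defined by the Web Speech API (such as `not-allowed` or `audio-capture`). The sketch below maps those codes to the guidance above. `explainRecognitionError` is a hypothetical helper, and the message wording is illustrative.

```typescript
// Translate a SpeechRecognition error code into user-facing guidance.
function explainRecognitionError(code: string): string {
  switch (code) {
    case "not-allowed":
    case "service-not-allowed":
      return "Microphone access was blocked. Reset this site's microphone permission in your browser settings and try again.";
    case "audio-capture":
      return "No microphone was found. Check that one is connected and enabled in your operating system settings.";
    case "no-speech":
      return "No speech was detected. Check the selected input device and click Start Recording again.";
    case "network":
      return "The speech service could not be reached. Check your internet connection.";
    case "language-not-supported":
      return "The selected language is not supported by this browser's speech engine.";
    default:
      return "Speech recognition stopped unexpectedly (" + code + ").";
  }
}
```

A page could wire this up with something like `recognition.onerror = (event) => showMessage(explainRecognitionError(event.error))`, where `showMessage` stands in for whatever the page uses to display notices.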
