Free on-device neural read-aloud — your audio never leaves your machine. Translate any page in place and hear it in your language. Premium AI voices on your own keys, at cost. Pro is a one-time $15 — never a monthly plan.
Rongo never resells synthesis or takes a cut. Drop in a key and you're billed by the provider at their rate — or use the built-in offline engine for nothing.
No copy-paste, no separate app, no dashboard. Rongo lives in the margin of whatever you're reading and only shows up when you ask.
Select any text and a Play widget fades in by your cursor. Or just hover a paragraph — Rongo outlines the reading scope and offers to read it, no selection needed.
Before a single character is sent, you see a live ~$ estimate and the raw character count. Over your cap, the badge turns terracotta and asks before it spends.
The widget becomes a compact player — play, pause, scrub, stop. On Pro, words light up in karaoke as they're spoken, right on the page you're reading.
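The cost check in the second step can be pictured roughly as follows. This is a minimal sketch, not Rongo's actual code: the function names, the per-1,000-character rate and the cap handling are all assumptions made for illustration, and real providers may count characters and price them differently.

```typescript
// Hypothetical sketch: none of these names or rates come from Rongo itself.
type CostCheck = {
  characters: number;   // raw character count shown next to the estimate
  estimateUsd: number;  // approximate cost of this one playback
  overCap: boolean;     // true when playback would exceed the monthly cap
};

// Estimate the cost of synthesizing `text` before anything is sent.
// `usdPerThousandChars` is an assumed provider rate supplied by the caller.
// `monthlyCapUsd` is optional: pass null when no cap is set.
function estimateCost(
  text: string,
  usdPerThousandChars: number,
  spentThisMonthUsd: number,
  monthlyCapUsd: number | null,
): CostCheck {
  // String length counts UTF-16 code units; a provider may count differently.
  const characters = text.length;
  const estimateUsd = (characters / 1000) * usdPerThousandChars;
  const overCap =
    monthlyCapUsd !== null && spentThisMonthUsd + estimateUsd > monthlyCapUsd;
  return { characters, estimateUsd, overCap };
}

// Render the badge text: an approximate dollar figure plus the character count.
function formatBadge(check: CostCheck): string {
  return `~$${check.estimateUsd.toFixed(4)} · ${check.characters} chars`;
}

// Example: 2,000 characters at an assumed $0.30 per 1,000, with $4.50 already
// spent against a $5.00 cap. A caller would ask for confirmation when
// `overCap` is true instead of starting playback.
const check = estimateCost("a".repeat(2000), 0.3, 4.5, 5.0);
const badge = formatBadge(check);
```

Keeping the estimate a pure function of the text, the rate and the running total means it can run on every selection change without touching the network.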
Select text on any page and hear it instantly in a premium voice — or right-click for a quick native read.
Hover any block of text and a play icon appears with the reading scope outlined — zero selection required.
Swap between ElevenLabs, Google Gemini and the free on-device engine — with per-provider voice, speed and pitch.
A live ~$ estimate before playback, a high-cost confirmation gate, and an optional monthly spend cap.
Words on the page light up in sync as they're spoken — on every engine, including keyless on-device voices via local Whisper timing.
Highlight a foreign paragraph and the translation swaps in place on the page — formatting kept — then reads aloud with karaoke. Translate a whole page block by block; toggle Original | Translation or restore it all.
A four-part comprehension summary of any selection, written directly in your language on foreign-language pages. Vocabulary is captured and exportable.
Kokoro + Piper run entirely on your machine — no API key, no cloud round-trip, natural speech for free.
Audio, summaries and translations are cached locally and merged into one History row — export cached audio or comprehension as Markdown/JSON.
API keys live in chrome.storage.local — never synced, never proxied, never seen by us.
A running lifetime cost estimate per provider, plus ElevenLabs' authoritative characters-used count and next invoice.
Read-aloud tools can't translate the page. Translation tools can't speak. Rongo does both, runs its free voices on your device, and charges once instead of every month.
| | Rongo | Speechify | Immersive Translate | Natural Reader |
|---|---|---|---|---|
| Price | $15 once (Pro) · read-aloud free forever · premium voices BYOK at cost | ~$139 / year | Free tier + ~$8.33 / mo Pro | Subscription / freemium |
| Reads aloud | Yes — free on-device neural voices | Yes (premium voices paid) | No | Yes |
| Translate in place | Yes — format-preserving, multi-block, read aloud | No | Yes | No |
| Audio leaves your device | Never on the free tier — voices & AI run on-device; no Rongo servers | Yes — cloud synthesis | Cloud translation | Typically cloud synthesis |
| Cost visibility | Live ~$ estimate before every paid action; optional monthly cap | Subscription; per-use cost opaque | Subscription / quota | Subscription; per-use cost opaque |
| Account required | No account, ever | Yes | Optional | Yes |
Competitor prices and features describe typical positioning as of 2026 — check each vendor's current plans. Rongo does not offer PDF/OCR scanning; it reads and understands text you select on the web.
Your audio never leaves your device. Rongo's free voices are generated right in your browser, and its AI summaries and translation can run fully offline. There is no backend, no account, no telemetry — nothing to breach, because there's nowhere for your data to sit.
Free neural voices (Kokoro + Piper) and on-device AI (Gemma) generate speech, summaries and translation locally — no cloud round-trip, no key.
Use a premium cloud voice and the call goes straight from your browser to your provider — we never see your text or your keys. Verify it in the network tab.
The provider charges your account directly at their published rate. Rongo takes nothing per character, and never sells a subscription.
No subscription. One license, yours for good. (You still pay your own providers for premium synthesis — Rongo just unlocks the software.)
Everything you need to listen to the web, with your own keys or the offline engine.
A single license — validated locally, no call home — that unlocks the comprehension layer for good.
Add Rongo to Chrome and highlight a sentence — free, no key, no account. You'll hear the difference in about three seconds.