Free on-device neural read-aloud, then understand what you hear — summaries, translation, and karaoke on your own API keys, with a live cost estimate before every paid action.
Rongo never resells synthesis or takes a cut. Drop in a key and you're billed by the provider at their rate — or use the built-in offline engine for nothing.
No copy-paste, no separate app, no dashboard. Rongo lives in the margin of whatever you're reading and only shows up when you ask.
Select any text and a Play widget fades in by your cursor. Or just hover a paragraph — Rongo outlines the reading scope and offers to read it, no selection needed.
Before a single character is sent, you see a live ~$ estimate and the raw character count. Over your cap, the badge turns terracotta and asks before it spends.
The widget becomes a compact player — play, pause, scrub, stop. On Pro, words light up in karaoke as they're spoken, right on the page you're reading.
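The estimate-and-ask step can be pictured as simple arithmetic on the character count. The sketch below is illustrative only, not Rongo's actual code: the `SpendSettings` fields, the per-1,000-character rate, and the high-cost threshold are all hypothetical names and placeholder numbers, and real providers may count or price characters differently.

```typescript
// Illustrative sketch only. Every rate and threshold is a hypothetical
// placeholder, not real provider pricing and not Rongo's implementation.
type Gate = "play" | "confirm";

interface SpendSettings {
  usdPer1kChars: number;        // assumed provider rate, USD per 1,000 characters
  highCostUsd: number;          // assumed threshold for the high-cost confirmation
  monthlyCapUsd: number | null; // optional monthly cap; null means no cap
  spentThisMonthUsd: number;    // running estimate of this month's spend
}

interface CostEstimate {
  chars: number; // raw character count that would be sent
  usd: number;   // estimated cost of this one action
  gate: Gate;    // "confirm" means ask the user before spending
}

function estimateCost(text: string, s: SpendSettings): CostEstimate {
  // text.length counts UTF-16 code units; a provider may count differently.
  const chars = text.length;
  const usd = (chars / 1000) * s.usdPer1kChars;
  const overCap =
    s.monthlyCapUsd !== null && s.spentThisMonthUsd + usd > s.monthlyCapUsd;
  const gate: Gate = usd >= s.highCostUsd || overCap ? "confirm" : "play";
  return { chars, usd, gate };
}

function formatBadge(e: CostEstimate): string {
  // Shows both the rounded estimate and the raw character count.
  return "~$" + e.usd.toFixed(4) + " (" + e.chars + " chars)";
}
```

Keeping the estimate a pure function of the text and the settings means it can be recomputed on every selection change without touching the network.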
Select text on any page and hear it instantly in a premium voice — or right-click for a quick native read.
Hover any block of text and a play icon appears with the reading scope outlined — zero selection required.
Swap between ElevenLabs, OpenAI, Google Gemini and the offline engine — with per-provider voice, speed and pitch.
A live ~$ estimate before playback, a high-cost confirmation gate, and an optional monthly spend cap.
Words on the page light up in sync as they're spoken, using ElevenLabs character alignment.
A second action: an English comprehension summary and an optional Polish translation, each playable on demand.
Kokoro + Piper run entirely on your machine — no API key, no cloud round-trip, natural speech for free.
Audio, summaries and translations are cached locally and merged into one History row — export cached audio or comprehension as Markdown/JSON.
API keys live in chrome.storage.local — never synced, never proxied, never seen by us.
A running, lifetime estimated total per provider — and for ElevenLabs, the authoritative characters-used and next invoice.
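For the curious, karaoke highlighting from character-level alignment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Rongo's implementation: the `CharAlignment` shape (parallel arrays of characters with start and end times in seconds) and both function names are hypothetical, and the real provider response may use different field names.

```typescript
// Hypothetical alignment shape: one entry per character, times in seconds.
interface CharAlignment {
  characters: string[];
  startTimes: number[];
  endTimes: number[];
}

interface WordTiming {
  word: string;
  start: number;     // start time of the word's first character
  end: number;       // end time of the word's last character
  charIndex: number; // index of the word's first character in the text
}

// Group consecutive non-whitespace characters into timed words.
function toWordTimings(a: CharAlignment): WordTiming[] {
  const words: WordTiming[] = [];
  let current: WordTiming | null = null;
  for (let i = 0; i < a.characters.length; i++) {
    const ch = a.characters[i];
    if (/\s/.test(ch)) {
      // Whitespace closes the word in progress, if any.
      if (current !== null) {
        words.push(current);
        current = null;
      }
      continue;
    }
    if (current === null) {
      current = { word: ch, start: a.startTimes[i], end: a.endTimes[i], charIndex: i };
    } else {
      current.word += ch;
      current.end = a.endTimes[i];
    }
  }
  if (current !== null) words.push(current);
  return words;
}

// Index of the word being spoken at time t, or -1 between words.
function activeWordIndex(words: WordTiming[], t: number): number {
  for (let i = 0; i < words.length; i++) {
    if (t >= words[i].start && t < words[i].end) return i;
  }
  return -1;
}
```

A player would call `activeWordIndex` with the audio element's current time on each animation frame and move the highlight to the matching word; a linear scan is fine for paragraph-sized text.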
Rongo is built for people who want to hear a page, actually understand it, and know what they're spending — not another subscription reader.
| | Rongo | Speechify | Natural Reader | Read Aloud |
|---|---|---|---|---|
| Price model | Free read-aloud + one-time Pro ($15 launch); premium voices BYOK at provider rate | Typically monthly subscription for premium voices | Typically subscription or freemium cloud plans | Free (browser Web Speech API) |
| Cost visibility | Live ~$ estimate before every paid action; optional monthly cap | Subscription pricing; per-use cost opaque | Subscription pricing; per-use cost opaque | N/A — always free |
| Data path | Direct browser → your provider; local-neural stays on-device; no Rongo servers | Typically cloud/proxy synthesis | Typically cloud synthesis | On-device (browser voices) |
| Summarise & translate | Pro: comprehension summaries, in-place translation, vocabulary export | Limited / premium-tier features vary by plan | Some products include text tools; not page-scoped History | Read-aloud only |
| Audio export | Export cached audio from History (already synthesized — no re-read) | Often limited on free tiers; varies by plan | Download/export is a core product strength | No export |
Competitor rows describe typical product positioning as of 2026 — check each vendor's current plans. Rongo does not offer PDF/OCR scanning; it reads and understands text you select on the web.
Rongo has no backend. Requests go straight from your browser to the provider you chose. There is no account, no telemetry, and nothing to breach — because there's nowhere for your data to sit.
We never see your text or your keys. Synthesis is a direct call you can verify in the network tab.
The provider charges your account directly at their published rate. Rongo takes nothing per character, ever.
The native Web Speech engine needs no key and no connection — and it never costs a cent.
No subscription. One license, yours for good. (You still pay your own providers for premium synthesis — Rongo just unlocks the software.)
Everything you need to listen to the web, with your own keys or the offline engine.
A single license — validated locally, no call home — that unlocks the comprehension layer for good.
Add Rongo to Chrome, drop in a key, highlight a sentence. You'll hear the difference in about three seconds.