A rhyming dictionary that works the way rhyme actually works: on sound. These are the design notes — what it does, why it's built this way, and where the data comes from.
Rhyme lives in pronunciation, and English spelling is a famously unreliable witness to it:
colonel rhymes with kernel, through with threw, and cough,
though, and tough agree on almost nothing. Letter-based rhyme tools inherit all
of that noise. Technical Rhymer instead searches over phonemes — the actual sounds —
written in ARPABET, the notation used by the CMU Pronouncing Dictionary:
orange is AO1 R AH0 N JH.
Digits on vowels mark stress: 1 primary, 2 secondary,
0 unstressed. A perfect rhyme is "same sounds from the last stressed vowel to the
end", which is exactly a suffix match on the phoneme string — that one observation is
the whole design. Look a word up, take the tail you care about, and search it.
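The suffix observation above can be sketched in a few lines. This is a minimal illustration in Python, not the site's own code: the function names are hypothetical, the treasure and measure pronunciations are typed by hand in CMUdict style, and treating secondary stress (2) as "stressed" is an assumption.

```python
def rhyme_tail(phonemes: str) -> list[str]:
    """Return the phonemes from the last stressed vowel to the end of the word."""
    parts = phonemes.split()
    # Walk backwards to the last vowel marked 1 (primary) or 2 (secondary).
    # Counting secondary stress as stressed is an assumption of this sketch.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i][-1] in "12":
            return parts[i:]
    return parts  # no stressed vowel at all: compare the whole word


def is_perfect_rhyme(a: str, b: str) -> bool:
    """True when both pronunciations share the same stressed tail."""
    return rhyme_tail(a) == rhyme_tail(b)


# Hand-typed pronunciations, for illustration only.
treasure = "T R EH1 ZH ER0"
measure = "M EH1 ZH ER0"
orange = "AO1 R AH0 N JH"

print(rhyme_tail(treasure))                 # ['EH1', 'ZH', 'ER0']
print(is_perfect_rhyme(treasure, measure))  # True
print(is_perfect_rhyme(treasure, orange))   # False
```

The tail of orange is the entire word, because its only stressed vowel is the first sound.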
Every entry is stored with its phoneme string, and a search scans all of them (a linear pass
over ~145k pronunciations takes tens of milliseconds, so no clever index is needed). The match
modes anchor the fragment: Ends with is the rhyme case, Starts with finds
alliterative twins, Anywhere and Exact do what they say. Two operators build
richer patterns: * requires parts in order with anything between
(AO R * JH), and | requires parts in any order. Listing a part
twice (B | B) demands it occur twice.
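One way to read the four match modes and the two operators is sketched below in Python. Every name here is hypothetical, and several behaviours are assumptions rather than the site's documented rules: stress digits are simply dropped, a query uses either `*` or `|` but not both, and the match mode is ignored once an operator is present.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    """Split on whitespace and drop stress digits (stress is ignored here)."""
    return [t.rstrip("012") for t in text.split()]


def occurrences(seq: list[str], frag: list[str], start: int = 0) -> list[int]:
    """Every index at or after `start` where `frag` occurs contiguously in `seq`."""
    n = len(frag)
    return [i for i in range(start, len(seq) - n + 1) if seq[i:i + n] == frag]


def match_fragment(seq: list[str], frag: list[str], mode: str) -> bool:
    if mode == "exact":
        return seq == frag
    if mode == "starts":
        return seq[:len(frag)] == frag
    if mode == "ends":
        return len(frag) <= len(seq) and seq[len(seq) - len(frag):] == frag
    if mode == "anywhere":
        return bool(occurrences(seq, frag))
    raise ValueError(f"unknown mode: {mode}")


def match_ordered(seq: list[str], parts: list[list[str]]) -> bool:
    """`*`: each part must occur after the previous one, with anything between."""
    pos = 0
    for part in parts:
        hits = occurrences(seq, part, pos)
        if not hits:
            return False
        pos = hits[0] + len(part)  # the leftmost hit leaves the most room for the rest
    return True


def match_unordered(seq: list[str], parts: list[list[str]]) -> bool:
    """`|`: parts in any order; a part listed n times must occur n times.

    Simplification: occurrences are counted per part and may overlap.
    """
    need = Counter(tuple(p) for p in parts)
    return all(len(occurrences(seq, list(p))) >= n for p, n in need.items())


def search(phonemes: str, query: str, mode: str = "ends") -> bool:
    seq = tokens(phonemes)
    if "*" in query:
        return match_ordered(seq, [tokens(p) for p in query.split("*")])
    if "|" in query:
        return match_unordered(seq, [tokens(p) for p in query.split("|")])
    return match_fragment(seq, tokens(query), mode)


orange = "AO1 R AH0 N JH"
print(search(orange, "AH N JH"))          # True  (Ends with)
print(search(orange, "AO R", "starts"))   # True
print(search(orange, "JH * AO R"))        # False (wrong order)
print(search(orange, "JH | AO R"))        # True  (any order)
print(search("B AA1 B", "B | B"))         # True  (B occurs twice)
```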
Stress is ignored by default: requiring stress digits to match eliminates most slant
rhymes people actually want, so exact-stress matching is the opt-in, not the default.
Double rhymes — words where the searched segment occurs twice or more, like
rat-a-tat for AE T — get split into their own section with a
×N badge, since a repeated rhyme is usually the better find.
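The stress default and the ×N count can be illustrated together. The Python sketch below uses hypothetical function names and a hand-typed pronunciation for rat-a-tat. It assumes that occurrences are counted without overlap anywhere in the word and that exact-stress mode compares the tokens literally, digits included.

```python
def split_tokens(text: str, exact_stress: bool) -> list[str]:
    """Tokenize, keeping stress digits only when exact-stress matching is on."""
    parts = text.split()
    return parts if exact_stress else [p.rstrip("012") for p in parts]


def count_segment(phonemes: str, segment: str, exact_stress: bool = False) -> int:
    """Count non-overlapping occurrences of `segment` in `phonemes`."""
    seq = split_tokens(phonemes, exact_stress)
    frag = split_tokens(segment, exact_stress)
    if not frag:
        return 0
    count, i = 0, 0
    while i + len(frag) <= len(seq):
        if seq[i:i + len(frag)] == frag:
            count += 1
            i += len(frag)  # skip past the match so occurrences never overlap
        else:
            i += 1
    return count


def split_doubles(entries: dict[str, str], segment: str):
    """Partition matches into (doubles, singles); doubles carry their ×N count."""
    doubles, singles = [], []
    for word, phonemes in entries.items():
        n = count_segment(phonemes, segment)
        if n >= 2:
            doubles.append((word, n))
        elif n == 1:
            singles.append(word)
    return doubles, singles


# Hand-typed pronunciations, for illustration only.
entries = {
    "rat-a-tat": "R AE1 T AH0 T AE1 T",
    "cat": "K AE1 T",
    "dog": "D AO1 G",
}
print(split_doubles(entries, "AE T"))  # ([('rat-a-tat', 2)], ['cat'])
```

With `exact_stress=True`, a query of `AE0 T` no longer matches cat, which shows how quickly strict stress matching shrinks the result set.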
You never have to type ARPABET cold: looked-up pronunciations drop into the search box with a click, results highlight exactly which sounds matched, and a tap-to-build phoneme keyboard under the search box shows every sound with an example word.
Fuzzy matching (opt-in) treats like-sounding consonants as interchangeable for slant
rhymes. The classes are grouped by manner of articulation and voicing, so
T↔K match but T↔D (a voicing
change) do not:
| Class | Phonemes |
| --- | --- |
| Voiceless stops | P T K |
| Voiced stops | B D G |
| Voiceless sibilants | S SH CH |
| Voiced sibilants | Z ZH JH |
| Voiceless fricatives | F TH |
| Voiced fricatives | V DH |
| Nasals | M N NG |
| Liquids | L R |
| Glides | W Y |
Vowels are never fuzzed — vowel identity is most of what makes a rhyme feel like one.
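The class table translates directly into a lookup map. The following Python sketch is illustrative only: the names are hypothetical, the matcher handles just the Ends with case, and stress digits are dropped. Any phoneme absent from the map, which includes every vowel, matches only itself.

```python
DEFAULT_CLASSES = [
    ["P", "T", "K"],     # voiceless stops
    ["B", "D", "G"],     # voiced stops
    ["S", "SH", "CH"],   # voiceless sibilants
    ["Z", "ZH", "JH"],   # voiced sibilants
    ["F", "TH"],         # voiceless fricatives
    ["V", "DH"],         # voiced fricatives
    ["M", "N", "NG"],    # nasals
    ["L", "R"],          # liquids
    ["W", "Y"],          # glides
]


def build_class_map(classes: list[list[str]]) -> dict[str, int]:
    """Map each phoneme to the index of the class that contains it."""
    return {ph: i for i, group in enumerate(classes) for ph in group}


def fuzzy_equal(a: str, b: str, class_map: dict[str, int]) -> bool:
    """Identical sounds always match; otherwise both must share a class."""
    if a == b:
        return True
    return a in class_map and class_map[a] == class_map.get(b)


def fuzzy_ends_with(phonemes: str, fragment: str, class_map: dict[str, int]) -> bool:
    """Ends-with match where consonants may be swapped within their class."""
    seq = [t.rstrip("012") for t in phonemes.split()]
    frag = [t.rstrip("012") for t in fragment.split()]
    if len(frag) > len(seq):
        return False
    tail = seq[len(seq) - len(frag):]
    return all(fuzzy_equal(a, b, class_map) for a, b in zip(tail, frag))


defaults = build_class_map(DEFAULT_CLASSES)
print(fuzzy_ends_with("K AE1 T", "AE P", defaults))  # True:  T and P are both voiceless stops
print(fuzzy_ends_with("K AE1 T", "AE D", defaults))  # False: T to D is a voicing change
print(fuzzy_ends_with("K AE1 T", "EH T", defaults))  # False: vowels are never fuzzed

# A custom grouping that merges voicings for the stops (hypothetical user setting).
merged = build_class_map([["P", "B"], ["T", "D"], ["K", "G"]])
print(fuzzy_ends_with("K AE1 T", "AE D", merged))    # True
```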
Those are the defaults, not dogma: the gear next to the Fuzzy toggle opens an editor where you can regroup the consonants however your ear likes — merge voicings, split the sibilants, set a sound loose so it only matches itself. Custom groups persist in your browser, and Reset to defaults brings back the table above.
A rhyme search for a short tail can return thousands of words, and most of them are words nobody would use. Results are therefore ranked by commonality, using the wordfreq project's Zipf scale, where ~7 is "the" and ~1 is deep obscurity. The thin bar under each result is that score. Usable rhymes surface first; the exotic tail is still there at the bottom. Sorting by syllable count or alphabetically is one click away.
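The three sort orders amount to three sort keys. In the Python sketch below the names are hypothetical and the scores are placeholders, not real wordfreq values. It assumes that a word with no score sorts last and that the syllable count equals the number of vowels, meaning the tokens that carry a stress digit.

```python
def syllable_count(phonemes: str) -> int:
    """Count vowels: in ARPABET only vowels end in a stress digit."""
    return sum(t[-1].isdigit() for t in phonemes.split())


def rank(results: dict[str, str], zipf: dict[str, float], by: str = "common") -> list[str]:
    """Order result words; `results` maps each word to its pronunciation."""
    if by == "common":
        # Highest score first; unknown words get 0.0 (an assumption); ties go alphabetical.
        return sorted(results, key=lambda w: (-zipf.get(w, 0.0), w))
    if by == "syllables":
        return sorted(results, key=lambda w: (syllable_count(results[w]), w))
    if by == "alpha":
        return sorted(results)
    raise ValueError(f"unknown sort: {by}")


# Hand-typed pronunciations and PLACEHOLDER scores, for illustration only.
results = {
    "treasure": "T R EH1 ZH ER0",
    "measure": "M EH1 ZH ER0",
    "leisure": "L EH1 ZH ER0",
}
placeholder_scores = {"measure": 3.0, "treasure": 2.0}

print(rank(results, placeholder_scores))           # ['measure', 'treasure', 'leisure']
print(rank(results, placeholder_scores, "alpha"))  # ['leisure', 'measure', 'treasure']
```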
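Rhyming is usually rhyming about something. The optional filter keeps only results
related to a word you name: searching EH ZH ER while writing about pirates
keeps treasure. It has two engines: an offline one built on the word-vector and WordNet
bundles, and an AI mode that sends the filter word and the candidate rhymes through a relay to Anthropic.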
Three layers, each honestly labeled in the UI:
T IY1 D IY1 EH1 S). The source
dataset ends in November 2023, so the newest of the new isn't here yet.
The site is static files — no database, no framework, no build step at runtime. All
dictionary data (~15 MB across six bundles) is baked into plain JavaScript files
loaded with <script> tags, which means search works offline and even when
opened directly from disk over file://. The heavy word-vector and WordNet
bundles are lazy-loaded only if the offline sense filter is actually used. The single
exception to "no server" is the word-sense relay described above — a fixed-function
endpoint that exists purely so the AI key never ships to your browser.
Data files are regenerated by small Python scripts in build/ (CMUdict parsing,
wordfreq scores, GloVe quantization, WordNet extraction, Urban Dictionary ranking and
grapheme-to-phoneme). The runtime never depends on them. Pinned result tabs and your
preferences (fuzzy matching, stress handling, sort, match mode, panels) persist in
localStorage only — no cookies — and stick across visits. Searches also sync
into the address bar (?q=AH+N+JH), so any query is shareable as a plain link;
opening someone else's link never overwrites your own preferences.
No accounts, no analytics, no cookies, no tracking. Nothing you type leaves your machine, with one explicit exception: when you use the word-sense filter's AI mode, the filter word and the candidate rhyme list (just words — never your searches or anything else) are sent to our relay and on to Anthropic to be judged. They aren't tied to any identity, and there's nothing else to send — the site has no accounts to associate them with.
georgiyozhegov/urbandictionary-raw dataset.