Search Your Japanese Brain in Romaji
You store lessons in Japanese. You search in whatever flows from your fingers — sometimes Japanese, sometimes romaji. With cachly v0.5.37, both work.
The Problem
A developer on a Japanese team stores lessons in Japanese every day:
learn_from_attempts:
topic: "deploy:k8s"
whatWorked: "readinessProbeのfailureThresholdを10に増やすことで解決"Three days later, they try to recall it. They type fast, without switching input method:
smart_recall("readiness probe failure threshold") → ✅ found
smart_recall("depuroi") → ❌ nothing (before v0.5.37)
smart_recall("デプロイ") → ✅ foundThe lesson is there. But if you search in romaji — the Latin-alphabet transliteration of Japanese used when typing without a Japanese input method — it was invisible.
What We Built
Katakana → romaji tokenization at index time.
When cachly indexes a document containing katakana, it now also generates the Hepburn romaji equivalent as an additional search token.
"コンテナのヘルスチェックで問題発生"
↓ bigrams: コン, ンテ, テナ, ...
↓ romaji tokens: "kontena", "herusuche" (from ヘルス+チェック)Now a search for kontena matches the stored katakana content.
The Converter
We implemented a full Hepburn romanization converter — 70+ mapping rules — handling:
| Katakana | Romaji | Notes |
|---|---|---|
| コンテナ | kontena | Basic consonant-vowel |
| デプロイ | depuroi | Voiced consonants |
| チェック | chekku | Digraph チェ + geminate ッ |
| ファイル | fairu | Loanword combination ファ |
| サーバー | saba | Long vowel mark ー stripped |
| シャットダウン | shattodaun | Digraph + geminate |
Real Examples
# Stored lesson (Japanese)
"コンテナのヘルスチェックで127.0.0.1を使用する必要がある"
# These all find it now:
smart_recall("kontena") → ✅ (romaji for コンテナ)
smart_recall("container") → ✅ (cross-lingual synonym)
smart_recall("コンテナ") → ✅ (direct katakana match)
smart_recall("healthcheck 127") → ✅ (mixed language, Latin terms match directly)# Stored lesson (katakana deployment note)
"デプロイメントが失敗、ポート3000を開放することで解決"
smart_recall("depuroimento") → ✅ (full romaji)
smart_recall("depuroi") → ✅ (partial romaji, fuzzy match)
smart_recall("port 3000") → ✅ (Latin terms match directly)
smart_recall("deploy") → ✅ (cross-lingual: deploy↔デプロイ)Why Romaji Matters
Japanese developers switch input modes constantly. When typing fast in a terminal, in a commit message, or in a search query — romaji often comes out first. Making romaji search work means zero friction between "I need to find that lesson" and "found it."
This is especially true for katakana loanwords — the vast majority of technical terms in Japanese are written in katakana and have direct romaji equivalents.
Upgrade
npx @cachly-dev/mcp-server@latest autopilotcachly is a persistent AI Brain for developers — memory shared across Claude Code, Cursor, GitHub Copilot & Windsurf simultaneously. Auto-detects every editor. Bootstraps from your git history. 115 MCP tools. Free tier, EU servers, no credit card.
Your AI is forgetting everything right now.
Every session starts blank. Every bug re-discovered. Every deploy procedure re-explained. cachly fixes that in 30 seconds — your AI remembers every lesson, every fix, every teammate's hard-won knowledge. Forever.