Store in Japanese, Recall in English — Cross-Language AI Memory
Your AI Brain just got multilingual retrieval. Lessons stored in Japanese, Chinese, Korean, Arabic, or Hebrew are automatically findable in English — and vice versa. No embeddings required.
The Use Case
International engineering teams often have a language problem. The Japanese team stores lessons in Japanese. The Korean team writes in Korean. The shared AI memory is siloed — not because the Brain doesn't know these languages, but because keyword search can't bridge language boundaries. Until now.
How It Works — Without Embeddings
Semantic search with embeddings can find cross-lingual matches, but it requires an OpenAI API key, costs money per lookup, and adds latency. Most Brain users are on the free tier — no API key, no embeddings.
Our approach: a curated technical term synonym map, built directly into the tokenizer. When any document is indexed or any query is tokenized, every recognized technical term gets expanded to its equivalents in all supported languages:
Token: "deploy"
Expanded to: デプロイ (JA), 部署 (ZH), 배포 (KO), نشر (AR), פריסה (HE)
+ bigrams of all CJK variants added to the token stream
Token: デプロイ
Expanded to: "deploy", "deployment"
+ romaji "depuroi" added from katakana converter
The expansion happens at tokenize time, both when indexing (documents get synonym tokens) and when searching (queries get synonym tokens). This creates a shared token space across all six supported languages.
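The expansion step described above can be sketched roughly as follows. This is a minimal illustration, not the actual cachly tokenizer: the names `SYNONYMS`, `isCjkChar`, `bigrams`, and `tokenize` are hypothetical, the map holds only the "deploy" group from the example, and the romaji conversion is left out.

```typescript
// Hypothetical synonym group built from the "deploy" example above.
// The real map is not shown in this post.
const SYNONYM_GROUPS: string[][] = [
  ["deploy", "deployment", "デプロイ", "部署", "배포", "نشر", "פריסה"],
];

// Index every term to its group so any language variant finds the others.
const SYNONYMS = new Map<string, string[]>();
SYNONYM_GROUPS.forEach((group) => {
  group.forEach((term) => SYNONYMS.set(term, group));
});

// True if the character is kana, a CJK ideograph, or a hangul syllable.
function isCjkChar(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return (
    (code >= 0x3040 && code <= 0x30ff) || // hiragana + katakana
    (code >= 0x4e00 && code <= 0x9fff) || // CJK unified ideographs
    (code >= 0xac00 && code <= 0xd7af) // hangul syllables
  );
}

// Overlapping two-character windows, e.g. "デプロイ" -> ["デプ", "プロ", "ロイ"].
function bigrams(term: string): string[] {
  const out: string[] = [];
  for (let i = 0; i + 1 < term.length; i++) {
    out.push(term.slice(i, i + 2));
  }
  return out;
}

// Split on whitespace, lowercase, then add synonyms and CJK bigrams.
// The same function is meant to run on documents and on queries.
function tokenize(text: string): string[] {
  const seen = new Set<string>();
  const tokens: string[] = [];
  const add = (t: string): void => {
    if (!seen.has(t)) {
      seen.add(t);
      tokens.push(t);
    }
  };
  text
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .forEach((word) => {
      add(word);
      (SYNONYMS.get(word) || []).forEach((syn) => {
        add(syn);
        if (isCjkChar(syn.charAt(0))) {
          bigrams(syn).forEach((b) => add(b));
        }
      });
    });
  return tokens;
}
```

Because documents and queries go through the same `tokenize` function, `tokenize("deploy failed")` and `tokenize("デプロイ")` end up sharing tokens. Note that a whitespace split does not segment running Japanese or Chinese text; a production tokenizer would need a segmentation step that this sketch omits.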
The Synonym Map — 130+ Terms
The map covers the technical vocabulary that actually appears in developer Brain entries:
| Concept | JA | ZH | KO | AR | HE |
|---|---|---|---|---|---|
| deploy | デプロイ | 部署 | 배포 | نشر | פריסה |
| container | コンテナ | 容器 | 컨테이너 | حاوية | מיכל |
| server | サーバー | 服务器 | 서버 | سيرفر | שרת |
| error | エラー | 错误 | 오류 | خطأ | שגיאה |
| auth | 認証 | 认证 | 인증 | مصادقة | אימות |
| monitor | モニター | 监控 | 모니터링 | مراقبة | ניטור |
Plus 100+ more: cache, database, build, test, install, log, port, cluster, debug, migration, and more.
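One way a table like this could be turned into a lookup structure is sketched below. The `ConceptRow` shape and the `buildLookup` and `equivalents` helpers are assumptions made for illustration, and only three of the rows above are included.

```typescript
// One row per concept, mirroring the table columns (EN, JA, ZH, KO, AR, HE).
interface ConceptRow {
  en: string;
  ja: string;
  zh: string;
  ko: string;
  ar: string;
  he: string;
}

// Three rows copied from the table above.
const ROWS: ConceptRow[] = [
  { en: "deploy", ja: "デプロイ", zh: "部署", ko: "배포", ar: "نشر", he: "פריסה" },
  { en: "server", ja: "サーバー", zh: "服务器", ko: "서버", ar: "سيرفر", he: "שרת" },
  { en: "error", ja: "エラー", zh: "错误", ko: "오류", ar: "خطأ", he: "שגיאה" },
];

// Map every term, in every language, to the other five terms in its row.
function buildLookup(rows: ConceptRow[]): Map<string, string[]> {
  const lookup = new Map<string, string[]>();
  rows.forEach((row) => {
    const terms = [row.en, row.ja, row.zh, row.ko, row.ar, row.he];
    terms.forEach((term) => {
      lookup.set(term, terms.filter((other) => other !== term));
    });
  });
  return lookup;
}

const LOOKUP = buildLookup(ROWS);

// A single Map.get() per token; unknown terms expand to nothing.
function equivalents(term: string): string[] {
  return LOOKUP.get(term.toLowerCase()) || [];
}
```

With this layout, `equivalents("server")` and `equivalents("서버")` each return the other five variants of the same concept, so the lookup is symmetric by construction.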
Zero-Embedding, Zero-Cost
The entire cross-language lookup is a Map.get() call — O(1), no API, no network, no cost.
Traditional semantic search: 200–400ms latency, $0.0004/1000 tokens
Cross-lingual synonym lookup: <0.01ms latency, $0
This runs on the free tier, on every smart_recall call, in every Brain session.
Real-World Example
Team setup: Korean backend team stores lessons in Korean. English-speaking DevOps engineers query the Brain in English.
# Korean engineer stores a lesson
learn_from_attempts:
  topic: "deploy:k8s"
  outcome: "success"
  whatWorked: "배포 실패 원인: 포트 3000이 방화벽에 의해 차단됨. 포트를 열어 해결"
  # Korean: "Cause of deployment failure: port 3000 was blocked by the firewall. Fixed by opening the port"
# English DevOps searches the next day
smart_recall("deployment failure port blocked")
→ ✅ Returns the Korean lesson, ranked first
# Symmetric: English lesson found by Korean query
learn_from_attempts:
  topic: "fix:redis"
  outcome: "success"
  whatWorked: "Redis connection timeout fixed: set timeout to 5000ms in config"
smart_recall("레디스 연결 오류") # Korean: "Redis connection error"
→ ✅ Returns the English lesson
Supported Languages
The synonym map now covers six languages: Japanese (hiragana + katakana), Chinese (simplified), Korean (hangul), Arabic (MSA technical vocabulary), Hebrew, and English. All language pairs are bidirectional.
Cross-language search activates automatically — no configuration, no flag, no API key. If you store in Japanese and recall in English, it just works.
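To make the store-in-Japanese, recall-in-English flow concrete, here is a small self-contained sketch. It is not how `smart_recall` is implemented: `MemoryIndex`, `store`, `recall`, and `expandedTokens` are hypothetical names, the Japanese lesson text is a made-up example, and terms are matched by a simple substring scan with ranking by shared-token count.

```typescript
// Hypothetical synonym groups, using terms from the table in this post.
const GROUPS: string[][] = [
  ["deploy", "デプロイ", "部署", "배포", "نشر", "פריסה"],
  ["error", "エラー", "错误", "오류", "خطأ", "שגיאה"],
  ["server", "サーバー", "服务器", "서버", "سيرفر", "שרת"],
];

// For every group with a member found in the text, emit the whole group.
// Substring matching is a simplification chosen for this sketch.
function expandedTokens(text: string): string[] {
  const lower = text.toLowerCase();
  const tokens: string[] = [];
  GROUPS.forEach((group) => {
    if (group.some((term) => lower.indexOf(term) !== -1)) {
      group.forEach((term) => tokens.push(term));
    }
  });
  return tokens;
}

interface Lesson {
  topic: string;
  whatWorked: string;
  tokens: string[];
}

class MemoryIndex {
  private lessons: Lesson[] = [];

  // Expand at index time so the stored lesson carries every language variant.
  store(topic: string, whatWorked: string): void {
    this.lessons.push({ topic, whatWorked, tokens: expandedTokens(whatWorked) });
  }

  // Expand the query the same way, then rank lessons by shared-token count.
  recall(query: string): Lesson[] {
    const queryTokens = expandedTokens(query);
    const scored = this.lessons
      .map((lesson) => ({
        lesson,
        score: lesson.tokens.filter((t) => queryTokens.indexOf(t) !== -1).length,
      }))
      .filter((entry) => entry.score > 0);
    scored.sort((a, b) => b.score - a.score);
    return scored.map((entry) => entry.lesson);
  }
}

const brain = new MemoryIndex();
// Made-up lesson: "An error occurred deploying to the server. Fixed the config."
brain.store("deploy:server", "サーバーへのデプロイでエラーが発生。設定を修正して解決");
brain.store("fix:cache", "Cleared the cache before the build");

// An English query finds the Japanese lesson through the shared tokens.
const hits = brain.recall("server deploy error");
```

Here `hits` contains only the Japanese lesson, because the English cache lesson shares no expanded tokens with the query. A real index would likely use an inverted index and a proper scoring function instead of the linear scan shown here.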
Upgrade
npx @cachly-dev/mcp-server@latest autopilot
cachly is a persistent AI Brain for developers: memory shared across Claude Code, Cursor, GitHub Copilot & Windsurf simultaneously. Auto-detects every editor. Bootstraps from your git history. 115 MCP tools. Free tier, EU servers, no credit card.
Your AI is forgetting everything right now.
Every session starts blank. Every bug re-discovered. Every deploy procedure re-explained. cachly fixes that in 30 seconds — your AI remembers every lesson, every fix, every teammate's hard-won knowledge. Forever.