Your AI Brain Now Speaks Arabic and Hebrew
The global developer community writes in more than the Latin alphabet. With cachly v0.5.48, the Brain's search engine natively understands Arabic, one of the world's most widely spoken languages, and Hebrew. Both have historically been underserved in developer tooling.
Cross-Language Bridge
Arabic and Hebrew are now full members of the cross-language retrieval network — the same network that already connects English, German, French, Japanese, Chinese, and Korean. Store a lesson in any language, recall it in any other. No translation. No configuration. No separate index.
Example — a real-world scenario:
An Arabic-speaking engineer fixes a JWT authentication issue and documents it in Arabic:
cachly learn '{
"topic": "fix:auth",
"outcome": "success",
"whatWorked": "مصادقة JWT تعمل بعد إضافة المفتاح السري في متغيرات البيئة"
}'
Three days later, a teammate searches in English:
smart_recall("JWT authentication secret missing")
→ ✅ Returns the Arabic lesson, ranked correctly
The same works in reverse. An English query for "port conflict deployment" finds Hebrew lessons. A Japanese query finds Arabic lessons. The synonym graph is fully connected.
How It Works — The Synonym Graph
Every technical term is a node. Edges connect equivalents across languages:
"authentication"
↔ مصادقة (Arabic)
↔ אימות (Hebrew)
↔ 認証 (Japanese)
↔ 인증 (Korean)
↔ 认证 (Chinese)
↔ Authentifizierung (German)
"deploy"
↔ نشر (Arabic)
↔ פריסה (Hebrew)
↔ デプロイ (Japanese)
↔ 배포 (Korean)
↔ 部署 (Chinese)
↔ bereitstellen (German)
When you query smart_recall("مشكلة النشر") (deployment problem), the Brain tokenizes the query, removes Arabic stopwords, and stems النشر to نشر. It then expands نشر to deploy, deployment, 배포, デプロイ, and bereitstellen, searches all stored lessons for any of those tokens, and returns ranked results regardless of language.
| Query | Finds lessons containing |
|---|---|
| smart_recall("authentication error") | مصادقة, אימות, 認証, autenticación, … |
| smart_recall("مشكلة النشر") | deploy, deployment, デプロイ, 배포, … |
| smart_recall("שגיאת אימות") | auth, authentication, مصادقة, 認証, … |
| smart_recall("تصحيح الأخطاء") | debug, debugging, איתור באגים, デバッグ, … |
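The expansion step described above can be sketched in a few lines of Python. This is a minimal illustration, not cachly's implementation: the group contents, the function names (build_graph, expand, recall), and the substring-count ranking are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical synonym groups; the real graph is far larger.
SYNONYM_GROUPS = [
    {"authentication", "مصادقة", "אימות", "認証", "인증", "认证", "authentifizierung"},
    {"deploy", "deployment", "نشر", "פריסה", "デプロイ", "배포", "部署", "bereitstellen"},
]

def build_graph(groups):
    """Map every term to the full set of terms in its group."""
    graph = defaultdict(set)
    for group in groups:
        for term in group:
            graph[term] |= group
    return graph

GRAPH = build_graph(SYNONYM_GROUPS)

def expand(tokens):
    """Return the query tokens plus every known cross-language equivalent."""
    expanded = set()
    for token in tokens:
        key = token.lower()
        expanded |= GRAPH.get(key, {key})
    return expanded

def recall(query_tokens, lessons):
    """Rank lessons by how many expanded terms they contain (assumed scoring)."""
    terms = expand(query_tokens)
    scored = []
    for lesson in lessons:
        text = lesson.lower()
        score = sum(1 for term in terms if term in text)
        if score:
            scored.append((score, lesson))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [lesson for _, lesson in scored]
```

With this sketch, recall(["authentication"], lessons) returns a lesson written entirely in Arabic as long as it contains مصادقة, because the query term and the Arabic term sit in the same group.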
The RTL Challenge
Most search engines are built for left-to-right text. Arabic and Hebrew run right-to-left — a surface-level difference that hides a deeper challenge: both languages attach grammatical particles directly to words as prefixes, making naive tokenization nearly useless.
Consider Arabic: the word الخطأ (al-khaṭaʾ, "the error") fuses the definite article ال (al-) with the root خطأ (error). A naive tokenizer treats the whole thing as one opaque token. Searching for خطأ would miss الخطأ, and it would also miss وخطأ (and-error), فالخطأ (so-the-error), and every other prefixed form. Hebrew has the same pattern: האימות ("the authentication") fuses the definite article ה with אימות.
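To make the failure concrete, here is a small sketch of whitespace tokenization with exact token matching. The function name naive_contains and the sample sentence are made up for the illustration.

```python
def naive_contains(query, text):
    """Exact match against whitespace-separated tokens, with no prefix handling."""
    return query in text.split()

# "The error appeared after the deployment"
lesson = "ظهر الخطأ بعد النشر"

found_bare = naive_contains("خطأ", lesson)        # bare root: not found
found_prefixed = naive_contains("الخطأ", lesson)  # article attached: found
```

The bare root is never a token of its own in the sentence, so the first lookup fails even though the lesson is clearly about an error.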
What We Built
Unicode-aware RTL tokenization — when the Brain detects Arabic (U+0600–U+06FF) or Hebrew (U+0590–U+05FF) characters, it switches to word-level tokenization with language-specific enhancements.
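Script detection over those two Unicode blocks might look like the following. This is a sketch under the assumption that the first RTL character decides the script; the function name detect_rtl_script is hypothetical, and how the Brain handles mixed-script text is not specified here.

```python
ARABIC_BLOCK = (0x0600, 0x06FF)  # Unicode "Arabic" block
HEBREW_BLOCK = (0x0590, 0x05FF)  # Unicode "Hebrew" block

def detect_rtl_script(text):
    """Return "arabic", "hebrew", or None, based on the first RTL character."""
    for ch in text:
        code_point = ord(ch)
        if ARABIC_BLOCK[0] <= code_point <= ARABIC_BLOCK[1]:
            return "arabic"
        if HEBREW_BLOCK[0] <= code_point <= HEBREW_BLOCK[1]:
            return "hebrew"
    return None
```

A caller would use the result to pick the matching stopword list and stemmer, and fall back to the default tokenizer when the result is None.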
Arabic light stemming — iterative, up to 3 passes, resolves stacked prefixes:
الخطأ → خطأ (ال = definite article stripped)
وخطأ → خطأ (و = conjunction stripped)
فالخطأ → خطأ (ف + ال, two passes needed)
للنشر → نشر (ل + ال, two passes: للنشر → النشر → نشر)
Plus 60 Arabic + 40 Hebrew stopwords: particles, pronouns, auxiliary verbs, and prepositions that carry no semantic weight, filtered before indexing and at query time.
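A minimal sketch of iterative prefix stripping that reproduces the four examples above. The rule table, the minimum stem length of three letters, and the five-word stopword set are assumptions for illustration; the shipped rules and the full stopword lists are not reproduced here.

```python
# Hypothetical rules as (prefix, replacement); the first matching rule wins.
PREFIX_RULES = [
    ("لل", "ال"),  # ل + ال is written لل, so restore the article first
    ("ال", ""),    # definite article
    ("و", ""),     # conjunction "and"
    ("ف", ""),     # conjunction "so"
]
MAX_PASSES = 3       # from the text: up to 3 passes
MIN_STEM_LENGTH = 3  # assumed guard so short words such as وقت stay intact

# Tiny illustrative subset, not the real 60-word list.
ARABIC_STOPWORDS = {"في", "من", "على", "بعد", "إلى"}

def light_stem(word):
    """Strip at most one prefix per pass and stop when no rule applies."""
    for _ in range(MAX_PASSES):
        for prefix, replacement in PREFIX_RULES:
            remainder = word[len(prefix):]
            if word.startswith(prefix) and len(remainder) >= MIN_STEM_LENGTH:
                word = replacement + remainder
                break
        else:
            return word  # no rule matched in this pass
    return word

def normalize(text):
    """Tokenize on whitespace, drop stopwords, then stem what remains."""
    tokens = [token for token in text.split() if token not in ARABIC_STOPWORDS]
    return [light_stem(token) for token in tokens]
```

Light stemming of this kind is deliberately shallow: it never looks at suffixes or roots, and it will occasionally strip a letter that belongs to the word itself, which is the usual trade-off for speed and predictability.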
How to Use It
No setup required. Just write lessons the way you think:
# Arabic
cachly learn '{
"topic": "deploy:api",
"outcome": "success",
"whatWorked": "نشر التطبيق نجح بعد تغيير منفذ الخدمة من 8080 إلى 3000"
}'
# Hebrew
cachly learn '{
"topic": "fix:auth",
"outcome": "success",
"whatWorked": "תיקון בעיית האימות על ידי הוספת המפתח הסודי לסביבת הייצור"
}'
# Search in any language
smart_recall("مشكلة النشر") → deployment lessons in any language
smart_recall("שגיאת אימות") → auth error lessons in any language
smart_recall("authentication") → also finds مصادقة and אימות lessons
Upgrade
npx @cachly-dev/mcp-server@latest autopilot
cachly is a persistent AI Brain for developers: memory shared across Claude Code, Cursor, GitHub Copilot & Windsurf simultaneously. Auto-detects every editor. Bootstraps from your git history. 115 MCP tools. Free tier, EU servers, no credit card.
Your AI is forgetting everything right now.
Every session starts blank. Every bug re-discovered. Every deploy procedure re-explained. cachly fixes that in 30 seconds — your AI remembers every lesson, every fix, every teammate's hard-won knowledge. Forever.