Your AI Brain Now Speaks Arabic and Hebrew
The global developer community writes in more than the Latin script. With cachly v0.5.48, the Brain's search engine natively understands Arabic and Hebrew. Arabic is one of the world's most widely spoken languages, and both have historically been underserved in developer tooling.
Cross-Language Bridge
Arabic and Hebrew are now full members of the cross-language retrieval network — the same network that already connects English, German, French, Japanese, Chinese, and Korean. Store a lesson in any language, recall it in any other. No translation. No configuration. No separate index.
Example — a real-world scenario:
An Arabic-speaking engineer fixes a JWT authentication issue and documents it in Arabic:
cachly learn '{
"topic": "fix:auth",
"outcome": "success",
"whatWorked": "مصادقة JWT تعمل بعد إضافة المفتاح السري في متغيرات البيئة"
}'
Three days later, a teammate searches in English:
smart_recall("JWT authentication secret missing")
→ ✅ Returns the Arabic lesson, ranked correctly
The same works in reverse. An English query for "port conflict deployment" finds Hebrew lessons. A Japanese query finds Arabic lessons. The synonym graph is fully connected.
How It Works — The Synonym Graph
Every technical term is a node. Edges connect equivalents across languages:
"authentication"
↔ مصادقة (Arabic)
↔ אימות (Hebrew)
↔ 認証 (Japanese)
↔ 인증 (Korean)
↔ 认证 (Chinese)
↔ Authentifizierung (German)
"deploy"
↔ نشر (Arabic)
↔ פריסה (Hebrew)
↔ デプロイ (Japanese)
↔ 배포 (Korean)
↔ 部署 (Chinese)
↔ bereitstellen (German)
When you query smart_recall("مشكلة النشر") (deployment problem), the Brain tokenizes the query, removes Arabic stopwords, stems النشر to نشر, expands نشر to deploy, deployment, 배포, デプロイ, and bereitstellen, searches all stored lessons for any of those tokens, and returns ranked results regardless of language.
| Query | Finds lessons containing |
|---|---|
| smart_recall("authentication error") | مصادقة, אימות, 認証, autenticación, … |
| smart_recall("مشكلة النشر") | deploy, deployment, デプロイ, 배포, … |
| smart_recall("שגיאת אימות") | auth, authentication, مصادقة, 認証, … |
| smart_recall("تصحيح الأخطاء") | debug, debugging, איתור באגים, デバッグ, … |
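The expansion step can be sketched in a few lines of Python. This is a minimal illustration, not cachly's implementation: the names `SYNONYM_GROUPS`, `build_graph`, `expand`, and `recall` are hypothetical, the two synonym groups are copied from the examples above, and the scoring (count of matching tokens) is an assumed stand-in for the real ranking.

```python
from collections import defaultdict

# Hypothetical synonym groups, copied from the examples in this post.
SYNONYM_GROUPS = [
    ["authentication", "مصادقة", "אימות", "認証", "인증", "认证", "Authentifizierung"],
    ["deploy", "نشر", "פריסה", "デプロイ", "배포", "部署", "bereitstellen"],
]


def build_graph(groups):
    """Map every term to the full set of its cross-language equivalents."""
    graph = defaultdict(set)
    for group in groups:
        normalized = {term.casefold() for term in group}
        for term in normalized:
            graph[term] |= normalized
    return graph


def expand(tokens, graph):
    """Return the query tokens plus every equivalent found in the graph."""
    expanded = set()
    for token in tokens:
        key = token.casefold()
        # Unknown tokens expand only to themselves.
        expanded |= graph.get(key, {key})
    return expanded


def recall(query_tokens, lessons, graph):
    """Rank lessons by how many expanded query tokens they contain (assumed scoring)."""
    wanted = expand(query_tokens, graph)
    scored = []
    for lesson in lessons:
        lesson_tokens = {t.casefold() for t in lesson.split()}
        score = len(wanted & lesson_tokens)
        if score:
            scored.append((score, lesson))
    return [lesson for score, lesson in sorted(scored, key=lambda pair: -pair[0])]


GRAPH = build_graph(SYNONYM_GROUPS)
LESSONS = [
    "مصادقة JWT تعمل بعد إضافة المفتاح السري",
    "deploy failed because port 8080 was taken",
]

# An English query token reaches the Arabic lesson through the graph.
print(recall(["authentication"], LESSONS, GRAPH))
```

Because every term in a group points to the whole group, a lookup is a single dictionary access no matter which language the query starts from.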
The RTL Challenge
Most search engines are built for left-to-right text. Arabic and Hebrew run right-to-left — a surface-level difference that hides a deeper challenge: both languages attach grammatical particles directly to words as prefixes, making naive tokenization nearly useless.
Consider Arabic: the word الخطأ (al-khaṭaʾ, "the error") fuses the definite article ال (al-) with the root خطأ (error). A naive tokenizer treats the whole thing as one opaque token. Searching for خطأ would miss الخطأ, as well as وخطأ (and-error), فالخطأ (so-the-error), and every other prefixed form. Hebrew has the same pattern.
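The failure is easy to reproduce. The snippet below is an illustrative sketch only: `naive_match` is a hypothetical helper that does exact-token lookup after whitespace splitting, and the lesson sentence is made up for the example.

```python
def naive_match(query, text):
    """Exact-token lookup after whitespace splitting, with no stemming."""
    return query in text.split()


# Hypothetical lesson text: "The error in the server was fixed."
lesson = "تم إصلاح الخطأ في الخادم"

print(naive_match("الخطأ", lesson))  # the prefixed form as written in the lesson
print(naive_match("خطأ", lesson))    # the bare word a user is likely to type
```

The first lookup succeeds and the second fails, although both refer to the same word.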
What We Built
Unicode-aware RTL tokenization — when the Brain detects Arabic (U+0600–U+06FF) or Hebrew (U+0590–U+05FF) characters, it switches to word-level tokenization with language-specific enhancements.
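Script detection by code-point range might look like the following sketch. The two ranges are the ones stated above; the function names `detect_script` and `tokenize`, the check order for mixed text, and the use of a `\w+` regex for word-level splitting are assumptions made for illustration.

```python
import re

# Unicode blocks named in this post.
ARABIC = re.compile(r"[\u0600-\u06FF]")
HEBREW = re.compile(r"[\u0590-\u05FF]")

# In Python 3, \w is Unicode-aware for str patterns, so it matches
# Arabic and Hebrew letters as well as Latin ones.
WORD = re.compile(r"\w+")


def detect_script(text):
    """Return 'arabic', 'hebrew', or 'other' based on the characters present."""
    if ARABIC.search(text):
        return "arabic"
    if HEBREW.search(text):
        return "hebrew"
    return "other"


def tokenize(text):
    """Split text into word tokens and report which script was detected."""
    return detect_script(text), WORD.findall(text)


print(tokenize("مشكلة النشر"))
print(tokenize("שגיאת אימות"))
print(tokenize("JWT authentication secret missing"))
```

The detected script is what would select the language-specific stopword list and stemmer in later stages.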
Arabic light stemming — iterative, up to 3 passes, resolves stacked prefixes:
الخطأ → خطأ (ال = definite article stripped)
وخطأ → خطأ (و = conjunction stripped)
فالخطأ → خطأ (ف + ال, two passes needed)
للنشر → نشر (ل + ال, two passes: للنشر → النشر → نشر)
On top of that, 60 Arabic and 40 Hebrew stopwords (particles, pronouns, auxiliary verbs, and prepositions that carry no semantic weight) are filtered before indexing and at query time.
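The iterative stripping can be sketched as below. This is a rough approximation under stated assumptions, not the Brain's actual stemmer: the rule table, the minimum stem length of 3, and the five-word stopword sample are all hypothetical, chosen so that the four examples above come out as listed.

```python
# Prefix rules tried in order on each pass: (prefix, replacement).
# Assumption: the "لل" rule rewrites the prefix to "ال" so that the next
# pass can strip the article, matching the للنشر → النشر → نشر example.
PREFIX_RULES = [("لل", "ال"), ("ال", ""), ("و", ""), ("ف", "")]
MIN_STEM_LENGTH = 3  # assumed guard so short roots are not stripped away
MAX_PASSES = 3       # "up to 3 passes", as described above

# Tiny illustrative sample, not the actual 60-word list.
ARABIC_STOPWORDS = {"في", "من", "على", "بعد", "إلى"}


def strip_one_prefix(word):
    """Apply the first matching rule that leaves a long enough stem."""
    for prefix, replacement in PREFIX_RULES:
        if word.startswith(prefix):
            candidate = replacement + word[len(prefix):]
            if len(candidate) >= MIN_STEM_LENGTH:
                return candidate
    return word


def light_stem(word):
    """Strip stacked prefixes, one per pass, until nothing changes."""
    for _ in range(MAX_PASSES):
        stripped = strip_one_prefix(word)
        if stripped == word:
            break
        word = stripped
    return word


def normalize(tokens):
    """Drop stopwords first, then stem what remains."""
    return [light_stem(t) for t in tokens if t not in ARABIC_STOPWORDS]


for word in ["الخطأ", "وخطأ", "فالخطأ", "للنشر"]:
    print(word, "->", light_stem(word))
```

The length guard is the piece that keeps a word such as وقت (time) intact, since its leading و belongs to the root rather than being a conjunction.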
How to Use It
No setup required. Just write lessons the way you think:
# Arabic
cachly learn '{
"topic": "deploy:api",
"outcome": "success",
"whatWorked": "نشر التطبيق نجح بعد تغيير منفذ الخدمة من 8080 إلى 3000"
}'
# Hebrew
cachly learn '{
"topic": "fix:auth",
"outcome": "success",
"whatWorked": "תיקון בעיית האימות על ידי הוספת המפתח הסודי לסביבת הייצור"
}'
# Search in any language
smart_recall("مشكلة النشر") → deployment lessons in any language
smart_recall("שגיאת אימות") → auth error lessons in any language
smart_recall("authentication") → also finds مصادقة and אימות lessons
Upgrade
npx @cachly-dev/mcp-server@latest setup
cachly is a managed AI Brain for developers: persistent memory, team knowledge sharing, and semantic cache for Claude Code, Cursor, GitHub Copilot & Windsurf. One MCP server. 51 tools. Free tier, EU servers, no credit card.