cachly is a persistent AI memory platform for developers. It gives AI coding assistants like Claude Code, Cursor, GitHub Copilot and Windsurf a brain that remembers every lesson, fix, and architecture decision — forever. It connects via the MCP (Model Context Protocol) standard and includes 126 MCP tools. Free tier available. Runs on German (EU) servers.

How does cachly work?

Run 'npx @cachly-dev/mcp-server@latest autopilot' once. The wizard auto-detects every AI editor you have installed (Claude Code, Cursor, GitHub Copilot, Windsurf, Cline, Zed, Continue.dev) and writes the correct config for each. It then reads your entire git history with brain_from_git and loads years of team knowledge into your Brain before your first session. From that point, sessions start automatically, memory is shared across all your editors simultaneously, and a git post-commit hook teaches cachly from every commit.

Does cachly auto-detect my editors?

Yes. The cachly setup wizard automatically detects Claude Code, Cursor, GitHub Copilot, Windsurf, Cline, Zed, and Continue.dev — any editor that supports MCP. It writes the correct config file for each editor in one pass. You never manually edit JSON config files.

Is memory shared across all my AI editors?

Yes. cachly uses a single Brain that all your AI editors connect to simultaneously. A lesson remembered in Claude Code is instantly available in Cursor and GitHub Copilot. If your team uses different editors, all of you share the same persistent memory pool.

What is brain_from_git?

brain_from_git is a cachly tool that reads your entire git history before your first session and extracts lessons from every commit, PR, and revert. Your AI arrives knowing years of architectural decisions, bug fixes, and team conventions — without you writing a single line of documentation. Zero onboarding.

What is causal_trace?

causal_trace is a cachly tool that traces the history of any file or bug across your entire git history in seconds — replacing 30+ minutes of manual git blame. Describe a problem in plain English. It returns the root cause, the failure chain, and the exact fix that worked — with date, command, and file path.

What is brain_predict?

brain_predict is a cachly tool that scans your Brain for failure patterns before every deploy, migration, or dependency upgrade. It returns probability-weighted warnings based on your team's actual incident history — so you catch the next incident before it happens.

Does cachly work with Claude Code, Cursor, and GitHub Copilot?

Yes. cachly works with Claude Code, Cursor, GitHub Copilot, Windsurf, Cline, Zed, and Continue.dev — anywhere that supports MCP. Run 'npx @cachly-dev/mcp-server@latest autopilot' to configure all editors in one step. Memory is shared across all editors simultaneously.

Can cachly search memory across languages?

Yes. cachly uses semantic vector embeddings, not keyword search. A lesson stored in German appears when you search in English. A fix documented in Arabic matches a Japanese query about the same bug pattern. Supported languages include English, German, French, Spanish, Italian, Portuguese, Japanese, Chinese (Simplified and Traditional), Korean, Arabic, Hebrew, and more.

How is cachly different from mem0?

mem0 is a memory layer for Python LLM apps and chatbots — great for building AI products. cachly is built specifically for developer tooling: it connects to your AI editor via MCP, learns from your git history automatically, predicts failures before deploy, and gives your whole team shared memory. cachly runs on EU servers and is GDPR-native. For developers using Claude Code, Cursor, or Copilot, cachly is the right choice.

Is cachly GDPR compliant?

Yes. cachly runs exclusively on German servers (Hetzner). All data stays in the EU. No data is shared with third parties. cachly is fully GDPR compliant. An AVV (Auftragsverarbeitungsvertrag / Data Processing Agreement) is available for Business and Enterprise customers.

AI Memory for Chinese, Japanese & Korean — How Cachly Handles CJK Without Extra Models

Why CJK breaks most AI memory systems

Most AI memory systems are built around two assumptions that fail completely for Chinese, Japanese, and Korean:

Whitespace tokenization works. CJK languages don't use spaces between words. 我喜欢机器学习 is 7 characters with no spaces, so a BPE tokenizer trained on English will split it into suboptimal pieces and produce poor embeddings.
English embedding models generalize. They don't. OpenAI's text-embedding-3-small handles CJK passably, but cosine similarity scores for CJK semantic search are measurably lower than for English. You need a multilingual model like multilingual-e5-large or Alibaba's text-embedding-v3 for production-quality CJK semantic search.
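
The first assumption is easy to check directly. The snippet below is only an illustration of whitespace splitting in plain Python (it is not Cachly code, and the English sentence is an added sample for comparison):

```python
# Whitespace splitting finds word boundaries in English but not in Chinese.
english = "I like machine learning"
chinese = "我喜欢机器学习"  # the example sentence from the text above

english_tokens = english.split()
chinese_tokens = chinese.split()

print(english_tokens)  # ['I', 'like', 'machine', 'learning']
print(chinese_tokens)  # ['我喜欢机器学习']
print(len(chinese))    # 7
```

The Chinese sentence comes back as a single "word", so any pipeline that relies on spaces to find terms has nothing to work with.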

The result: AI assistants used by teams in China, Japan, and Korea have worse memory quality than their English-speaking counterparts. They forget more, recall less accurately, and require manual workarounds like writing lessons in English even for Chinese codebases.

How Cachly's key-based memory avoids the problem

The core of Cachly's AI brain — the learn_from_attempts / recall_best_solution system — doesn't use embeddings at all.

Lessons are stored by a topic key: deploy:kubernetes, fix:chinese-encoding, debug:日本語テスト. The content can be in any language. The key is ASCII by convention but supports full Unicode — topic keys in Chinese or Korean work perfectly.

// Chinese developer workflow (Chinese is fully supported)
learn_from_attempts(
  instance_id = "my-brain-id",
  topic       = "fix:数据库连接",          // Chinese topic key
  outcome     = "success",
  what_worked = "连接池配置: max_idle=10, max_open=50",
  what_failed = "默认配置导致连接泄漏",
  tags        = ["数据库", "连接池", "性能"],
)

// Recalled automatically in the next session
recall_best_solution("数据库连接池配置")
// → Returns the lesson above, no embedding needed

This works because recall_best_solution runs a BM25+ full-text search over topic keys and metadata instead of comparing embedding vectors. CJK text is indexed as character n-grams rather than whitespace-separated words (PostgreSQL's pg_trgm extension covers CJK trigrams natively), so no word segmentation step is needed.
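
To make the idea concrete, here is a minimal sketch of BM25 ranking over character bigrams. It is not Cachly's implementation: it uses plain Okapi BM25 rather than the BM25+ variant, and the function names (char_ngrams, bm25_scores), the parameters k1 and b, and the sample lessons are all assumptions made for illustration.

```python
import math
from collections import Counter

def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Overlapping character n-grams; no word boundaries required."""
    text = "".join(text.split())  # drop whitespace
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25 (hypothetical helper)."""
    tokenized = [char_ngrams(d) for d in docs]
    avg_len = (sum(len(t) for t in tokenized) / len(tokenized)) or 1.0
    n_docs = len(docs)
    df = Counter()  # number of documents containing each n-gram
    for toks in tokenized:
        df.update(set(toks))
    query_terms = set(char_ngrams(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Sample lessons (illustrative): topic key plus a little metadata.
lessons = [
    "fix:数据库连接 连接池配置 max_idle=10 max_open=50",
    "deploy:kubernetes rolling update strategy",
    "fix:认证错误 JWT 验证",
]
scores = bm25_scores("数据库连接池配置", lessons)
best = max(range(len(lessons)), key=scores.__getitem__)
print(lessons[best])  # fix:数据库连接 连接池配置 max_idle=10 max_open=50
```

Only the first lesson shares any bigrams with the query, so it is the only one with a non-zero score. No embedding model and no Chinese word segmenter is involved.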

Semantic search for CJK: optional, provider-selectable

For smart_recall — which does use embeddings — Cachly lets you choose the embedding provider per instance:

Provider	Model	CJK Quality	Cost
OpenAI	text-embedding-3-small	Good	~$0.02/1M tokens
Mistral	mistral-embed	Good	~$0.10/1M tokens
Alibaba (Dashscope)	text-embedding-v3	Excellent	~$0.005/1M tokens
Ollama (local)	qwen2-7b-embed	Excellent	Free (local)
Cohere	embed-multilingual-v3.0	Excellent	~$0.10/1M tokens
Gemini	text-embedding-004	Very good	Free tier available

For Chinese developers specifically, the Alibaba Dashscope provider offers the best CJK embedding quality at the lowest cost, and runs entirely on infrastructure inside China (no cross-border data transfer for Chinese compliance requirements).

For Japanese developers, Ollama with qwen2-7b-embed or multilingual-e5-large runs entirely on-premises — useful for organizations with data residency requirements.

Japanese: mixed-script handling

Japanese is uniquely complex — a single sentence may contain Hiragana (ひらがな), Katakana (カタカナ), Kanji (漢字), and Latin (ABC). Standard tokenizers either over-split or under-split.

Cachly stores the content value verbatim as a UTF-8 string. No tokenization happens at storage time. The search layer uses PostgreSQL trigrams (pg_trgm), which operate at the Unicode code point level, so a Japanese query like "認証の問題" shares trigrams with stored content containing "認証エラー" (both begin with 認証) and can match it with no language-specific configuration.
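
The overlap can be inspected with a few lines of Python. This is a simplified sketch of pg_trgm-style trigram extraction, assuming each string is treated as a single word and padded with two leading spaces and one trailing space; the helper names trigrams and similarity are illustrative, not Cachly functions.

```python
def trigrams(text: str) -> set[str]:
    """Character trigrams with pg_trgm-style padding (simplified: one word, no lowercasing)."""
    padded = "  " + text + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

query, stored = "認証の問題", "認証エラー"
shared = trigrams(query) & trigrams(stored)

print(sorted(shared))             # ['  認', ' 認証']
print(similarity(query, stored))  # 0.2
```

Both shared trigrams come from the common prefix 認証. Whether an overlap of this size counts as a hit depends on the similarity threshold the search layer is configured with.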

# Japanese developer example
learn_from_attempts(
    instance_id = brain_id,
    topic       = "fix:認証エラー",
    outcome     = "success",
    what_worked = "JWT検証でHS256からRS256に変更した",
    what_failed = "HS256は環境変数の秘密鍵が共有できない",
    tags        = ["認証", "JWT", "セキュリティ"],
)

# Works without any embedding model:
recall_best_solution("認証")
# → Returns the lesson above via BM25 trigram matching

Korean: compound morphology

Korean uses agglutinative morphology — a single word can be a combination of a root, subject/object markers, tense, and honorifics. The word "개발했습니다" (developed, formal-polite) won't match "개발" (develop) without morphological analysis.
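
A short illustration of the mismatch, using plain Python string operations and a sample sentence chosen for this example (it is not taken from Cachly):

```python
# "I developed the login feature" (formal-polite); illustrative sample text.
stored = "로그인 기능을 개발했습니다"
query = "개발"

tokens = stored.split()

# Exact token matching: the inflected form is a different token than the root.
exact_hit = query in tokens

# Substring matching: the root is contained in the inflected form.
substring_hit = any(query in token for token in tokens)

print(exact_hit)      # False
print(substring_hit)  # True
```

Exact word matching misses the root, while character-level matching finds it, which is why word-based indexes need a morphological analyzer for Korean.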

For key-value brain storage, this doesn't matter — you choose the topic key, and you can normalize it to a root form: fix:데이터베이스 not fix:데이터베이스를연결하는방법.
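
A related detail for Korean keys is Unicode normalization, which is separate from the morphological normalization described above: the same Hangul text can be encoded as precomposed syllables (NFC) or as decomposed jamo (NFD), and the two forms do not compare equal. The sketch below uses Python's standard unicodedata module; normalize_topic is a hypothetical helper, and whether Cachly normalizes keys itself is not stated here, so treat this as a client-side precaution.

```python
import unicodedata

def normalize_topic(key: str) -> str:
    """Hypothetical helper: trim whitespace and force NFC so equal-looking keys compare equal."""
    return unicodedata.normalize("NFC", key.strip())

composed = unicodedata.normalize("NFC", "fix:데이터베이스")
decomposed = unicodedata.normalize("NFD", composed)  # same text, jamo-by-jamo encoding

print(composed == decomposed)                                    # False
print(normalize_topic(decomposed) == normalize_topic(composed))  # True
```

Normalizing before calling learn_from_attempts or recall_best_solution avoids storing two keys that look identical on screen.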

For semantic search, use Cohere's embed-multilingual-v3.0 — it's trained on Korean morphological patterns and handles compound forms correctly.

Practical setup for CJK teams

.mcp.json for Chinese/Japanese/Korean teams (JSON does not allow comments, so the alternative providers are listed below the file):

{
  "mcpServers": {
    "cachly": {
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server@latest"],
      "env": {
        "CACHLY_JWT": "your-token",
        "CACHLY_EMBED_PROVIDER": "dashscope",
        "DASHSCOPE_API_KEY": "sk-..."
      }
    }
  }
}

The file above selects Alibaba Dashscope for semantic search, the recommended choice for Chinese (best quality, China-hosted). To use a different provider, replace the last two env entries:

Japanese/Korean, Cohere multilingual: "CACHLY_EMBED_PROVIDER": "cohere", "COHERE_API_KEY": "..."
Fully local with Ollama (any language): "CACHLY_EMBED_PROVIDER": "ollama", "OLLAMA_BASE_URL": "http://localhost:11434", "OLLAMA_MODEL": "qwen2-7b-embed"

No embedding provider is needed for learn/recall; those work with any language out of the box.

For teams with strict data residency requirements (no data may leave Japan, China, or Korea), the Ollama option runs all embedding generation locally. Note that with the hosted service only the embedding computation happens on your machine; the stored lessons still live on Cachly's German servers (EU). To keep everything on-premises, use a fully local Cachly deployment (self-hosted tier).
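
Because a .mcp.json file has to be strict JSON, one way to switch providers without hand-editing is to generate the file. The sketch below is a convenience script, not part of Cachly: build_mcp_config and PROVIDER_ENV are hypothetical names, and the environment variables and placeholder values are simply the ones shown in the setup section.

```python
import json
from typing import Optional

# Env entries per provider, copied from the setup section (placeholder values kept as-is).
PROVIDER_ENV = {
    "dashscope": {"CACHLY_EMBED_PROVIDER": "dashscope", "DASHSCOPE_API_KEY": "sk-..."},
    "cohere": {"CACHLY_EMBED_PROVIDER": "cohere", "COHERE_API_KEY": "..."},
    "ollama": {
        "CACHLY_EMBED_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "qwen2-7b-embed",
    },
}

def build_mcp_config(provider: Optional[str], token: str = "your-token") -> dict:
    """Build the config dict; provider=None omits the embedding settings entirely."""
    env = {"CACHLY_JWT": token}
    if provider is not None:
        env.update(PROVIDER_ENV[provider])
    return {
        "mcpServers": {
            "cachly": {
                "command": "npx",
                "args": ["-y", "@cachly-dev/mcp-server@latest"],
                "env": env,
            }
        }
    }

# Local embeddings via Ollama, as discussed for data residency.
config = build_mcp_config("ollama")
text = json.dumps(config, indent=2, ensure_ascii=False)
print(text)
```

Writing text to .mcp.json gives a file that any JSON parser accepts. Passing None as the provider produces the minimal config, which is enough for learn/recall.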
