AI Memory · 8 min read

We built persistent memory for Claude Code

Every session, Claude Code starts with zero context. It doesn't know what you worked on yesterday, which bugs you already fixed, or that the deploy command is nohup docker compose up -d --build. We fixed that.

The problem

AI coding assistants are getting remarkably good at writing code. But they have amnesia. Every session you re-explain your architecture, re-describe your deployment process, and watch your AI re-discover the same bugs you fixed two weeks ago.

We measured this across our own development: roughly 40% of our prompts were re-establishing context that already existed in a previous session. That's nearly half your token budget spent on things your AI already knew.

The obvious answer is memory. But memory for AI assistants is surprisingly hard to get right. You need it to be fast (no extra round-trips before coding starts), relevant (not just a dump of everything), and structured (lessons, not logs).

What we built

Cachly AI Dev Brain is an MCP server that gives Claude Code (and Cursor, Copilot, Windsurf) persistent memory backed by Valkey and pgvector. The core pattern is two calls:

// At session start — one call replaces 3+ context-setting prompts
session_start(instance_id="...", focus="deploy api")

// → Returns in a single response:
// 📅 Last session (2h ago): Fixed Valkey connection pool, deployed API v0.5
//    Duration: 47 min | Files: api/internal/cache.go, docker-compose.yml
// 🎯 Relevant for "deploy api":
//    ✅🔴 deploy:api — nohup docker compose up -d --build
//    ✅🟡 deploy:rsync-env — rsync overwrites .env, make changes locally first
// 🕐 Recent lessons: bash:macos-lowercase, infra:k3s-tls-san
// 📊 Brain: 23 lessons · 12 context entries · 0 open failures

At session end, Claude saves what it learned:

session_end(
  instance_id    = "...",
  summary        = "Deployed API v0.6, fixed connection pool timeout",
  files_changed  = ["api/internal/cache.go", "docker-compose.yml"],
  lessons_learned = 2,
)

That's the whole loop. Session start briefs Claude. Session end saves what happened. Over time the brain accumulates real, structured knowledge about your codebase.
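As a rough illustration, the session_end write path could look like the sketch below against a Valkey-compatible store. The cachly:session:last key comes from the key schema described later in the post; the KV interface and payload shape are assumptions for illustration, not Cachly's actual code.

```typescript
// Hypothetical sketch of a session_end write path. The key name follows the
// post's schema; the KV interface and record layout are illustrative.
interface KV {
  set(key: string, value: string): Promise<unknown>;
}

interface SessionRecord {
  summary: string;
  files_changed: string[];
  lessons_learned: number;
  ended_at: string;
}

async function sessionEnd(
  kv: KV,
  s: Omit<SessionRecord, "ended_at">
): Promise<SessionRecord> {
  const record: SessionRecord = { ...s, ended_at: new Date().toISOString() };
  // A single O(1) write: the next session_start reads this back
  // as the "Last session" header of the briefing.
  await kv.set("cachly:session:last", JSON.stringify(record));
  return record;
}
```

Because the write is one serialized key, any Redis-compatible client (ioredis, node-redis) can stand in for the KV interface.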

The hard part: knowledge quality

The naive approach — store everything, retrieve everything — produces noise. After a few weeks you have hundreds of lessons and Claude drowns in irrelevant context.

We solved this with a quality system built into every lesson:

  • Severity: critical / major / minor — Claude surfaces critical lessons first, always
  • recall_count: every time a lesson helps, its count increments. High-recall lessons rise to the top
  • Deduplication: new lessons that conflict with existing ones are compared; the better one wins
  • Semantic search: smart_recall() finds relevant lessons by meaning, not keyword matching

The result: after a few weeks of use, session_start() returns a tightly curated briefing. The most critical, most-recalled lessons surface automatically. Noise stays buried.
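To make the ranking concrete, here is a minimal sketch of how severity and recall_count might combine into a briefing order. The field names come from the post; the specific severity weights and tie-breaking are assumptions, not Cachly's actual scoring code.

```typescript
// Illustrative ranking: critical lessons always outrank major and minor ones,
// and within a severity band, lessons that have helped more often rise first.
type Severity = "critical" | "major" | "minor";

interface Lesson {
  topic: string;
  severity: Severity;
  recall_count: number;
}

const severityRank: Record<Severity, number> = { critical: 2, major: 1, minor: 0 };

function rankLessons(lessons: Lesson[], limit = 5): Lesson[] {
  return [...lessons]
    .sort(
      (a, b) =>
        severityRank[b.severity] - severityRank[a.severity] ||
        b.recall_count - a.recall_count
    )
    .slice(0, limit);
}
```

Under this scheme a critical lesson recalled 4 times outranks a minor lesson recalled 9 times, which is exactly the "critical first, always" behavior described above.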

Real impact: what the numbers look like

Based on our own development on the Cachly codebase, here's what we measured after 30 days of use:

  • ~40% fewer context-setting prompts — no more re-explaining architecture, stack, and patterns
  • ~30% token reduction per session — session_start replaces 3–5 manual context prompts
  • No more fixing the same bug twice — critical lessons surface in every relevant session

These numbers will vary by team and codebase size. The more you use it, the better the brain gets. Day 1 is useful. Day 30 is transformative.

Under the hood

The brain runs on infrastructure we already had: Cachly managed Valkey instances. Key schema:

cachly:lesson:best:{topic}      → serialized lesson JSON (severity, recall_count, …)
cachly:session:last             → last session summary + files changed
cachly:session:file_changes     → list of recent git diff --stat entries
cachly:ctx:{category}:{key}     → long-term context (architecture, patterns, …)
cachly:global:lesson:{topic}    → cross-instance global lessons
cachly:public:lesson:{fw}:{t}   → community lessons (Next.js, Go, Docker, …)

All reads are O(1) key lookups or lightweight pattern scans. The session_start() call does 4–6 Valkey reads and returns in under 50ms. There's no LLM involved in retrieval — it's pure key-value with optional pgvector semantic search for smart_recall().
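The read path can be sketched in a few lines. Key names follow the schema above; the KV interface and the shape of the returned briefing are illustrative assumptions, not the server's actual internals.

```typescript
// Sketch of the session_start read path: one lookup for the last-session
// header plus one per focus-relevant lesson topic, issued in parallel.
interface KV {
  get(key: string): Promise<string | null>;
}

async function buildBriefing(kv: KV, focusTopics: string[]) {
  const last = await kv.get("cachly:session:last");
  const raw = await Promise.all(
    focusTopics.map((t) => kv.get(`cachly:lesson:best:${t}`))
  );
  return {
    lastSession: last ? JSON.parse(last) : null,
    lessons: raw
      .filter((r): r is string => r !== null)
      .map((r) => JSON.parse(r)),
  };
}
```

Every read is an exact-key GET, which is why the whole briefing stays in the tens-of-milliseconds range: no LLM call, no scatter-gather query, just a handful of parallel key lookups.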

Team Brain and Public Brain

Two features we're especially excited about:

Team Brain — share one instance ID across your engineering team. When one developer learns that k3s requires the WireGuard IP in its TLS SAN, every AI assistant on the team gets that lesson. Lessons include an author field so you know whose hard-won experience you're benefiting from.

Public Brain — community-curated lessons for popular frameworks. Run import_public_brain("nextjs") and your brain immediately knows about common Next.js gotchas, App Router patterns, and deployment footguns — contributed by the community, anonymized, and deduplicated.
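A plausible merge policy for such an import, consistent with the dedup rule described earlier ("the better one wins"), is sketched below. The data shapes and the recall-count tiebreak are assumptions for illustration, not the actual import logic.

```typescript
// Illustrative merge policy for importing public lessons: an incoming lesson
// wins only if the local store has no stronger entry for that topic, so
// locally earned, frequently recalled lessons are never clobbered.
interface Lesson {
  topic: string;
  recall_count: number;
}

function importPublicLessons(
  local: Map<string, Lesson>,
  incoming: Lesson[]
): number {
  let imported = 0;
  for (const lesson of incoming) {
    const existing = local.get(lesson.topic);
    if (!existing || lesson.recall_count > existing.recall_count) {
      local.set(lesson.topic, lesson);
      imported++;
    }
  }
  return imported;
}
```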

Try it in 30 seconds

npx @cachly-dev/init

This detects your editor (Claude Code, Cursor, Copilot, Windsurf, Continue.dev), writes the MCP config, and verifies the brain is reachable. Free tier, no credit card.

Or manually add to your MCP config:

{
  "mcpServers": {
    "cachly": {
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server@latest"],
      "env": {
        "CACHLY_API_URL": "https://api.cachly.dev",
        "CACHLY_JWT": "your-api-token",
        "CACHLY_BRAIN_INSTANCE_ID": "your-instance-id"
      }
    }
  }
}

Give your AI a memory

Free instance, 30-second setup, no credit card.

AI Memory · MCP · Claude Code · Valkey · Developer Tools · Open Source