12,000 downloads in 23 days: MCP distribution and the memory pivot
An honest look at the numbers, the distribution channel, and the one repositioning that changed everything.
The numbers
@cachly-dev/mcp-server launched on npm on April 16, 2026. 23 days later: 11,953 downloads in the trailing 30-day window. Peak: 1,944 downloads in a single day (May 6).
That curve didn't come from a launch post — we didn't write one. No Product Hunt, no Reddit, no Show HN. Purely organic.
What we thought would work
We built cachly as a semantic cache for AI applications. The pitch: save LLM API costs by caching similar requests. Fast, EU-hosted, GDPR-compliant. The cache pricing angle was technically correct. It was also easy to defer. Nobody's LLM bill was a problem yet — and even if it was, setting up cache middleware requires a code change. High friction.
What actually drove installs: the pivot to memory
The developers who became power users weren't using cachly as a cache. They were using it as AI memory — storing lessons from debugging sessions, recalling past fixes, persisting context across editor restarts. We'd built memory infrastructure while trying to sell a cache.
The repositioning was blunt: “Your AI coding assistant forgets everything at the end of every session. cachly gives it a brain that persists.” Installs accelerated from that week onward.
The MCP distribution effect
MCP (Model Context Protocol) is the most efficient developer tool distribution channel we've seen. When you publish an MCP server on npm, you get discovered by:
- Claude's tool ecosystem (developers actively looking for tools)
- Cursor, Windsurf, Cline — all have MCP server discovery built in
- GitHub searches for “mcp server memory” and “mcp persistent context”
- “Top MCP servers” lists shared in developer communities
The search intent is already qualified. Someone who finds @cachly-dev/mcp-server while searching for “mcp server memory” has near-zero additional friction to install. One npx command, one config block, done.
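As a sketch of what that config block looks like — assuming the standard MCP stdio config shape used by clients like Claude Desktop and Cursor; the exact file location and server name vary by client:

```json
{
  "mcpServers": {
    "cachly": {
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server"]
    }
  }
}
```

The client launches the server over stdio on startup, so there's no code change to the user's application — which is exactly why the friction is so much lower than cache middleware.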
What the Brain has learned (meta-lesson)
Here's a sample of what our own Brain instance has stored from running cachly in production:
| Topic | What worked |
|---|---|
| infra:clickhouse-ipv6 | listen_host = 0.0.0.0 + 127.0.0.1 in healthcheck |
| bash:macos-lowercase | tr '[:upper:]' '[:lower:]' not ${var,,} |
| fix:stripe-webhook | express.raw() must come before json() middleware |
| deploy:docker | --force-recreate prevents stale container issues |
| fix:nextauth-session | Middleware must export config with matcher |
| fix:playwright-port | E2E uses port 4000, prod uses 3000 — use env var |
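To make one of those rows concrete: the bash:macos-lowercase lesson exists because macOS ships bash 3.2, while the ${var,,} lowercasing expansion was only added in bash 4. A minimal portable sketch of the fix the Brain stored:

```shell
#!/bin/sh
# ${var,,} fails on macOS's default bash 3.2; tr works on any POSIX shell.
var="Stripe-Webhook-Fix"
lower=$(printf '%s' "$var" | tr '[:upper:]' '[:lower:]')
echo "$lower"   # stripe-webhook-fix
```

It's a tiny fix, but exactly the kind an assistant re-derives from scratch every session unless something remembers it.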
Every new session — on any machine, by any engineer — benefits from these. That's the value proposition in practice.
What's next
23 days and ~12k downloads is a start. The users who benefit most — the ones who've recalled a critical fix that saved an hour of debugging — are the ones who stay and upgrade. The product bet: if the Brain is useful enough, developers will never want to work without it.
cachly is a managed AI Brain for developers — persistent memory, team knowledge sharing, and semantic cache for Claude Code, Cursor, GitHub Copilot & Windsurf. One MCP server. 51 tools. Free tier, EU servers, no credit card.
Your AI is forgetting everything right now.
Every session starts blank. Every bug re-discovered. Every deploy procedure re-explained. cachly fixes that in 30 seconds — your AI remembers every lesson, every fix, every teammate's hard-won knowledge. Forever.