Memory Crystals: distilling team knowledge into instant AI context
A Brain with a hundred lessons is powerful — but loading all of them into every session is slow and noisy. Memory Crystals are our solution: a dense, pre-distilled snapshot of your team's most important knowledge, ready to inject the moment a session starts.
The scaling problem with AI memory
When we first built the AI Brain, we were focused on a simple problem: AI assistants don't remember anything between sessions. The fix was straightforward — store lessons, retrieve them at session start.
But as teams use the Brain over months, a new problem emerges. The Brain gets large. Not unusably large — a well-curated Brain with hundreds of lessons is genuinely valuable. But you can't inject hundreds of lessons into every session start. Context windows have limits, and beyond a certain size, the noise starts to overwhelm the signal.
The naive solution — just surface the top N lessons by severity and recall frequency — works up to a point. But it misses something: the lessons aren't independent. They have relationships, themes, and patterns that aren't visible when you look at them one by one. A good briefing isn't a list of facts — it's a coherent picture.
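For concreteness, the naive top-N selector might look like this (a hypothetical sketch; the `Lesson` fields and the scoring rule are assumptions, not cachly's actual ranking):

```python
from dataclasses import dataclass

@dataclass
class Lesson:
    text: str
    severity: int      # assumed scale, e.g. 1 (info) to 5 (critical)
    recall_count: int  # how often this lesson has been retrieved

def top_n(lessons: list[Lesson], n: int = 20) -> list[Lesson]:
    # Score each lesson in isolation: severity first, then recall frequency.
    return sorted(
        lessons,
        key=lambda l: (l.severity, l.recall_count),
        reverse=True,
    )[:n]
```

Note what this can't do: every lesson is scored on its own, so the relationships and themes between lessons never enter the picture.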
What a Crystal is
A Memory Crystal is a distillation of the entire Brain — or a subset of it — into a single dense document. Not a summary in the "here are the main points" sense, but a synthesized knowledge artifact that captures the patterns, the critical rules, the architectural decisions, and the hard-won lessons in a form optimized for rapid AI consumption.
Think of it like the difference between a textbook and a cheat sheet written by someone who has aced the exam. The textbook has everything; the cheat sheet has what actually matters, organized for fast recall under pressure.
Crystals are created explicitly — you decide when the Brain has enough accumulated knowledge to be worth distilling. Once created, a Crystal is injected at the very start of every session, before any task-specific context is loaded. Every AI assistant on your team starts every session pre-loaded with the Crystal.
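The injection order described above can be sketched as a simple concatenation, Crystal first (the helper name and joining format are illustrative, not cachly's API):

```python
def build_session_context(crystal: str, task_context: list[str]) -> str:
    # The Crystal always goes at the very front of the context window,
    # ahead of any task-specific material loaded later.
    return "\n\n".join([crystal, *task_context])
```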
The token economics
Context window space is a real constraint. Every token you spend loading context is a token you're not spending on actual work. This is why naive memory approaches fail at scale: they spend that scarce context indiscriminately.
Crystals are specifically optimized for token density. The distillation process isn't just compression — it's restructuring. Redundant lessons are merged. Contradictory lessons are resolved (the more recent, higher-severity version wins). Related lessons are grouped so the AI can process them as a unit rather than as isolated facts.
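The merge-and-resolve step could be sketched like this (all names and the grouping key are assumptions; a real distiller would group by semantic similarity, not an exact key):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Lesson:
    topic: str       # e.g. "deploy"
    subject: str     # what the lesson is about; naive grouping key
    text: str
    severity: int
    updated_at: int  # unix timestamp

def resolve(group: list[Lesson]) -> Lesson:
    # Contradiction rule from the text: the more recent,
    # higher-severity version wins.
    return max(group, key=lambda l: (l.updated_at, l.severity))

def distill(lessons: list[Lesson]) -> dict[str, list[Lesson]]:
    # Merge redundant/contradictory lessons per subject, then group
    # the survivors by topic so related lessons travel together.
    groups: dict[tuple[str, str], list[Lesson]] = defaultdict(list)
    for lesson in lessons:
        groups[(lesson.topic, lesson.subject)].append(lesson)
    crystal: dict[str, list[Lesson]] = defaultdict(list)
    for (topic, _), group in groups.items():
        crystal[topic].append(resolve(group))  # one survivor per subject
    return dict(crystal)
```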
Freshness: when to re-crystallize
A Crystal is a snapshot. As your team keeps learning and adding to the Brain, the Crystal gets stale. The lessons it doesn't yet contain are still available via normal retrieval — but they aren't in the pre-loaded context.
We solved this with two mechanisms. First, brain_doctor reports Crystal age and estimates freshness based on how many new lessons have been added since the last crystallization. When freshness drops below a threshold, it surfaces a recommendation.
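One way to estimate that freshness (the formula and the 0.8 threshold are assumptions for illustration; brain_doctor's actual heuristic isn't spelled out here):

```python
def freshness(lessons_at_crystallization: int, lessons_now: int) -> float:
    # Fraction of today's lessons that were already in the Brain
    # when the Crystal was cut.
    if lessons_now == 0:
        return 1.0
    return min(1.0, lessons_at_crystallization / lessons_now)

FRESHNESS_THRESHOLD = 0.8  # assumed cutoff for recommending a refresh

def should_recrystallize(at_crystallization: int, now: int) -> bool:
    return freshness(at_crystallization, now) < FRESHNESS_THRESHOLD
```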
Second, re-crystallization is cheap and fast. It's not a heavy operation that requires careful timing — you can re-crystallize as often as makes sense for your team's pace. Weekly is a reasonable cadence for most teams; daily is fine for fast-moving projects.
Scoped Crystals for large teams
As organizations get larger, a single Brain Crystal covers too much ground. A backend developer joining an incident investigation doesn't need the Crystal loaded with frontend styling conventions.
Crystals can be scoped by topic prefix — so the backend team might maintain a Crystal covering infrastructure, deployment, and API lessons, while the frontend team maintains one covering build tooling, performance patterns, and design system conventions. At session start, the right Crystal loads based on the session's declared focus area.
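Selecting a scoped Crystal by topic prefix could be as simple as the following (the Crystal names and prefixes are invented for illustration):

```python
SCOPED_CRYSTALS = {
    "backend": ("infra/", "deploy/", "api/"),
    "frontend": ("build/", "perf/", "design-system/"),
}

def select_crystal(focus: str) -> str:
    # Match the session's declared focus area against each Crystal's
    # topic prefixes; fall back to the broad, org-wide Crystal.
    for name, prefixes in SCOPED_CRYSTALS.items():
        if focus.startswith(prefixes):
            return name
    return "general"
```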
This scoping also helps with onboarding: new team members can choose to load a broader Crystal that covers the whole system at a higher level, then narrow to scoped Crystals as they specialize.
The analogy that helped us design it
When we were designing Crystals, we kept coming back to how expert developers think. A senior engineer who has been on a codebase for three years doesn't consciously recall every past incident before making a decision. They have pattern-matched, condensed versions of those experiences sitting in working memory — immediately accessible, highly compressed, ready to apply.
That's what a Crystal is for an AI assistant. Not a filing cabinet to search through, but a loaded mental model. The AI arrives at each session with the team's accumulated wisdom already active, in the same way a senior developer carries years of experience into every code review.
The Brain stores everything. The Crystal makes the most important parts of everything instantly available. Together, they give AI assistants something closer to genuine expertise — not just recall, but judgment.
cachly is a managed AI Brain for developers — persistent memory, team knowledge sharing, and semantic cache for Claude Code, Cursor, GitHub Copilot & Windsurf. One MCP server. 51 tools. Free tier, EU servers, no credit card.
Your AI is forgetting everything right now.
Every session starts blank. Every bug re-discovered. Every deploy procedure re-explained. cachly fixes that in 30 seconds — your AI remembers every lesson, every fix, every teammate's hard-won knowledge. Forever.