Teaching AI to Remember: Building a Self-Learning Memory Bank for Coding Assistants
AI coding assistants are powerful but forgetful. Every new session starts from zero. You explain the same architecture, re-describe the same patterns, and re-establish the same constraints. The more complex your codebase, the more painful this becomes.
Memory banks are the first step toward fixing this. This post describes the approach I built at working at Buildertrend, how it differs from Cline's original concept, and what we learned running it across multiple production repos.
The original idea: Cline's memory bank
The memory bank concept was popularized by Cline, a VS Code AI coding assistant. The idea is straightforward: maintain a set of markdown files that describe your project, and instruct the AI to read them at the start of each session and update them as it works.
Cline's memory bank uses six core files, all in a flat directory:
memory-bank/
projectBrief.md -- what the project is
productContext.md -- why it exists, who uses it
techContext.md -- tech stack, dependencies, setup
systemPatterns.md -- architecture, conventions
activeContext.md -- what you're working on now
progress.md -- what's done, what's left
The AI reads these at session start via custom instructions. When it finishes work, it updates the relevant files. The user can also trigger updates manually.
This is a genuine improvement over having no persistent context at all. But after adopting it across several repos, we found it left important things on the table.
What I changed: short-term and long-term memory
Cline treats all six files the same way. They sit in one directory and get read together. But in practice, these files have very different lifespans. techContext.md (your tech stack) changes rarely. activeContext.md (what you're working on right now) changes every session. Treating them identically creates two problems: stable knowledge gets accidentally overwritten during active work, and current state bleeds into files that should be permanent reference material.
I split the memory bank into two tiers:
_memory_bank/
short-term-memory/
projectBrief.md -- current project scope
activeContext.md -- what's happening right now
progress.md -- status and known issues
long-term-memory/
productContext.md -- why the project exists
systemPatterns.md -- architecture and patterns
techContext.md -- tech stack and setup
This is not just organizational. The directory structure encodes a rule: short-term files change frequently and get cleared between projects. Long-term files accumulate knowledge over time and rarely need wholesale updates.
Automated learning: the end-project workflow
The most important difference from Cline's approach is how knowledge flows between the two tiers.
In Cline's system, updating the memory bank is a manual process. The user says "update memory bank" and the AI reviews the files. There is no structured mechanism for promoting learnings from active work into permanent knowledge. Whatever the AI writes during a session stays where it was written, and cleaning it up is the user's responsibility.
I built an explicit learning workflow into the memory bank instructions. When a project ends, the AI follows a defined sequence:
- Review all memory bank files
- Add information learned during the project to short-term memory files, merging with existing content
- Move valuable learnings from short-term files to the appropriate long-term files
- Clear short-term files to prepare for the next project
This is the "end project" command. It forces a structured knowledge transfer. Patterns discovered during active development get promoted to systemPatterns.md. Technical setup learned through trial and error gets added to techContext.md. The active context and progress files get cleaned out so the next project starts fresh, but the knowledge persists.
The prompt instructions encode this explicitly:
Move any learnings from short-term memory files (progress.md, activeContext.md, projectBrief.md) to long-term memory files (systemPatterns.md, productContext.md, techContext.md) as appropriate.
This matters because AI assistants are bad at retroactive organization. They are good at following structured workflows in the moment. By embedding the learning transfer into a defined command, we get reliable knowledge promotion instead of hoping the AI decides to do it on its own.
How context loading works
I configured the memory bank through an agents.md file that the AI reads at session start. Trigger phrases activate different behaviors:
- "read the memory bank" or "check the memory bank" loads all files
- "update memory bank" triggers a comprehensive review of all files
- "end project" triggers the short-to-long-term learning transfer
The AI reads short-term files first (to understand current state), then long-term files (to understand stable context). This loading order means the AI always has both the "where are we now" and the "how things work here" layers.
What I learned running it in production
After several months of use across our dbt and Databricks Asset Bundles repos, some patterns emerged.
The short-term/long-term split prevents context rot. Before the split, systemPatterns.md would accumulate session-specific notes that became irrelevant within days. With the split, temporary observations go into activeContext.md and only get promoted to long-term files through the explicit end-project workflow. Long-term files stay clean.
The end-project workflow is the most valuable feature. Without it, learnings die with the session. With it, every project leaves behind durable knowledge. Our systemPatterns.md grew from a skeleton to a comprehensive reference that new team members' AI sessions could immediately use.
The trigger phrase model is the weakest link. The entire system depends on the AI reading instructions and the user saying the right phrases. Sometimes the AI skips reading the memory bank. Sometimes the user forgets to say "end project" and learnings are lost. The system works, but it's not reliable in the way that automated systems are reliable.
Flat structure doesn't scale. Our systemPatterns.md grew to 500+ lines covering everything from DLT pipeline patterns to MongoDB replication to incremental loading strategies. Loading the entire file into context for every session wastes tokens and dilutes signal. When working on a specific area of the codebase, most of that context is irrelevant.
Team scaling requires discipline. Every developer's AI session needs to independently follow the same instructions. There is no enforcement. One developer who never says "end project" creates a knowledge gap for everyone.
Where this leads
The memory bank with short-term/long-term tiers and automated learning transfer is a meaningful improvement over Cline's flat approach. The explicit knowledge promotion workflow is the key differentiator. It turns the memory bank from a static reference document into a learning system that gets smarter over time.
But the fundamental limitation remains: it's instruction-driven. It works when the AI cooperates. It fails silently when it doesn't. The trigger phrases are a workflow tax. The flat file structure doesn't scale.
These limitations led us to build the Librarian, a hook-driven system that replaces instructions with automation and flat files with hierarchical, repo-mirrored context. Stay tuned for that post.
Comments ()