● lab 07 | ~10 min | masterclass

There is not one memory. There are four, and a real agent blends them.

Lab 06 gave you one durable fact store. But a working agent runs on four different kinds of memory at once, and the most common bug is mixing them up. This lab names the four. It puts each one in the right store. Then it composes them into the prompt the agent reads on every turn. Working state goes in a scratchpad the agent re-reads. What happened goes in a log you can replay. Durable facts go in a guarded store. The agent's behavior rules go in a file loaded every turn. That blend is what the VCN (Vibe Coding Nights) #33 workshop called the hybrid stack: the four stores wired together as one memory system. The blend is the product.

step 1

Name the four. One store per type.

Four kinds of memory. Four jobs. Four stores. Working memory is the current task's live state: a scratchpad, or a to-do list. It is small and short-lived. You re-read it on every turn so the model does not forget what it is in the middle of. Episodic memory is what happened: a log of past runs that you only ever add to, and can replay later. Semantic memory is durable facts about the world. That is the Lab 06 store. In a bigger system this is a vector database (a search index that finds text by meaning, not just keywords) or a tool like Mem0. Procedural memory is how the agent behaves: rules that do not change from one task to the next, loaded every turn from a file. A CLAUDE.md is exactly this. Every store here uses only the Python standard library, the modules that ship with Python so you install nothing. Save the file as memory_mix.py.

memory_mix.py

import json
from dataclasses import dataclass, field
from pathlib import Path

ROOT = Path("mix")          # everything lives under one folder for the lab
ROOT.mkdir(exist_ok=True)

EPISODIC = ROOT / "episodes.jsonl"   # what happened: append-only log
SEMANTIC = ROOT / "facts.json"       # durable facts: the Lab 06 shape, by name
RULES    = ROOT / "rules.md"         # procedural: how the agent behaves

# Seed a procedural rules file once. In a real repo this is your CLAUDE.md;
# it is loaded every turn and it does NOT change per task.
if not RULES.exists():
    RULES.write_text(
        "# Agent rules (procedural memory)\n"
        "- Prefer pnpm. Never mix package managers.\n"
        "- If a memory names a file or flag, verify it exists before acting.\n"
        "- Ship only after a green smoke test on the live URL.\n",
        encoding="utf-8",
    )


@dataclass
class MemoryMix:
    """Four backends, one per memory type. Working lives in RAM (volatile);
    the other three persist to disk."""
    working: dict = field(default_factory=dict)   # volatile scratchpad / todo

    # -- WORKING: volatile live state for the current task --------------------
    def set_task(self, task, todo):
        self.working = {"task": task, "todo": list(todo)}

    def check_off(self, item):
        todo = self.working.get("todo", [])
        self.working["todo"] = [t for t in todo if t != item]

    # -- EPISODIC: append one line per thing that happened --------------------
    def log_episode(self, kind, detail):
        rec = {"kind": kind, "detail": detail}
        with EPISODIC.open("a", encoding="utf-8") as f:
            f.write(json.dumps(rec) + "\n")

    def episodes(self):
        if not EPISODIC.exists():
            return []
        return [json.loads(ln) for ln in
                EPISODIC.read_text(encoding="utf-8").splitlines() if ln.strip()]

    # -- SEMANTIC: durable facts keyed by name (the Lab 06 store) -------------
    def write_fact(self, name, description, body):
        facts = self._facts()
        facts[name] = {"description": description, "body": body}
        SEMANTIC.write_text(json.dumps(facts, indent=2), encoding="utf-8")

    def _facts(self):
        if not SEMANTIC.exists():
            return {}
        return json.loads(SEMANTIC.read_text(encoding="utf-8"))

    # -- PROCEDURAL: load the rules file, every turn -------------------------
    def rules(self):
        return RULES.read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    mix = MemoryMix()
    # Write one of each, so you can see all four stores light up.
    mix.set_task("ship the labs page", ["build", "smoke test", "deploy"])  # working
    mix.log_episode("deploy", "shipped labs v1 to prod, smoke green")      # episodic
    mix.write_fact("deploy-cmd",
                   "how this project ships",
                   "Run scripts/deploy.sh from the repo root.")            # semantic
    print("rules loaded:", len(mix.rules().splitlines()), "lines")         # procedural
    print("working todo:", mix.working["todo"])
    print("episodes on disk:", len(mix.episodes()))

Run it. You now have three files on disk: mix/rules.md (procedural), mix/episodes.jsonl (episodic), and mix/facts.json (semantic). The working scratchpad lives only in memory while the program runs. Four stores, four lifetimes. Working memory disappears when the program ends. Episodic memory grows forever. Semantic memory is the durable truth. Procedural memory is the set of standing rules. The bug everyone ships is collapsing all four into one file. Keep them apart and each one can do its one job well.
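
"A log you can replay" is easier to see in a few lines. This is a minimal sketch, assuming the same one-JSON-record-per-line shape as episodes.jsonl. The replay helper and its kind filter are hypothetical additions, not part of the lab file, and the demo runs in a throwaway folder so it leaves mix/ alone.

```python
import json
import tempfile
from pathlib import Path


def append_episode(log_path, kind, detail):
    """Append one JSON record per line; existing lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"kind": kind, "detail": detail}) + "\n")


def replay(log_path, kind=None):
    """Yield episodes oldest-first, optionally filtered by kind (hypothetical helper)."""
    if not log_path.exists():
        return
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            rec = json.loads(line)
            if kind is None or rec["kind"] == kind:
                yield rec


# Demo in a temporary folder so it does not touch the lab's mix/ directory.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "episodes.jsonl"
    append_episode(log, "build", "labs page built clean")
    append_episode(log, "deploy", "shipped labs v1 to prod, smoke green")
    append_episode(log, "build", "labs page rebuilt after fix")
    builds = [e["detail"] for e in replay(log, kind="build")]

print(builds)
```

Because the log is only ever appended to, replay order is write order. That is what lets you step back through a run after the fact.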

step 2

Compose the turn. Order matters, and recite the todo last.

The agent never dumps all four stores raw into the prompt. It composes them in priority order, and trims them to fit a size budget. Procedural rules go first, because they always apply. Then the most relevant semantic facts. Then the recent episodic history. Then the working scratchpad. Now one extra move, a trick the team behind the Manus agent wrote up: print the working to-do list again at the very end. Models tend to pay the most attention to the start and the end of a long prompt, so anything stuck in the middle can get lost. Repeating the to-do at the end keeps the current task in clear view. Add this method inside the MemoryMix class, above the if __name__ block.

add to memory_mix.py

    # -- COMPOSE: build the turn's context in priority order -----------------
    def assemble(self, task, budget_chars=1200):
        """Order: procedural -> semantic -> episodic -> working, trimmed to a
        char budget. Then recite the working todo at the very END so recent
        attention holds it (the Manus recitation trick)."""
        q = set(task.lower().split())
        parts = []

        # 1. PROCEDURAL first -- the rules always apply.
        parts.append("[RULES]\n" + self.rules())

        # 2. SEMANTIC next -- only facts whose description matches the task.
        facts = self._facts()
        relevant = [f["body"] for f in facts.values()
                    if q & set(f["description"].lower().split())]
        if relevant:
            parts.append("[FACTS]\n" + "\n".join(relevant))

        # 3. EPISODIC -- the most recent few things that happened.
        recent = self.episodes()[-3:]
        if recent:
            lines = [e["kind"] + ": " + e["detail"] for e in recent]
            parts.append("[RECENT]\n" + "\n".join(lines))

        # 4. WORKING -- the live scratchpad.
        todo = self.working.get("todo", [])
        if todo:
            parts.append("[WORKING] task=" + self.working.get("task", "") +
                         "\ntodo: " + ", ".join(todo))

        context = "\n\n".join(parts)
        if len(context) > budget_chars:          # trim from the MIDDLE, keep
            head = context[: budget_chars // 2]   # rules (front) and working
            tail = context[-budget_chars // 2:]   # (back) intact
            context = head + "\n...[trimmed]...\n" + tail

        # RECITATION: working todo printed again, last, so it stays in view.
        if todo:
            context += "\n\n[RECITE -> do next] " + ", ".join(todo)
        return context

The order is the whole point. Rules come first because they are never optional. Facts and recent history sit in the middle, where cutting some is survivable. The working to-do comes last because that is what the model needs in front of it when it picks the next action. So when the prompt is too big, trim from the middle, never from the ends. That way you never cut the rules and you never cut the current task.
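
The assemble method above trims by slicing characters out of the middle. Another way to honor "never from the ends" is to drop whole middle sections. This is a sketch under stated assumptions: trim_sections is a hypothetical helper, and dropping the episodic section before the semantic one is a guess at priority, not a rule from the lab.

```python
def trim_sections(parts, budget_chars):
    """Drop whole middle sections until the joined text fits the budget.

    parts is ordered [rules, ..., working]. The first and last entries are
    never dropped. Middle sections are dropped from the back (episodic before
    semantic), which is an assumption about priority.
    """
    kept = list(parts)
    sep = "\n\n"
    while len(sep.join(kept)) > budget_chars and len(kept) > 2:
        del kept[-2]          # the middle section closest to the end
    return sep.join(kept)


parts = [
    "[RULES]\n- Prefer pnpm.",
    "[FACTS]\nRun scripts/deploy.sh from the repo root.",
    "[RECENT]\nbuild: labs page built clean",
    "[WORKING] task=ship the labs page\ntodo: smoke test, deploy",
]
full = trim_sections(parts, budget_chars=1200)   # everything fits
tight = trim_sections(parts, budget_chars=150)   # [RECENT] is dropped
print(tight)
```

The trade-off: section-level trimming never cuts a sentence in half, but it is coarser. If the rules and the working section alone exceed the budget, this sketch returns them anyway rather than cut either end.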

drop this at the bottom of memory_mix.py and run: the four sections, composed in order

mix = MemoryMix()
mix.set_task("ship the labs page", ["smoke test", "deploy"])
mix.log_episode("build", "labs page built clean")
mix.write_fact("deploy-cmd", "how this project ships",
               "Run scripts/deploy.sh from the repo root.")
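print(mix.assemble("ship the labs page how does it ship"))
# -> on a fresh mix/ folder, with the step 1 demo block removed. Otherwise
#    [RECENT] shows the last three episodes on disk, including earlier runs.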
# [RULES]
# # Agent rules (procedural memory)
# - Prefer pnpm. Never mix package managers.
# ...
#
# [FACTS]
# Run scripts/deploy.sh from the repo root.
#
# [RECENT]
# build: labs page built clean
#
# [WORKING] task=ship the labs page
# todo: smoke test, deploy
#
# [RECITE -> do next] smoke test, deploy
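
One thing worth knowing about that run: assemble matches on any shared word, so the deploy fact was picked because the task and the description both contain "how". A slightly stricter matcher can be sketched like this. The stop-word list and the crude plural stripping are assumptions for illustration, and keywords and relevant_facts are hypothetical names.

```python
STOP_WORDS = {"a", "an", "the", "how", "does", "it", "this", "is", "to", "of", "from"}


def keywords(text):
    """Lowercase, drop stop words, and strip a trailing plural 's' (very crude)."""
    words = set()
    for w in text.lower().split():
        w = w.strip(".,:;!?")
        if not w or w in STOP_WORDS:
            continue
        if len(w) > 3 and w.endswith("s"):
            w = w[:-1]
        words.add(w)
    return words


def relevant_facts(task, facts):
    """Return bodies of facts whose description shares a keyword with the task."""
    q = keywords(task)
    return [f["body"] for f in facts.values() if q & keywords(f["description"])]


facts = {
    "deploy-cmd": {"description": "how this project ships",
                   "body": "Run scripts/deploy.sh from the repo root."},
    "style": {"description": "how the CSS is organised",
              "body": "Styles live in one file."},
}
hits = relevant_facts("ship the labs page how does it ship", facts)
print(hits)
```

With plain word overlap both facts would match, on "how" and "the". Here only the deploy fact matches, because "ship" and "ships" reduce to the same keyword. In a bigger system this is the job the vector database does.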

step 3

The mix recipe, and promotion from episodic to semantic.

Each type has its own rule for when you write to it, and its own rule for when you read it back. Keep this table in your head. It is the difference between an agent that composes a clean prompt and one that fills its long-term memory with short-lived junk.

the mix recipe

type        store              write policy              recall when
----------  -----------------  ------------------------  -----------------------
WORKING     RAM scratchpad     implicit, every action    every turn (recited last)
EPISODIC    episodes.jsonl     append on every event     pull recent / on replay
SEMANTIC    facts.json         explicit, deliberate      on relevance to the task
PROCEDURAL  rules.md           rare, by a human edit     every turn (loaded first)

Now the move that makes the mix smart: promotion. When the same fact shows up again and again across past runs, it has earned a place in durable memory. So you scan the episodic log, and any detail that repeats gets copied into the semantic store. From then on it is recalled by relevance, like any other durable fact. Carry the Lab 06 poison guard forward here. Before you promote a fact, if it names a file path or a command flag, re-check that against the live system. That way you never copy a stale, no-longer-true claim into durable memory. Add these methods inside the MemoryMix class, above the if __name__ block.

add to memory_mix.py

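    import re   # class-level so this paste-in block stands alone; in a real
                # module, move it to the top of the file with the other imports

    # Reuse the Lab 06 guard: anything reality can confirm, confirm it.
    # html is in the list so a claim about lab/index.html is actually checked.
    _PATH_RE = re.compile(
        r"\b[\w./-]+\.(?:sh|py|ya?ml|json|toml|md|txt|js|ts|html?)\b")

    def _verify_fresh(self, body):
        """Return (ok, problems). A named path that does not exist is stale.
        Paths are resolved against the current working directory."""
        problems = []
        for path in self._PATH_RE.findall(body):
            if not Path(path).exists():   # Path is imported at the top of the file
                problems.append("path does not exist: " + path)
        return (len(problems) == 0, problems)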

    def promote_recurring(self, min_count=2):
        """Promote any episodic detail that recurs >= min_count times into the
        semantic store -- but only after the Lab 06 freshness guard clears it."""
        from zlib import crc32   # stable checksum, used for the fact name below

        counts = {}
        for e in self.episodes():
            counts[e["detail"]] = counts.get(e["detail"], 0) + 1

        promoted = []
        for detail, n in counts.items():
            if n < min_count:
                continue
            ok, problems = self._verify_fresh(detail)
            if not ok:
                print("SKIP (stale, not promoted): " + detail)
                for p in problems:
                    print("  - " + p)
                continue
            # Built-in hash() of a str changes between Python processes, so a
            # rerun would store the same fact under a new name. crc32 is stable.
            name = "learned-" + str(crc32(detail.encode("utf-8")) % 10000)
            # Put the fact's own words in the description so assemble() can
            # match it against a task.
            self.write_fact(name, "learned: " + detail, detail)
            promoted.append((name, detail, n))
        return promoted

Promotion is how an agent turns experience into knowledge. The working loop produces episodes. The episodes that keep proving true get lifted into semantic memory. And the guard makes sure only true things get lifted. That last part is the carry-forward from Lab 06, and it is what stops the mix from quietly poisoning itself.
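
To see what the guard does and does not catch, here is a self-contained sketch you can run anywhere. It is a variant, not the lab's method: verify_fresh takes a root folder (a hypothetical parameter) so the demo can use a temporary directory, and the extension list, including html, is an assumption.

```python
import re
import tempfile
from pathlib import Path

# Extension list is illustrative; "html?" is included here as an assumption.
PATH_RE = re.compile(r"\b[\w./-]+\.(?:sh|py|ya?ml|json|toml|md|txt|js|ts|html?)\b")


def verify_fresh(body, root):
    """Return (ok, problems): every path named in body must exist under root."""
    problems = []
    for path in PATH_RE.findall(body):
        if not (Path(root) / path).exists():
            problems.append("path does not exist: " + path)
    return (not problems, problems)


with tempfile.TemporaryDirectory() as tmp:
    # Create lab/index.html so one claim is true and the other is stale.
    (Path(tmp) / "lab").mkdir()
    (Path(tmp) / "lab" / "index.html").write_text("<h1>labs</h1>", encoding="utf-8")

    good = verify_fresh("the labs hub lives at lab/index.html", tmp)
    stale = verify_fresh("deploy with scripts/old_deploy.sh", tmp)
    no_path = verify_fresh("prefer pnpm over npm", tmp)

print(good)
print(stale)
print(no_path)
```

Note the third case. A claim that names no recognizable path passes untouched, so the guard only protects against the kinds of staleness it knows how to look for. An extension missing from the list has the same effect.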

log the same fact twice, then promote: it clears the guard and lands in semantic memory

mix = MemoryMix()
mix.log_episode("note", "the labs hub lives at lab/index.html")
mix.log_episode("note", "the labs hub lives at lab/index.html")   # recurs
print(mix.promote_recurring(min_count=2))
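# (lab/index.html must exist relative to where you run this -> guard clears -> promoted;
#  details that recurred in earlier runs are promoted too)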
# -> [('learned-XXXX', 'the labs hub lives at lab/index.html', 2)]
#
# A recurring fact that named scripts/old_deploy.sh instead would print:
# SKIP (stale, not promoted): ...
#   - path does not exist: scripts/old_deploy.sh

There is no "memory." There are four, and mixing them up is the bug.
VCN #33 | Hybrid Stack

Cram everything into one store and you get the classic failures. Behavior rules rot into stale "facts." Short-lived working state leaks into long-term memory, so last week's half-finished to-do gets recalled as if it were still true. The log of past runs grows so large it fills the prompt, and the model drowns in its own history and loses the actual task. One bucket cannot serve four lifetimes.

The mix is the fix: working state in a scratchpad you re-read, past runs in a log you can replay, durable facts in a guarded store, and behavior rules in a file loaded every turn. There are two ways to decide what gets written, so pick one per type. Anthropic's memory tool makes the write explicit: the agent chooses to write, and the write shows up in the run's record, so you can see exactly what it decided to remember. The tool Mem0 makes the write implicit: it quietly reads the conversation and pulls out whatever it guesses is worth keeping. Neither one is "memory" on its own. The full shape from the VCN #33 hybrid stack is all four working together: a loop holds the working state, a search index holds the durable facts, a CLAUDE.md holds the behavior rules, and a log holds the past runs.
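
The explicit versus implicit split can be shown with a toy. This sketch does not use or imitate the API of Anthropic's memory tool or of Mem0. ExplicitStore, remember, and implicit_extract are hypothetical names, and the "we use X for Y" pattern is an invented heuristic that stands in for whatever a real extractor does.

```python
import re


class ExplicitStore:
    """Writes happen only when the caller asks, and each one is recorded."""

    def __init__(self):
        self.facts = {}
        self.audit = []          # visible record of every write decision

    def remember(self, name, body):
        self.facts[name] = body
        self.audit.append("wrote " + name)


# Toy heuristic: treat "we use X for Y" as a fact worth keeping.
_USE_RE = re.compile(r"\bwe use (\w+) for (\w+)", re.IGNORECASE)


def implicit_extract(messages):
    """Scan a conversation and guess at facts; nobody asked for these writes."""
    found = {}
    for msg in messages:
        for tool, purpose in _USE_RE.findall(msg):
            found[purpose.lower()] = tool
    return found


store = ExplicitStore()
store.remember("deploy-cmd", "Run scripts/deploy.sh from the repo root.")

chat = ["We use pnpm for installs.", "Looks good, ship it.", "we use pytest for tests"]
guessed = implicit_extract(chat)
print(store.audit)
print(guessed)
```

The explicit store leaves a trail you can audit. The implicit extractor catches things nobody thought to save, and will also catch things that were never meant to last. That is why the choice is made per memory type.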

single-bucket agent

My notes file has my coding rules, three half-finished tasks, and a fact from last week all in one place. I cannot tell which is which, so I just paste the whole thing in and hope. [drowns by turn 20]

the mix

Procedural rules loaded first. 2 relevant facts pulled by description. Working todo recited last so it stays in view. Clean context, every section doing one job. [PASS]

One agent guesses from a junk drawer. The other builds the prompt from four clean stores, taking exactly what the turn needs from each. That second one is the shape of every real, working agent that does not lose its mind.

hand this to your coding agent

Give <agent> a memory mix instead of one notes file: a working scratchpad I
recite at the end of each turn, an episodic jsonl log of past runs, a semantic
fact store guarded against stale writes, and a procedural rules.md loaded every
turn. Write assemble(task) that orders them procedural -> semantic -> episodic
-> working and trims to a size budget, and promote a fact from episodic to
semantic when it recurs.

checkpoint

Your agent now runs on four kinds of memory, each in the right store, built into a clean prompt every turn. Working memory is re-read so it stays in view. Episodic memory replays what happened. Semantic memory holds the guarded, durable truth. Procedural memory is the rules file, loaded first. Facts that keep recurring get promoted from the run log into durable memory, and the Lab 06 freshness guard keeps a stale one from slipping through. That is the hybrid stack, and it is the shape of every real, working agent that does not lose its mind by turn 30.
