Six steps. Ninety days.
A knowledge base that compounds.
The framework below combines a16z's context-layer construction model with the Forward Deployed Engineering practice the leading applied-AI shops use. We use it because it works — and because it matches the shape of every successful engagement we've seen. Each step is a concrete deliverable; each step is reviewable on its own.
Data accessibility
We catalog your existing data — the warehouse, the SaaS exports, the half-finished pipelines, the spreadsheets your operations team actually uses. Nothing gets thrown out. Everything gets a known path.
Most engagements start here because no one on the team can answer 'where does revenue actually come from?' without three different people in the room. We make that question one query.
Automated context construction
We extract the implicit context from query history, data-modeling tools, schema migrations, and tribal knowledge in tickets. The knowledge base starts seeded — not blank.
Your team has been writing this context for years in PR descriptions, runbooks, Slack threads, and incident post-mortems. We don't ask you to re-write it. We capture what's already there.
Human refinement
Captured context gets reviewed and refined by people who actually own the domain. Implicit, conditional, exception-laden knowledge — the stuff that lives in Sarah's head — gets written down once, used forever.
This is the step every other vendor skips. It's also the step that determines whether the system works. We staff it as a workshop series, not a back-office task.
Agent connection
Your AI agents, whether they're customer-facing, internal-tool helpers, or analytics assistants, connect to the knowledge base via API or MCP. They retrieve what they need, when they need it, with a full audit trail.
The knowledge base is a single source of truth. Adding a new agent is connecting another consumer, not building another silo.
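A minimal sketch of that idea, assuming a hypothetical in-memory `KnowledgeBase` class with a `retrieve` method (neither is a real product API, and the keyword match stands in for real retrieval). It shows two agents reading from one store, with each read written to an audit log:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class KnowledgeBase:
    """Hypothetical in-memory stand-in for the shared knowledge base."""
    entries: dict[str, str]
    audit_log: list[dict] = field(default_factory=list)

    def retrieve(self, agent_id: str, query: str) -> list[str]:
        # Naive keyword match; a real system would use proper retrieval.
        terms = query.lower().split()
        hits = [key for key, text in self.entries.items()
                if any(term in text.lower() for term in terms)]
        # Every read is recorded, whichever agent made it.
        self.audit_log.append({
            "agent": agent_id,
            "query": query,
            "returned": hits,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return hits


kb = KnowledgeBase(entries={
    "rev-001": "Revenue is recognized at invoice date, not order date.",
    "sup-004": "Refunds over 500 USD need manager approval.",
})

# Two different agents are just two consumers of the same store.
support_hits = kb.retrieve("support-agent", "refunds approval")
analytics_hits = kb.retrieve("analytics-agent", "revenue")
```

In this sketch, adding a third agent means one more `retrieve` call with a new `agent_id`; nothing about the store changes.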
Evals — prove it works
Before the system goes live, we build the evaluation framework that proves it earns its keep. We trace the way your best human handles a task and grade the agent on each step. We collect a small set of perfect-answer examples and measure every output against them. The goal is not 'looks good in a demo' — it's a defensible answer to 'is this actually working?'
Evals are how an executive trusts the agent will deliver ROI. Without them, every conversation about scope and budget devolves into opinion. With them, you have numbers. The two-technique approach — trace human steps + golden dataset — is the practitioner standard among Forward Deployed Engineering teams.
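The two techniques can be sketched in a few lines. This is an illustration under stated assumptions, not our production harness: the function names, the step labels, the stubbed agent, and the exact-match grading rule are all hypothetical.

```python
from typing import Callable


def step_score(expert_steps: list[str], agent_steps: list[str]) -> float:
    """Fraction of the expert's steps the agent also took, in the same order."""
    if not expert_steps:
        return 1.0
    position = 0
    matched = 0
    for step in expert_steps:
        try:
            # Only look at agent steps after the last matched one.
            position = agent_steps.index(step, position) + 1
            matched += 1
        except ValueError:
            continue  # the agent skipped this step
    return matched / len(expert_steps)


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def golden_accuracy(golden: dict[str, str],
                    agent_answer: Callable[[str], str]) -> float:
    """Share of golden questions the agent answers exactly (after normalizing)."""
    correct = sum(normalize(agent_answer(question)) == normalize(answer)
                  for question, answer in golden.items())
    return correct / len(golden)


# Technique 1: trace the human, grade the agent on each step.
expert = ["lookup_account", "check_refund_policy", "issue_refund"]
agent = ["lookup_account", "issue_refund"]
trace_result = step_score(expert, agent)

# Technique 2: a small golden dataset of perfect answers.
golden = {
    "What is the refund window?": "30 days",
    "Who approves refunds over 500 USD?": "A manager",
}


def stub_agent(question: str) -> str:
    # Stand-in for a real agent call.
    canned = {
        "What is the refund window?": "30 Days",
        "Who approves refunds over 500 USD?": "the finance team",
    }
    return canned[question]


golden_result = golden_accuracy(golden, stub_agent)
```

Exact match is the simplest possible grader; real answers usually need a rubric or a reviewer, but the shape of the measurement stays the same.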
Self-updating flows
The agents do their own curation. Every interaction either reinforces existing knowledge, surfaces a gap, or proposes an update for human review. The knowledge base stays current — not because someone is paid to maintain it, but because the system maintains itself.
This is the 'continuously learning' part. It is also where we differ from a one-time RAG-and-vector-database implementation. Static knowledge bases decay. Ours doesn't.
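As a rough sketch of the three outcomes described above (reinforce, surface a gap, propose an update), assuming a hypothetical `triage` function and a plain dictionary standing in for the knowledge base:

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    REINFORCE = "reinforce"
    GAP = "gap"
    PROPOSE_UPDATE = "propose_update"


@dataclass
class Interaction:
    topic: str
    observed: str  # what the agent saw in practice


def triage(kb: dict[str, str], interaction: Interaction) -> Outcome:
    known = kb.get(interaction.topic)
    if known is None:
        return Outcome.GAP  # nothing recorded on this topic yet
    if known == interaction.observed:
        return Outcome.REINFORCE  # practice matches what is recorded
    return Outcome.PROPOSE_UPDATE  # practice disagrees with the record


kb = {"refund_window": "30 days"}
review_queue: list[tuple[Outcome, Interaction]] = []

for item in [
    Interaction("refund_window", "30 days"),
    Interaction("refund_window", "45 days"),
    Interaction("shipping_sla", "2 business days"),
]:
    outcome = triage(kb, item)
    if outcome is not Outcome.REINFORCE:
        # Gaps and proposed updates wait for a human; nothing is overwritten here.
        review_queue.append((outcome, item))
```

The knowledge base itself is never changed inside the loop. Gaps and conflicts only land in `review_queue`, which is where human review would pick them up.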
Stateful agents — not stateless workflows — paired with a knowledge base that grows from the work itself.
The six steps above describe the engagement shape. This section is the architectural distinction that determines what you actually receive at the end of it.
The workflow-vs-agent line
In December 2024, Anthropic published the canonical industry taxonomy in a post titled “Building Effective Agents.” The distinction is precise: a workflow is a system where LLMs and tools are orchestrated through predefined code paths. An agent is a system where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
Most products marketed as “AI agents” today are workflows by this definition — prompt chaining, routing, orchestrator-worker patterns with LLM steps. They are useful. They are not agents.
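The difference is easiest to see side by side. In this sketch the tools and the `stub_model` function are hypothetical placeholders (a deterministic stand-in for an LLM call, so the example runs without one):

```python
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda text: f"results for {text}",
    "summarize": lambda text: f"summary of {text}",
}


def run_workflow(task: str) -> str:
    """Workflow: the code fixes the path (search, then summarize)."""
    return TOOLS["summarize"](TOOLS["search"](task))


def stub_model(task: str, history: list[str]) -> str:
    """Stand-in for an LLM call that picks the next tool, or 'done'."""
    if not history:
        return "search"
    if len(history) == 1 and "results" in history[-1]:
        return "summarize"
    return "done"


def run_agent(task: str, max_steps: int = 5) -> list[str]:
    """Agent: the model chooses each step; the code only runs the loop."""
    history: list[str] = []
    for _ in range(max_steps):
        choice = stub_model(task, history)
        if choice == "done":
            break
        tool_input = history[-1] if history else task
        history.append(TOOLS[choice](tool_input))
    return history


workflow_output = run_workflow("Q3 churn")
agent_steps = run_agent("Q3 churn")
```

Here the stub happens to choose the same path as the workflow. The point is where the decision lives: in `run_workflow` the order is written into the code, while in `run_agent` a different model response would produce a different sequence of tool calls without any code change.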
Letta's critique
Letta, the UC Berkeley team behind the MemGPT memory research (funded with $10M from Felicis on September 23, 2024), sharpened the critique further:
“Most ‘agents’ today are essentially stateless workflows: they have no way to persist interactions beyond what fits into the context window.”
What our stateful agents do differently
Our agents persist across sessions. They have identity, memory that consolidates over time, accumulated experience that informs the next task. They are embedded in your environment — runtime, data layer, operational systems — and observe directly instead of sitting in a chat window. They log everything: every observation, every action, every outcome, every correction. They distill those raw signals into structured knowledge entries — guardrails, reasoning rules, pattern signatures, tree articles — that the next agent invocation can use.
The knowledge base from steps 1-6 above is not a separate artifact you have to maintain. It is the byproduct of the agents doing real work in your environment. The agents populate it. The agents read from it. The agents propose improvements to it. Your team reviews and steers.
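A toy version of persistence and distillation, assuming a hypothetical `StatefulAgent` class that stores its state in a JSON file. The file layout and the "any failed action becomes a guardrail" rule are illustrative simplifications, not our actual storage format or distillation logic:

```python
import json
import tempfile
from pathlib import Path


class StatefulAgent:
    def __init__(self, agent_id: str, store: Path) -> None:
        self.agent_id = agent_id
        self.path = store / f"{agent_id}.json"
        if self.path.exists():
            # A new session picks up where the last one stopped.
            self.memory = json.loads(self.path.read_text())
        else:
            self.memory = {"log": [], "knowledge": []}

    def record(self, observation: str, action: str, outcome: str) -> None:
        """Append one raw signal to the log and persist it."""
        self.memory["log"].append(
            {"observation": observation, "action": action, "outcome": outcome}
        )
        self._save()

    def distill(self) -> None:
        """Toy rule: any failed action becomes a guardrail entry."""
        failed = {e["action"] for e in self.memory["log"]
                  if e["outcome"] == "failed"}
        for action in sorted(failed):
            entry = {"type": "guardrail", "rule": f"avoid: {action}"}
            if entry not in self.memory["knowledge"]:
                self.memory["knowledge"].append(entry)
        self._save()

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.memory, indent=2))


store = Path(tempfile.mkdtemp())

first = StatefulAgent("ops-agent", store)
first.record("nightly job slow", "rerun without index", "failed")
first.distill()

# A later invocation with the same identity sees what the first one learned.
second = StatefulAgent("ops-agent", store)
```

The second instance never saw the failure itself, yet it starts with the guardrail already in `second.memory["knowledge"]`.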
Why this combination makes the knowledge base the best in the industry
A static knowledge base (Notion, Confluence) decays, and humans pay the upkeep cost forever. A RAG-augmented chatbot retrieves but does not learn. An LLM-curated personal wiki (the Karpathy workflow) works for one user. A stateful agent without a shared knowledge base remembers per-agent but doesn't compound across agents. Only the combination of stateful agents, a continual-learning knowledge base, environment embedding, and state-signal distillation produces a knowledge base that grows automatically from real operational work and serves every agent that draws from it. That is what we ship.
What changes from a typical RAG project?
Every learning is sourced from a specific interaction, scored for confidence, and retrievable in audit-ready form. Useful for FedRAMP / FISMA work. Useful for SOC 2. Useful when you need to explain a decision six months later.
The agent reflects, scores its own outputs, retires stale entries, and promotes high-utility patterns. The work that consultancies normally bill 4-hour sessions for happens automatically, every iteration.
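One way to picture sourced, scored, self-curating entries is the sketch below. The `Entry` fields, the `curate` function, and both thresholds are hypothetical values chosen for illustration:

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    source_interaction: str  # provenance: which interaction produced this
    confidence: float        # 0.0 to 1.0
    uses: int = 0            # how often agents have relied on it
    status: str = "active"


def curate(entries: list[Entry],
           retire_below: float = 0.3,
           promote_uses: int = 10) -> list[Entry]:
    """Retire low-confidence entries; promote heavily used ones."""
    for entry in entries:
        if entry.confidence < retire_below:
            entry.status = "retired"
        elif entry.uses >= promote_uses:
            entry.status = "promoted"
    return entries


entries = curate([
    Entry("Invoices post at month end", "ticket-1042", confidence=0.9, uses=14),
    Entry("Vendor X ships in 2 days", "chat-0317", confidence=0.2, uses=3),
    Entry("Refunds need approval", "call-0088", confidence=0.7, uses=4),
])
statuses = [entry.status for entry in entries]
```

Because every entry keeps its `source_interaction`, a retired or promoted entry can still be traced back to where it came from.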
Most consultants leave a project and start the next one from zero. When we finish a project, the next one starts from everything the previous N taught us. Your price reflects the cost of adding to what we already know.
Built for your sector.
The framework is the same. The compliance posture, pricing shape, and engagement cadence change depending on whether you're a government program office or a 30-person operating business.