Most agent loops in 2025 picked a fitness function once and held it constant. SYNAPSE was an attempt to put the decision about what "good" means for this run inside the loop.
The shape of the loop
Most agent frameworks in mid-2025 read the same way. A planner emits a step. An executor runs it. A scorer reports a number. The loop closes. The fitness function (the thing that decides whether the run is going well) is a constant. You set it at the start and live with it.
SYNAPSE tries to break that assumption. In real engineering work the right trade-off is not stable. A first pass cares about correctness. A hardening pass cares about safety and risk. A spike near a deadline cares about wall-clock time at the expense of maintainability. If the agent operates across all those phases, it has to be allowed to revise the criteria, explicitly, legibly, with a record.
The loop has five steps. Generate a candidate. Validate it against quality gates. Score the result against the current metric profile. Adjust the profile if the scenario warrants. Pick the next move. The first and last steps touch an LLM. The middle three are deterministic Python.
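The five steps can be compressed into a toy loop. Everything here is a hedged sketch: the MCDM7 axis names come from this post, but the gate threshold, the weight-shift rule, and the greedy selection are invented stand-ins for the LLM and the real scorer.

```python
import random

MCDM7 = ["PerfGain", "SecRisk", "DevTime", "Maintainability",
         "Cost", "Scalability", "DX"]
COSTS = {"SecRisk", "DevTime", "Cost"}  # lower is better on these axes

def generate(rng):
    # Step 1 (an LLM in the real loop): propose a candidate as scores in [0, 1].
    return {c: rng.random() for c in MCDM7}

def validate(candidate):
    # Step 2: deterministic quality gate; reject candidates with extreme risk.
    return candidate["SecRisk"] < 0.9

def score(candidate, weights):
    # Step 3: weighted sum against the current profile, inverting cost axes.
    return sum(w * ((1 - candidate[c]) if c in COSTS else candidate[c])
               for c, w in weights.items())

def adjust(weights, candidate):
    # Step 4: if risk is creeping up, shift weight toward SecRisk, renormalise.
    if candidate["SecRisk"] > 0.6:
        weights = dict(weights)
        weights["SecRisk"] += 0.05
        total = sum(weights.values())
        weights = {c: w / total for c, w in weights.items()}
    return weights

def run_loop(seed=0, steps=20):
    rng = random.Random(seed)  # deterministic: the whole run replays from a seed
    weights = {c: 1 / len(MCDM7) for c in MCDM7}
    best, best_score = None, float("-inf")
    for _ in range(steps):
        cand = generate(rng)              # step 1
        if not validate(cand):            # step 2
            continue
        s = score(cand, weights)          # step 3
        weights = adjust(weights, cand)   # step 4
        if s > best_score:                # step 5 (greedy stand-in)
            best, best_score = cand, s
    return best, best_score, weights
```

Note that `adjust` runs after scoring, so a profile change affects the next candidate rather than the one that triggered it, which matches "re-evaluates the next candidate against the new weights" below.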
The novel piece is step four. The agent reads the scenario, picks a metric profile (lean into safety because the corridor is noisy; lean into time because the deadline is hard), and re-evaluates the next candidate against the new weights. The MCDM7 vocabulary (PerfGain, SecRisk, DevTime, Maintainability, Cost, Scalability, DX) gives the profile a shape you can argue with. Weight vectors live in a config. Decision logs live in a file. Nothing hides in chat.
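As a concrete shape, a profile can be nothing more than a named weight vector over the seven axes, and a profile switch nothing more than an appended log record. The profile names, weights, and log fields below are illustrative, not the prototype's actual config:

```python
import json

# Two hypothetical profiles over the MCDM7 axes; each weight vector sums to 1.0.
PROFILES = {
    "safety_first": {"PerfGain": 0.10, "SecRisk": 0.35, "DevTime": 0.05,
                     "Maintainability": 0.15, "Cost": 0.10,
                     "Scalability": 0.10, "DX": 0.15},
    "deadline":     {"PerfGain": 0.15, "SecRisk": 0.05, "DevTime": 0.40,
                     "Maintainability": 0.05, "Cost": 0.15,
                     "Scalability": 0.10, "DX": 0.10},
}

decision_log = []  # the prototype keeps this in a file; a list stands in here

def switch_profile(current, target, why):
    # Every change of criteria leaves a legible record you can diff and replay.
    decision_log.append({"from": current, "to": target, "why": why})
    return target

active = "deadline"
active = switch_profile(active, "safety_first", "corridor is noisy under wind")
print(json.dumps(decision_log[-1], sort_keys=True))
```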
The synthetic experiment
The conceptual loop above asks for a much bigger evaluation harness than I built. What actually shipped is a single proof-of-concept run: a continuous 2D pathfinding problem under dynamic wind. Two agents try to move a simulated drone from a start point to a goal under conflicting pressures — time, energy, safety margin, payload integrity.
- StaticAgent uses a fixed weight vector across the whole run.
- SYNAPSEAgent reads the scenario, picks a metric profile (here: lean into safety because wind makes the corridor noisy), and re-evaluates each step.
The question was narrow: under one adversarial scenario, does adapting the criteria actually change the chosen path in a measurable way?
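The mechanism under test reduces to a toy decision: a short cut close to an obstacle versus a slower detour with margin. All numbers and weight vectors below are invented for illustration; only the mechanism, a re-weighted argmax, mirrors what the two agents do.

```python
# Candidate moves scored on time and energy (lower is better) and safety
# margin (higher is better); the utility negates the cost terms.
candidates = {
    "cut_through_corridor": {"time": 2.0, "energy": 3.0, "margin": 0.2},
    "detour_with_margin":   {"time": 3.5, "energy": 2.2, "margin": 1.5},
}

def utility(move, w):
    m = candidates[move]
    return (-w["time"] * m["time"]
            - w["energy"] * m["energy"]
            + w["margin"] * m["margin"])

static_w   = {"time": 0.5, "energy": 0.3, "margin": 0.2}  # fixed all run
adaptive_w = {"time": 0.2, "energy": 0.3, "margin": 0.5}  # re-weighted: wind is up

def pick(w):
    return max(candidates, key=lambda mv: utility(mv, w))

print(pick(static_w))    # the static agent takes the risky cut
print(pick(adaptive_w))  # the adaptive agent pays time for margin
```

Same candidates, different weights, different path: that flip is the entire hypothesis of the experiment.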
| Agent | Energy | Safety (lower = safer) | Time | Path found |
|---|---|---|---|---|
| StaticAgent | 170.28 | 3.97 | 59.71 s | yes |
| SYNAPSEAgent | 122.32 | 1.24 | 61.50 s | yes |
SYNAPSEAgent used ~28% less energy and scored ~3.2× better on the safety metric (1.24 vs 3.97), with a ~3% time penalty. The CSV is committed verbatim at results/experiment_results_20250708_225100.csv. Nothing has been smoothed.
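The quoted deltas follow directly from the table:

```python
static  = {"energy": 170.28, "safety": 3.97, "time": 59.71}
synapse = {"energy": 122.32, "safety": 1.24, "time": 61.50}

energy_saving = 1 - synapse["energy"] / static["energy"]  # ~0.28, i.e. ~28% less
safety_ratio  = static["safety"] / synapse["safety"]      # ~3.2x better
time_penalty  = synapse["time"] / static["time"] - 1      # ~0.03, i.e. ~3% slower
```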
What worked
A few things held up better than I expected.
The architecture pattern. Separating WHAT the agent optimises for from HOW it executes is a clean cut. Once the metric profile is a first-class object (a dict in a YAML file, not a sentence in a prompt), the agent stops arguing with itself about whether to favour speed or safety. It picks a profile, executes against it, and the next adjustment is visible in a diff.
The vocabulary. MCDM7 is opinionated enough to be useful and small enough to remember. It maps onto the trade-offs senior engineers already negotiate verbally. Making them explicit is the move.
Deterministic where possible. The orchestrator burns no LLM tokens on its own scheduling. Every decision the control plane makes is reproducible from a config and a seed. That is a debuggability argument. When something misbehaves, you read the decision log instead of guessing what the model thought.
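The replayability claim can be stated as a property: the same config and seed produce the same decision log, byte for byte. A minimal sketch, with a hypothetical log format (JSON lines) rather than the prototype's actual one:

```python
import json
import random

def schedule(config, seed):
    """Deterministic control plane: a pure function of config and seed, no LLM."""
    rng = random.Random(seed)
    log = []
    for step in range(config["steps"]):
        log.append({"step": step, "action": rng.choice(config["actions"])})
    return log

config = {"steps": 5, "actions": ["generate", "validate", "score", "adjust"]}

run_a = schedule(config, seed=42)
run_b = schedule(config, seed=42)
assert run_a == run_b  # replay the run; read the log instead of guessing

# Plain JSON lines: diffable, greppable, committable next to the results CSV.
log_text = "\n".join(json.dumps(rec) for rec in run_a)
```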
What did not work
Some pieces were genuinely premature.
No real CLI agents to orchestrate. Pre-Claude-Code era. No mature agentic CLI that takes a brief, edits files, runs tests, and returns. SYNAPSE describes a control plane for tools that did not yet exist. The LlamaAdapter in the prototype is a thin wrapper around Ollama. It can generate a candidate. It cannot operate a repository.
No grounding layer. Pre-MCP. The agent has no standardised way to call into a filesystem, a build tool, a linter, or a database. Every adapter has to be hand-rolled. The cost of "give the agent a real tool" is high enough that the prototype only validates the loop on a closed simulation.
The experiment is too small to claim anything statistical. One scenario, one seed, no factorial design, no significance test. The numbers are real. The story they tell is small. The roadmap file in the repo notes this as the gap between "research preview" and something publishable.
What this became
SYNAPSE was the sketch. The shape it argued for matured across two follow-up projects.
Kotef — durable single-agent runner
Kotef took SYNAPSE's loop and put it on real repositories. A supervisor flow (planner → researcher → coder → verifier → janitor) runs against a real codebase, with durable state in .sdd/runtime/, MCP-grounded tools, resume by thread ID. Single-agent. The metric profile became a quality-gate config. The adaptive layer became a backlog-driven planner that re-derives priorities each tick. Kotef is the reason I trusted that the loop survived contact with file systems.
Bernstein — the deterministic control plane at scale
What Kotef was for one agent, Bernstein is for many. A deterministic Python scheduler decomposes a goal, dispatches short-lived agents into isolated git worktrees, verifies output through a janitor, commits what survives. The LLM writes code. The orchestrator decides what runs and what merges. The decision log is a directory of plain files. The vocabulary changed (tasks, adapters, budget caps, MCP integration). The shape (generate, validate, score, adjust criteria, choose next move) is the same five-step loop SYNAPSE drew on the whiteboard. Kotef's lessons about durable state and backlog-driven planning landed there too.
The repo stays public because the lineage is more honest than the polish. SYNAPSE is a small piece of evidence that the loop works on the easy case. The production systems that came later are the real argument. This one is the index card pinned above the desk.
The orchestrator is the product. It is allowed to change its mind about what counts as success. The condition is that every such change stays legible and replayable, and that a human can override it.
Repositories
- SYNAPSE on GitHub — the 2025 prototype
- Kotef on GitHub — durable single-agent runner that came next
- Bernstein on GitHub — multi-agent control plane shipped from the same DNA
Related reading
- I ran 12 AI agents for 47 hours
- Building agentic AI systems that hold up
- Spec-driven development: the workflow I actually use
- Getting AI-assisted development to green
Further reading
- Hwang & Yoon, Multiple Attribute Decision Making (1981) — the TOPSIS lineage that runs through every adaptive-metric routine here.
- Sutton & Barto, Reinforcement Learning: An Introduction (2nd ed., 2018) — the policy-iteration framing for the outer loop.
- Brooks, The Mythical Man-Month (1975) — conceptual integrity as the engineer's first job.
— Alex Chernysh, alexchernysh.com