Alex Chernysh · Agentic behaviorist · Tel Aviv


SYNAPSE: adaptive-control software engineering, prototyped

SYNAPSE was a 2025 framework for AI agents that adapt their own success criteria via MCDM. The deterministic-control-plane idea later shipped as Bernstein.

July 15, 2025·8 min read
Delivery · Research

Update — April 2026

This was the prototype. The deterministic-control-plane idea later shipped as Bernstein. An intermediate prototype, Kotef, sat between the two as a durable single-agent runner; it eventually folded into Bernstein's multi-agent design.

Most agent loops in 2025 picked a fitness function once and held it constant. SYNAPSE was an attempt to put the decision about what "good" means for this run inside the loop.

TL;DR

SYNAPSE proposes an autonomous control loop that adapts its own success criteria using multi-criteria decision-making (MCDM). I validated the adaptive layer on a synthetic 2D pathfinding scenario. Small, single seed, directional rather than statistical. The conceptual win is separating WHAT the agent optimises for (a metric profile) from HOW it executes (the inner loop). The control plane is plain Python. Only the candidate generator is an LLM. The framework is a sketch. The experiment is a sanity check. Together they are enough to argue the shape.

What SYNAPSE proposed

  • adaptive metric selection: re-derive the weight vector each iteration, not once at the start
  • deterministic control plane: the orchestrator is Python, never an LLM scheduling itself
  • MCDM7 dimensions: PerfGain, SecRisk, DevTime, Maintainability, Cost, Scalability, DX
  • human-in-loop checkpoints at the boundaries where the metric profile changes
  • synthetic-data-native validation: small, fast, repeatable scenarios over hand-curated benchmarks
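A metric profile in this scheme is just a normalised weight vector over the MCDM7 dimensions. A minimal sketch in plain Python (the `normalise` helper and the profile values are illustrative, not the repo's actual API):

```python
# The MCDM7 vocabulary as a weight vector over seven dimensions.
MCDM7 = ("PerfGain", "SecRisk", "DevTime", "Maintainability",
         "Cost", "Scalability", "DX")

def normalise(profile: dict) -> dict:
    """Scale raw weights so they sum to 1.0, keeping aggregate scores
    comparable when the profile changes mid-run."""
    total = sum(profile.values())
    return {k: v / total for k, v in profile.items()}

# A "hardening pass" profile: safety and maintainability dominate.
hardening = normalise({
    "PerfGain": 1, "SecRisk": 4, "DevTime": 1, "Maintainability": 3,
    "Cost": 1, "Scalability": 1, "DX": 1,
})
```

Because the profile is a dict in a config file rather than a sentence in a prompt, a change of criteria shows up as a diff.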

The shape of the loop

Most agent frameworks in mid-2025 read the same way. A planner emits a step. An executor runs it. A scorer reports a number. The loop closes. The fitness function (the thing that decides whether the run is going well) is a constant. You set it at the start and live with it.

SYNAPSE tries to break that assumption. In real engineering work the right trade-off is not stable. A first pass cares about correctness. A hardening pass cares about safety and risk. A spike near a deadline cares about wall-clock time at the expense of maintainability. If the agent operates across all those phases, it has to be allowed to revise the criteria, explicitly, legibly, with a record.

The loop has five steps. Generate a candidate. Validate it against quality gates. Score the result against the current metric profile. Adjust the profile if the scenario warrants. Pick the next move. The first and last touch an LLM. The middle three are deterministic Python.

[Figure: The SYNAPSE control loop: generation, validation, scoring, metric adjustment, next-move selection. Only generation touches the LLM; the rest is code.]

The novel piece is step four. The agent reads the scenario, picks a metric profile (lean into safety because the corridor is noisy; lean into time because the deadline is hard), and re-evaluates the next candidate against the new weights. The MCDM7 vocabulary (PerfGain, SecRisk, DevTime, Maintainability, Cost, Scalability, DX) gives the profile a shape you can argue with. Weight vectors live in a config. Decision logs live in a file. Nothing hides in chat.
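The five steps can be written down as an ordinary Python loop. A sketch under loose assumptions (the callables are stand-ins; in the prototype only the generator would touch an LLM, and the control plane is reproducible from the profile and the seed):

```python
import random

def run_loop(generate, validate, score, adjust, profile, seed=0, steps=5):
    """SYNAPSE-shaped loop: generate, validate, score, adjust, choose next.
    Only `generate` is LLM-backed; the middle steps are deterministic Python."""
    rng = random.Random(seed)
    log = []                                   # decision log; nothing hides in chat
    candidate = None
    for i in range(steps):
        candidate = generate(rng, candidate)   # 1. candidate (the one LLM call)
        if not validate(candidate):            # 2. deterministic quality gates
            log.append({"step": i, "status": "rejected"})
            continue
        s = score(candidate, profile)          # 3. score vs the current profile
        profile = adjust(profile, s)           # 4. re-derive the weight vector
        log.append({"step": i, "status": "accepted",
                    "score": s, "profile": dict(profile)})
    return log                                 # 5. the next move reads the log
```

Because every non-generation step is pure code, running the loop twice with the same seed and stubs yields identical logs.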

The synthetic experiment

The conceptual loop above asks for a much bigger evaluation harness than I built. What actually shipped is a single proof-of-concept run: a continuous 2D pathfinding problem under dynamic wind. Two agents try to move a simulated drone from a start point to a goal under conflicting pressures — time, energy, safety margin, payload integrity.

  • StaticAgent uses a fixed weight vector across the whole run.
  • SYNAPSEAgent reads the scenario, picks a metric profile (here: lean into safety because wind makes the corridor noisy), and re-evaluates each step.

The question was narrow: under one adversarial scenario, does adapting the criteria actually change the chosen path in a measurable way?
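The mechanism under test reduces to a weighted sum over shared criteria. A toy sketch with made-up numbers (not the experiment's data) showing how re-weighting can flip which path wins:

```python
def aggregate(candidate: dict, weights: dict) -> float:
    """Weighted cost; every criterion here is framed so lower is better."""
    return sum(weights[k] * candidate[k] for k in weights)

# Two hypothetical paths through the wind field (illustrative costs).
fast_path = {"time": 1.0, "energy": 3.0, "risk": 4.0}
safe_path = {"time": 2.0, "energy": 2.0, "risk": 1.0}

static_w  = {"time": 0.8, "energy": 0.1, "risk": 0.1}  # fixed at run start
adapted_w = {"time": 0.2, "energy": 0.3, "risk": 0.5}  # leaned into safety

def pick(weights):
    """Choose the lowest-cost candidate under the given profile."""
    return min((fast_path, safe_path), key=lambda c: aggregate(c, weights))
```

Under `static_w` the fast path wins (cost 1.5 vs 1.9); under `adapted_w` the safe path wins (1.5 vs 3.1). Same candidates, different criteria.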

Agent          Energy    Safety (lower = safer)    Time       Path found
StaticAgent    170.28    3.97                      59.71 s    yes
SYNAPSEAgent   122.32    1.24                      61.50 s    yes

SYNAPSEAgent used ~28% less energy and scored ~3.2× better on the safety metric (1.24 vs 3.97), with a ~3% time penalty. The CSV is committed verbatim at results/experiment_results_20250708_225100.csv. Nothing has been smoothed.

What this evidence is, and is not

N=1 is not a result. It is a directional signal. One scenario, one seed, one weight schedule. The adaptive layer behaved exactly as designed on the easy case. Whether it generalises to richer environments is the open question. The honest answer is "I do not know yet, the harness is too small to claim that." The .dev/.plan.md file in the repo lists the factorial design with Mann–Whitney U and Cliff's δ that this experiment is missing.
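For reference, the effect size that plan calls for is cheap to compute; Cliff's δ needs nothing beyond pure Python. A minimal implementation (mine, not the repo's):

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all sample pairs.
    Ranges over [-1, 1]; 0 means the two samples stochastically overlap."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))
```

The missing piece is not the statistic; it is the factorial grid of scenarios and seeds to feed it.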

What worked

A few things held up better than I expected.

The architecture pattern. Separating WHAT the agent optimises for from HOW it executes is a clean cut. Once the metric profile is a first-class object (a dict in a YAML file, not a sentence in a prompt), the agent stops arguing with itself about whether to favour speed or safety. It picks a profile, executes against it, and the next adjustment is visible in a diff.

The vocabulary. MCDM7 is opinionated enough to be useful and small enough to remember. It maps onto the trade-offs senior engineers already negotiate verbally. Making them explicit is the move.

Deterministic where possible. The orchestrator burns no LLM tokens on its own scheduling. Every decision the control plane makes is reproducible from a config and a seed. That is a debuggability argument. When something behaves wrong, you read the decision log instead of guessing what the model thought.

What did not work

Some pieces were genuinely premature.

No real CLI agents to orchestrate. Pre-Claude-Code era. No mature agentic CLI that takes a brief, edits files, runs tests, and returns. SYNAPSE describes a control plane for tools that did not yet exist. The LlamaAdapter in the prototype is a thin wrapper around Ollama. It can generate a candidate. It cannot operate a repository.

No grounding layer. Pre-MCP. The agent has no standardised way to call into a filesystem, a build tool, a linter, or a database. Every adapter has to be hand-rolled. The cost of "give the agent a real tool" is high enough that the prototype only validates the loop on a closed simulation.

The experiment is too small to claim anything statistical. One scenario, one seed, no factorial design, no significance test. The numbers are real. The story they tell is small. The roadmap file in the repo notes this as the gap that closes "research preview" into something publishable.

What this became

SYNAPSE was the sketch. The shape it argued for matured across two follow-up projects.

Kotef — durable single-agent runner

Kotef took SYNAPSE's loop and put it on real repositories. A supervisor flow (planner → researcher → coder → verifier → janitor) runs against a real codebase, with durable state in .sdd/runtime/, MCP-grounded tools, resume by thread ID. Single-agent. The metric profile became a quality-gate config. The adaptive layer became a backlog-driven planner that re-derives priorities each tick. Kotef is the reason I trusted that the loop survived contact with file systems.

Bernstein — the deterministic control plane at scale

What Kotef was for one agent, Bernstein is for many. A deterministic Python scheduler decomposes a goal, dispatches short-lived agents into isolated git worktrees, verifies output through a janitor, commits what survives. The LLM writes code. The orchestrator decides what runs and what merges. The decision log is a directory of plain files. The vocabulary changed (tasks, adapters, budget caps, MCP integration). The shape (generate, validate, score, adjust criteria, choose next move) is the same five-step loop SYNAPSE drew on the whiteboard. Kotef's lessons about durable state and backlog-driven planning landed there too.

The repo stays public because the lineage is more honest than the polish. SYNAPSE is a small piece of evidence that the loop works on the easy case. The production systems that came later are the real argument. This one is the index card pinned above the desk.

The orchestrator is the product. It is allowed to change its mind about what counts as success. The condition is that every such change stays legible and replayable, and that a human can override it.

Repositories

  • SYNAPSE on GitHub — the 2025 prototype
  • Kotef on GitHub — durable single-agent runner that came next
  • Bernstein on GitHub — multi-agent control plane shipped from the same DNA

Related reading

  • I ran 12 AI agents for 47 hours
  • Building agentic AI systems that hold up
  • Spec-driven development: the workflow I actually use
  • Getting AI-assisted development to green

Further reading

  • Hwang & Yoon, Multiple Attribute Decision Making (1981) — the TOPSIS lineage that runs through every adaptive-metric routine here.
  • Sutton & Barto, Reinforcement Learning: An Introduction (2nd ed., 2018) — the policy-iteration framing for the outer loop.
  • Brooks, The Mythical Man-Month (1975) — conceptual integrity as the engineer's first job.

— Alex Chernysh, alexchernysh.com
