Alex Chernysh
AI systems / retrieval / evals / architecture

Note

Spec-Driven Development: the workflow I actually use

How I use a lightweight spec-driven workflow in real projects, what SDDRush automates, and where Kotef fits if you want a stronger agent layer.

February 6, 2026 · 5 min read · By Alex Chernysh
Agents · Workflow

I use a lightweight spec-driven workflow in real projects. The point is simple: move intent, research, architecture, and tickets out of chat memory and into files. Over time I packaged the repeatable parts into SDDRush.

The loop in one pass

  • write the brief in plain language
  • gather only the research that changes decisions
  • make architecture explicit before implementation starts
  • sync the backlog into durable tickets
  • implement against the ticket, then clean up the state

Figure: Compact SDD view. The shape is intentionally small: pin down the work, turn it into tickets, then implement and clean up.

1. What problem this solves

The failure mode is old and still common: the plan mostly lives in chat.

That looks fast for a day or two. Then the task spans another session, another engineer, or another agent, and the shape of the work starts to drift.

What usually breaks first:

  • the original intent gets watered down
  • architecture gets re-decided in fragments
  • tickets stop matching the real plan
  • handoff depends on memory and optimism

That is the part I want to remove.

2. What SDD means in practice

In practice the workflow is compact:

  1. write a project brief
  2. gather focused research
  3. make architecture decisions explicit
  4. sync the backlog into concrete tickets
  5. implement against the open ticket
  6. clean up status and close what is actually done
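
As a rough illustration of steps 4 to 6, here is a minimal sketch of a ticket that can only be closed once it points at evidence in the repo. The `Ticket` class, its fields, and the closing rule are all hypothetical; this is not SDDRush's actual data model, just one way to make "close what is actually done" enforceable.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical ticket record; not SDDRush's real format."""

    ticket_id: str
    title: str
    status: str = "open"
    # Pointers to repo evidence, e.g. commit hashes or test names (assumed convention).
    evidence: list[str] = field(default_factory=list)

    def close(self) -> None:
        # Refuse to close a ticket that has nothing in the repo to back it up.
        if not self.evidence:
            raise ValueError(f"{self.ticket_id}: no evidence recorded, leaving open")
        self.status = "closed"


def open_tickets(backlog: list[Ticket]) -> list[Ticket]:
    """Return the tickets that still need work, in backlog order."""
    return [t for t in backlog if t.status == "open"]


backlog = [
    Ticket("T-001", "Write project brief"),
    Ticket("T-002", "Decide storage layer"),
]
backlog[0].evidence.append("commit abc123")
backlog[0].close()
print([t.ticket_id for t in open_tickets(backlog)])  # ['T-002']
```

The design choice worth noting is that closing is a checked operation rather than a status edit, which is what keeps backlog state tied to the repo instead of to memory.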

None of this is exotic. That is the point. I am not trying to invent a new religion for software delivery. I want a repo trail that survives more than one conversation.

Practical threshold

If a task has enough ambiguity that you would need to explain your reasoning twice, it usually deserves a short spec trail once.

3. What SDDRush actually gives you

SDDRush is the small toolkit I built around this workflow.

It handles the parts I got tired of recreating by hand:

  • sdd-init creates the .sdd workspace and the basic file structure
  • sdd-prompts renders repo-aware prompts from the current project state
  • sdd-backlog syncs and maintains backlog tickets
  • sdd-status gives you a quick read on what is still open and what has already moved

That is enough for a lot of work already. If the repo matters, the task spans multiple sessions, and you do not want the project brief to dissolve into folklore, the toolkit is useful.

Quick start

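# Create the .sdd workspace and the basic file structure
bash bin/sdd-init /path/to/project --stack "Python/FastAPI" --domain "legal"
# Render repo-aware prompts from the current project state
python bin/sdd-prompts /path/to/project
# Sync backlog tickets
python bin/sdd-backlog sync /path/to/project
# Show what is still open and what has already moved
python bin/sdd-status /path/to/project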

Then fill the core .sdd files, implement against backlog/open, and run janitor/status once the repo evidence catches up.
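
To make the "quick read" idea concrete, here is a minimal sketch that counts ticket files per state directory. It is not how `sdd-status` is implemented. The text only mentions `backlog/open`; placing it under `.sdd/`, the `closed` directory name, and the `.md` extension are assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path


def backlog_summary(project_root: str) -> dict[str, int]:
    """Count ticket files per state directory (the layout is an assumption)."""
    backlog = Path(project_root) / ".sdd" / "backlog"
    summary: dict[str, int] = {}
    for state in ("open", "closed"):  # "closed" is a hypothetical directory name
        state_dir = backlog / state
        # A missing directory simply counts as zero tickets.
        summary[state] = len(list(state_dir.glob("*.md"))) if state_dir.is_dir() else 0
    return summary


if __name__ == "__main__":
    print(backlog_summary("."))
```

Because the state lives in files, the same count is available to a person, a script, or an agent without any of them having to trust a chat transcript.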

4. Where Kotef fits

If you want to push the same workflow further, Kotef is the agent layer I built around the same ideas.

I would frame it as the more ambitious companion, not the thing you need first.

Kotef can:

  • bootstrap SDD state for a repo
  • reason over tickets instead of freestyling from a naked prompt
  • run a planner -> researcher -> coder -> verifier -> janitor loop
  • keep runtime state, checkpoints, and resume paths for longer tasks

That makes it interesting when you want a durable coding and research agent for real repositories. It does not change the underlying point: the workflow needs structure before the agent deserves autonomy.
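
The staged loop with checkpoints and resume can be sketched roughly as follows. This is not Kotef's implementation: the stage names come from the list above, but `run_stage` is a stub, and the JSON checkpoint format and the `run_loop` function are hypothetical.

```python
import json
from pathlib import Path

STAGES = ["planner", "researcher", "coder", "verifier", "janitor"]


def run_stage(stage: str, state: dict) -> dict:
    """Stub: a real stage would call a model or tool; here it only records the visit."""
    state.setdefault("log", []).append(stage)
    return state


def run_loop(ticket_id: str, checkpoint: Path) -> dict:
    """Run the stages in order, resuming from the checkpoint file if it exists."""
    if checkpoint.exists():
        # Resume: the saved state already knows which stage comes next.
        state = json.loads(checkpoint.read_text())
    else:
        state = {"ticket": ticket_id, "next": 0}
    while state["next"] < len(STAGES):
        state = run_stage(STAGES[state["next"]], state)
        state["next"] += 1
        # Persist after every stage so an interrupted run can pick up where it stopped.
        checkpoint.write_text(json.dumps(state))
    return state


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        final = run_loop("T-002", Path(tmp) / "checkpoint.json")
        print(final["log"])
```

The loop starts from a ticket identifier rather than a free-form prompt, which mirrors the point that the structure comes first and the autonomy second.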

5. When this approach pays off

I reach for this setup when:

  • the task has more than one non-trivial decision
  • the repo context matters
  • the work will span more than one session
  • another engineer may need to audit the path later

I do not reach for it when the task is tiny and the overhead would outweigh the value.

This is also the kind of workflow I trust for benchmark-style work where grounding, latency, and telemetry all matter at once. A current example is the Agentic RAG Legal Challenge, where the system is judged as a full pipeline rather than as a clever prompt.

What translated well there was not drama, but discipline:

  • small diffs with obvious lineage
  • explicit gates on grounding, latency, and provenance
  • fast rejection of branches that looked clever but made the evidence trail worse

What did not help was broadening context indiscriminately, changing too many variables at once, or trusting a cleaner-looking answer before the path behind it was stable.
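
One way to make "explicit gates" literal is to check every candidate branch against fixed thresholds before looking at how good its answers read. The sketch below is illustrative only: the metric names, units, and threshold values are hypothetical and are not taken from the challenge or from my actual setup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    """Hypothetical per-branch measurements; names and units are illustrative."""

    grounded_fraction: float    # share of answers backed by a retrieved source
    p95_latency_s: float        # 95th percentile end-to-end latency, in seconds
    provenance_fraction: float  # share of answers with a traceable source reference


# Illustrative thresholds only; real values depend on the benchmark.
GATES = {
    "grounding": lambda m: m.grounded_fraction >= 0.95,
    "latency": lambda m: m.p95_latency_s <= 5.0,
    "provenance": lambda m: m.provenance_fraction >= 0.99,
}


def failed_gates(metrics: RunMetrics) -> list[str]:
    """Return the names of gates the branch fails; an empty list means it may proceed."""
    return [name for name, check in GATES.items() if not check(metrics)]


candidate = RunMetrics(grounded_fraction=0.97, p95_latency_s=6.2, provenance_fraction=0.99)
print(failed_gates(candidate))  # ['latency']
```

A branch that returns a non-empty list gets rejected quickly, regardless of how clever its output looks.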

6. What not to do

The common mistakes are predictable:

  • turning every small bugfix into a ceremony
  • starting with the agent while the task is still vague
  • treating prompt output as durable project memory
  • writing specs that sound complete but do not constrain implementation

The method should reduce guesswork. If it mainly produces prettier paperwork, something has gone wrong.

7. Closing note

This workflow is useful because it makes intent easier to inspect, implementation easier to hand off, and backlog state harder to fake. SDDRush exists because I use this pattern enough to want the boring parts scaffolded. Kotef exists because sometimes I want to automate more of the same loop without throwing the structure away.

Resources

Repositories

  • SDDRush on GitHub
  • Kotef on GitHub

Related reading

Part of the public notes on grounded AI systems, retrieval, evals, and delivery under real constraints.

  • Getting AI-Assisted Development to Green Without Breaking the Code
  • Building Agentic AI Systems That Hold Up
  • Prompt Engineering: From Phrasing to Policy