Prompt engineering used to be treated like copywriting with caffeine. It is closer to policy design: define the task, define the boundaries, show the pattern, and verify the result against reality. The magic phrase, regrettably, never arrived.
1. The prompt is not the product anymore
In modern systems, the prompt is one layer among several:
- system instruction
- user task framing
- retrieved context
- examples
- tool configuration
- output schema
- evals and downstream checks
That is why prompt engineering now feels less like wordsmithing and more like operating a multi-layer interface contract.
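One way to see the contract is to make the layers explicit in code. This is a sketch, not any real framework; every name here is invented for illustration:

```python
from dataclasses import dataclass, field

# Illustrative only: the layers listed above, made explicit as one typed
# object so each layer can be versioned and diffed on its own.

@dataclass
class PromptStack:
    system_instruction: str
    task_framing: str
    retrieved_context: list[str] = field(default_factory=list)
    examples: list[dict] = field(default_factory=list)
    tool_config: dict = field(default_factory=dict)
    output_schema: str = ""

    def layers(self) -> list[str]:
        """Which layers are actually populated for this request."""
        return [name for name, value in vars(self).items() if value]
```

When a regression appears, the first question becomes "which layer changed?" rather than "which sentence sounds wrong?"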
Prompt as phrasing
The team tweaks wording until the answer sounds better.
- style dominates the discussion
- examples are treated as optional
- output shape is vague
- regressions are discovered late
Prompt as policy
The prompt becomes one governed layer in a larger system.
- instructions are narrow and explicit
- examples show the target behavior
- schemas keep outputs machine-checkable
- evals decide whether the change helped
2. Clear instructions still win
The most durable advice in the field is still the least glamorous: make the instructions clear and specific.
The model should know:
- what role it plays
- what the task is
- what constraints matter
- what the output should look like
- what to do when the task is underspecified
Clarity still beats cleverness.
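The difference is easiest to see side by side. Both prompts below are invented for illustration; the point is the specificity, not the domain:

```python
# Invented example: the same task as a vague and a specific instruction.
VAGUE = "Summarize this support ticket."

SPECIFIC = """You are a support triage assistant.
Summarize the ticket below in at most three bullet points.
Always name: the affected component, the severity (low/medium/high),
and the requested action.
If severity is not stated in the ticket, write "severity: unknown"
instead of guessing."""
```

The specific version answers all five questions above, including the underspecified case, before the model ever sees a ticket.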
3. Few-shot examples are usually better than one more paragraph of prose
Strong examples compress the desired behavior into something visible.
They teach the model:
- the target format
- the expected level of brevity
- refusal behavior
- citation posture
- style boundaries
If the examples are good enough, some of the instruction prose can shrink.
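A minimal sketch of what that looks like in practice, with invented examples for a citation-grounded Q&A task. Note that one example deliberately demonstrates refusal, not just the happy path:

```python
# Hypothetical few-shot block. Each example teaches format, brevity,
# citation posture, and refusal behavior at once.
FEW_SHOT = [
    {"input": "What is the SLA for tier-1 incidents?",
     "output": "15 minutes to first response. [source: ops-handbook#sla]"},
    {"input": "Who approved the Q3 budget?",
     "output": "Not in the provided context; I can't answer that."},
]

def render_examples(examples: list[dict]) -> str:
    """Render examples into the same Q/A shape the model should reproduce."""
    return "\n\n".join(
        f"Q: {e['input']}\nA: {e['output']}" for e in examples
    )
```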
4. Structure beats vibes
Anthropic's prompt-engineering docs still point toward the same practical set of techniques: clarity, examples, explicit structure, role prompting, giving the model room to think, and prompt chaining where it actually helps.
I like prompts that make each layer legible:
<role>You are a grounded assistant for production AI operations.</role>
<constraints>
- Use only retrieved context for factual claims.
- If support is missing, say so directly.
</constraints>
<context>[retrieved evidence]</context>
<task>[user question]</task>
<output_format>[exact shape]</output_format>
This does not make the model perfect. It does make failure easier to diagnose.
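The tagged template above can also be filled programmatically, which keeps the structure stable while the content varies per request. The tag names follow the example above; the function is a sketch, not any SDK's API:

```python
# Fills the tagged template from the example above. Illustrative only.
TEMPLATE = """<role>{role}</role>
<constraints>
{constraints}
</constraints>
<context>{context}</context>
<task>{task}</task>
<output_format>{output_format}</output_format>"""

def render(role: str, constraints: list[str], context: str,
           task: str, output_format: str) -> str:
    """Assemble one legible, layered prompt from its parts."""
    return TEMPLATE.format(
        role=role,
        constraints="\n".join(f"- {c}" for c in constraints),
        context=context,
        task=task,
        output_format=output_format,
    )
```

Because the shape never changes, a diff of two prompt versions shows exactly which layer moved.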
5. Tools and response formats matter as much as wording
Prompt quality is often decided outside the sentence layer.
In production systems, the real difference often comes from:
- which tools are available
- how narrowly their contracts are defined
- whether the output schema is strict enough to validate
- whether the model is allowed to abstain
This is one reason prompt conversations now overlap with product requirements and interface design. A prompt that looks elegant but sits on top of loose tools and vague outputs will still behave loosely.
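A sketch of what "strict enough to validate" and "allowed to abstain" mean together. The field names and schema are invented; the point is that an explicit abstention is a valid outcome, while loose output is not:

```python
import json

# Illustrative validator: reject anything that is not valid JSON with
# the required fields, but accept a well-formed abstention.
REQUIRED_FIELDS = {"answer", "sources", "confidence"}

def check_response(raw: str) -> dict:
    """Validate model output against a strict, machine-checkable shape."""
    data = json.loads(raw)  # raises on malformed output; caught and retried upstream
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["answer"] is None and not data.get("abstain_reason"):
        raise ValueError("null answer requires an abstain_reason")
    return data
```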
6. Query transformation belongs in retrieval design, not in prompt mysticism
There is a temptation to treat every retrieval or routing improvement as a prompt breakthrough.
That is usually a category mistake.
Rewrite, decomposition, step-back prompting, and similar techniques can materially improve retrieval. But they should be evaluated as search-control moves with latency costs, not as proof that someone discovered a more magical wording style.
If a query transformation helps, keep it. Just keep the reason honest.
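Keeping the reason honest means measuring the transformation as a retrieval step with a cost. A sketch, where `rewrite` and `search` stand in for an actual LLM call and retriever:

```python
import time

# Sketch: a query transformation treated as a measurable search-control
# move. `rewrite` and `search` are stand-ins, not real APIs.

def retrieve_with_rewrite(query, rewrite, search):
    """Run the original query plus its rewrites, and record the cost."""
    t0 = time.perf_counter()
    variants = [query] + rewrite(query)  # e.g. decomposition, step-back
    hits = []
    for v in variants:
        hits.extend(search(v))
    latency = time.perf_counter() - t0
    # Log both sides of the trade: did recall improve, and what did it cost?
    return {"hits": hits, "latency_s": latency, "num_queries": len(variants)}
```

If recall goes up and the latency budget holds, keep it; either way, the decision is about search behavior, not wording magic.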
7. Prompt changes should be evaluated like code changes
OpenAI's current eval guidance makes the operational consequence obvious: if the output matters, prompt changes belong in the same measurement culture as code changes.
Every material prompt change should answer:
- what behavior is intended to improve?
- which eval set should move?
- what new failure mode might appear?
- what does worse now look like?
If the change cannot be measured, the team is usually arguing about taste.
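In its simplest form, that measurement culture is a fixed eval set and a score that must move. A minimal sketch, where `run_model` stands in for the actual model call and the cases are invented:

```python
# Minimal sketch of prompt-change evaluation: every material change must
# move a metric on a fixed eval set. All cases here are invented.

EVAL_SET = [
    {"input": "refund policy?", "must_contain": "30 days"},
    {"input": "who is the CEO?", "must_contain": "not in context"},
]

def score(run_model, cases: list[dict]) -> float:
    """Fraction of eval cases whose output contains the required phrase."""
    passed = sum(
        1 for c in cases if c["must_contain"] in run_model(c["input"]).lower()
    )
    return passed / len(cases)

# Compare old vs new prompt on the same cases before shipping:
# baseline  = score(lambda q: call(OLD_PROMPT, q), EVAL_SET)
# candidate = score(lambda q: call(NEW_PROMPT, q), EVAL_SET)
```

A crude substring check is deliberately simple; the discipline of running it on every change matters more than the sophistication of the grader.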
8. A good prompt does four jobs
The prompt is good when it does all four of these at once:
- narrows the task
- narrows the output shape
- narrows the model's freedom under uncertainty
- leaves enough flexibility for the useful part of the work
Miss one of those and the system drifts.