Posts tagged with prompt-engineering

The Prompt Engineer · Jul 9 · 5 min read

Make It Answer Before It Answers

Turn one, the customer-support agent nails it — polite, on-policy, cites the right documentation.

arq · instruction-following · structured-reasoning

The Prompt Engineer · Jul 8 · 5 min read

30,000 Tokens Before Hello

Claude Fable 5 burns 30,000 tokens of system instructions before you type a single character.

system-prompt · prompt-architecture · production-llm

The Prompt Engineer · Jul 7 · 4 min read

Field Names Are Instructions

Somebody ran GPT-4o-mini on GSM8K — grade-school math, the kind LLMs are supposed to be good at — and got 31.8% accuracy.

structured-output · constrained-decoding · json-schema

The Prompt Engineer · Jun 7 · 4 min read

Emotional Prompting Made Your Model a Yes-Man

Three years ago, a paper called EmotionPrompt showed that appending emotional stakes to prompts — phrases like "This is very important to my career!

emotional-prompting · sycophancy · llm-behavior

The Prompt Engineer · Jun 3 · 5 min read

Where You Put It Beats What You Say

Three teams ran the same experiments this year and landed on the same uncomfortable result: moving information around inside a prompt — without changing a...

prompt-engineering · position-bias · in-context-learning

The Prompt Engineer · Jun 1 · 4 min read

Effort Ate My Prompt

Three days ago, Anthropic shipped Claude Opus 4.8.

effort-levels · claude-opus-4-8 · prompt-engineering

The Prompt Engineer · May 26 · 4 min read

The Prompt Got Demoted

Last week I spent three hours debugging a RAG agent that kept hallucinating company policy details.

context-engineering · prompt-engineering · anthropic

The Prompt Engineer · May 22 · 5 min read

Your Safety Layer Is Your Biggest Usability Bug

You shipped the guardrails. You added the system prompt hardening, the input classifiers, the output filters.

over-refusal · llm-security · production-llm

The Prompt Engineer · May 20 · 5 min read

Your Prompt Is a Load Balancer Now

Ask GPT-5 "What causes rust on steel?" and you'll get an answer in under a second.

prompt-routing · gpt-5 · model-selection

The Prompt Engineer · May 18 · 5 min read

Chain of Thought Taught Your Model to Lie Better

Last month I added chain-of-thought prompting to a medical Q&A pipeline. Hallucination rate dropped.

chain-of-thought · hallucination-detection · production-llm

The Prompt Engineer · May 16 · 4 min read

Your JSON Schema Is Making Your Model Dumber

Everyone loves structured outputs. You slap a JSON schema on your API call, get perfectly typed responses, skip the regex parsing nightmares.

structured-output · constrained-decoding · reasoning

The Prompt Engineer · May 14 · 5 min read

The Expert Persona Tax

Researchers at PromptHub ran twelve different personas on 2,000 MMLU questions with GPT-4-Turbo.

persona-prompting · system-prompt · prompt-engineering

The Prompt Engineer · May 12 · 4 min read

The Model Outgrew Your Prompt

You write a prompt. You test it.

prompt-engineering · over-specification · model-capability

The Prompt Engineer · May 9 · 4 min read

SWE-bench Isn't Testing Your Model

Claude Opus 4.5 scores 45.

swe-bench · scaffolding · agent-architecture

The Prompt Engineer · May 6 · 4 min read

The Labs Can't Agree on How to Prompt Their Own Models

A GitHub repository with 134K stars has been quietly cataloguing the system prompts of every major AI model — GPT-5.4, Claude Opus 4.

system-prompt · prompt-engineering · leaked-prompts

The Prompt Engineer · May 2 · 4 min read

The Model Isn't Reading Your Examples

Three years ago, few-shot prompting was the single highest-leverage trick in the prompt engineer's toolkit.

few-shot-prompting · zero-shot-cot · reasoning-models

The Prompt Engineer · Apr 28 · 5 min read

The Prompt Hidden in Your JSON Schema

Most teams I talk to treat their JSON schema like plumbing — define the shape, get valid output, move on.

structured-output · json-schema · prompt-engineering

The Prompt Engineer · Apr 27 · 4 min read

No Trace, No Trust

Meta just published a paper that should change how you think about giving LLMs hard tasks.

semi-formal-reasoning · structured-prompting · code-review

The Prompt Engineer · Apr 26 · 5 min read

Your LLM Judge Is Grading on Vibes

You run your eval suite. Agreement rate: 92%.

llm-as-judge · eval-pipeline · prompt-bias

The Prompt Engineer · Apr 24 · 5 min read

Your Prompt Worked Once. That Proves Nothing.

You run your new prompt three times. The outputs look good.

eval-driven-development · prompt-testing · llm-evals
