Prompt Context Types: Key Experimental Findings
– Written by Valdemaras Girštautas, Jr, JavaScript Software Engineer
When working with large language models (LLMs), have you ever wondered whether to force the model to output everything in a fixed format such as JSON? The answer is not as simple as it seems, and I’ll explain why.
When I started working with LLMs, I thought, “Why not just force everything into a JSON format? It’s simple, structured, and easy to integrate with other tools”. So, I did just that. But after some time and experimentation, I found that this isn’t always the best approach. In this article, I’ll share what I learned.
Not All Tasks Are Equal
Not all tasks are equal, and I learned this the hard way. For example, there is a big difference between multi-step math tasks, classification tasks (when you need to select one of a few categories), coding tasks, and returning structured data.
For convenience, I call these task categories: reasoning, classification, coding, and structured output tasks. Once you distinguish between these categories, many seemingly strange behaviours become understandable.
Reasoning Tasks
Let’s consider tasks that require reasoning. Suppose you ask the model a simple question: “John worked overtime and received additional compensation. What was John’s salary?” Clearly, answering this question requires some reasoning. If you allow the model to respond freely (without enforcing a specific output format), it will typically provide a step-by-step explanation. However, if you require the model to return the answer in a strict JSON format, it must handle two tasks simultaneously: first, reasoning its way to the correct answer, and second, structuring that answer according to the specified schema. In my experiments, this tended to lower performance on reasoning tasks. I observed that the model may:
- Omit some of the reasoning steps
- Provide shorter reasoning
- Return slightly less accurate answers
Note that this is not because the model cannot reason correctly and provide an accurate answer, but because it must do two things at once: reason and structure its output.

Classification Tasks
However, if you have a classification task (you need to select one of a few categories) and require the model to output the answer in a strict JSON format, you will likely observe the opposite behaviour. Instead of writing a whole paragraph with an answer, the model will return a simple {"label": "relevant"} object. Structured output is beneficial for classification tasks. It reduces unnecessary information (noise), improves answer consistency, and makes it easier to integrate the output with other tools.
So structured output is not bad; it’s just not always beneficial.
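To make the integration point concrete, here is a minimal sketch of how a downstream tool might consume such a classification reply. The label set and the `parse_label` helper are assumptions made for illustration; the only label mentioned in this article is "relevant".

```python
import json

# Hypothetical label set, chosen for illustration only.
ALLOWED_LABELS = {"relevant", "irrelevant"}


def parse_label(raw: str) -> str:
    """Parse a JSON classification reply and return its label.

    Raises ValueError if the reply is not a JSON object with a known label.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


print(parse_label('{"label": "relevant"}'))
```

A free-text reply such as "The text is relevant." would be rejected by this check, which is exactly why a strict format is convenient here: the consumer can fail fast instead of guessing.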
It’s Not About JSON, It’s About the Format Itself
Most people would stop at JSON. But in reality, you’ll be working with XML and YAML too. And they’re very different beasts.
- JSON, the safe default: JSON is popular for a reason. It’s compact and consistent, and it works great for simple data structures. But once you start getting into nested objects, lists of lists, and similar complexities, it can get a bit dicey. The model may know the answer and still get the structure wrong.
- XML, structured, but heavy: XML is more structured. Everything has to be tagged, and this can sometimes help the model get it right. But there’s a lot of overhead. It’s very verbose, and small mistakes (like an opening tag that doesn’t match the closing one) can cause the whole thing to fail. It’s a bit like writing with too many brackets: technically correct, but a bit exhausting.
- YAML, readable, but risky: YAML is much nicer to look at. It seems almost… human. But it’s deceptively finicky. One misplaced indent, and the whole thing fails. It’s a bit like Python, where indentation stands in for brackets: great when it works, but infuriating when it doesn’t.
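The failure modes described above are easy to reproduce. The sketch below uses only Python's standard library, so it covers JSON and XML; YAML is left out because the standard library has no YAML parser. The sample strings and helper names are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def is_valid_xml(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good_json = '{"items": [[1, 2], [3, 4]]}'
bad_json = '{"items": [[1, 2], [3, 4]}'     # one closing bracket is missing
good_xml = "<item><name>pen</name></item>"
bad_xml = "<item><name>pen</nam></item>"    # closing tag does not match

for sample in (good_json, bad_json):
    print("JSON ok:", is_valid_json(sample))
for sample in (good_xml, bad_xml):
    print("XML ok:", is_valid_xml(sample))
```

In both broken samples the data itself is fine; a single structural character is enough to make the whole reply unusable.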
The Real Trade-off
The biggest insight for me was this: The stricter the format, the less freedom the model has to think. Natural language gives the model room. Structure gives you control. And you’re always trading one for the other.
What Actually Works in Practice
After some experimentation with different formats, I eventually settled on this: if the task requires reasoning, let the model respond in free text first. If I need something structured, I convert the response into the desired format afterwards.
This two-step process works surprisingly well. It’s a bit like letting someone explain how they think about something and then asking them to condense it into a simple summary. You get the best of both worlds.
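As a rough sketch of that two-step process: the first call asks for free-text reasoning, and a second call only converts the finished explanation into JSON. The `ModelFn` interface, the prompt wording, and the `fake_model` stub (including its numbers) are all assumptions so that the example runs without a real model; swap in your own client.

```python
import json
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
ModelFn = Callable[[str], str]


def answer_in_two_steps(question: str, ask_model: ModelFn) -> dict:
    # Step 1: let the model reason in free text, with no format constraints.
    reasoning = ask_model(f"Answer step by step:\n{question}")
    # Step 2: a separate call that only converts the explanation to JSON.
    raw = ask_model(
        'Return only a JSON object of the form {"answer": <value>} '
        f"with the final answer from this explanation:\n{reasoning}"
    )
    result = json.loads(raw)  # raises json.JSONDecodeError on a malformed reply
    result["reasoning"] = reasoning
    return result


def fake_model(prompt: str) -> str:
    """Canned replies (invented figures) so the sketch runs offline."""
    if prompt.startswith("Return only a JSON object"):
        return '{"answer": 2300}'
    return "Base pay is 2000 and overtime adds 300, so the total is 2300."


result = answer_in_two_steps("What was John's salary?", fake_model)
print(result["answer"])
```

The conversion step is closer to a classification-style task than a reasoning one, so constraining its format costs little.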
So, What Should You Use?
There’s no hard-and-fast answer, but here’s my new mental framework:
- Use natural language when the task requires some thinking
- Use JSON for simple, structured data outputs
- Avoid using YAML unless you have to (it looks great, but breaks easily)
- Only use XML when you really need tags
- Use whatever you need, but don’t assume one format will work for everything
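For what it’s worth, the first two rules can be written down as a tiny decision helper. This is only one way to encode the framework; the function name and the strategy strings are made up for illustration.

```python
def choose_output_strategy(needs_reasoning: bool, needs_structure: bool) -> str:
    """Pick an output strategy following the rules of thumb above."""
    if needs_reasoning and needs_structure:
        # Reason in free text first, then convert in a second step.
        return "free_text_then_convert"
    if needs_reasoning:
        return "free_text"
    if needs_structure:
        # Simple structured data with no reasoning: JSON is the default.
        return "json"
    return "free_text"


print(choose_output_strategy(needs_reasoning=True, needs_structure=True))
```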
Final Thought
Before this, I used to think of output formats as a minor implementation detail. Now I see them as part of the prompt itself. The format doesn’t just determine the response – it determines the thinking. So instead of asking, “What format should I use?”, a better question is: “How much freedom does this task require?” Because sometimes, the secret to better performance is simply giving the model a bit more wiggle room.