
Why We Switched from JSON to TOON

Ibby Syed, Founder, Cotera
7 min read · May 20, 2026

One Change. 30% Less Spend. A Very Silly Name.

TOON vs JSON

What if I told you that you could earn back an entire engineer's salary just by using fewer curly braces? That's not quite true, but we saved almost $130,000 this year by switching from JSON to TOON.

Cotera runs a lot of AI agents: roughly 1.1 billion actions have been taken by agents on our platform. The other day, I was looking at a customer's order history, the raw payload that Shopify was returning to us. It contained about 10 orders, but the JSON ran to about 200 lines. It turns out that all of the quotation marks around keys like name, email, and status increase the number of tokens your model consumes by a significant percentage. Almost 40% of what we were sending was punctuation. Punctuation becomes tokens, and tokens become real costs in cents and dollars.
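If you want a rough feel for how much of a payload is punctuation, here is a minimal sketch. The order record is invented for illustration (it is not real Shopify output), and counting characters is only a proxy: tokenizers do not map one character to one token, so treat the result as a ballpark rather than a token measurement.

```python
import json

# Hypothetical order record, made up for this example.
order = {
    "id": 1001,
    "name": "Max Example",
    "email": "max@example.com",
    "status": "fulfilled",
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}

# Characters JSON uses for structure rather than data.
STRUCTURAL = set('{}[]":,')


def structural_share(payload: str) -> float:
    """Fraction of non-whitespace characters that are JSON punctuation.

    Crude on purpose: a colon or comma inside a string value is counted too.
    """
    visible = [ch for ch in payload if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(ch in STRUCTURAL for ch in visible) / len(visible)


payload = json.dumps(order, indent=2)
print(f"{structural_share(payload):.0%} of visible characters are punctuation")
```

Running this against one of your own payloads is a quick way to decide whether a format change is worth benchmarking properly.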

Initially, you might think, "Who cares? Why not just use a cheaper, less capable model or switch to an open-source one?" That might work, but tokens are only cheap when you look at them per call. Once you are making millions of calls, running agents in parallel, and processing data, what you send shifts overwhelmingly from plain strings to JSON. For instance, if you're running a chatbot that responds to customer support inquiries, a simple greeting like "Hello" can arrive wrapped in metadata: the sender's email, their location, potentially their browser, and a lot more. So "Hello, my name is Max" turns into 200 lines of metadata.

LLMs are trained on plenty of JSON, a format that dates back to 2001 and was built for web browsers. Douglas Crockford created it to transmit JavaScript objects, not to feed a four-trillion-parameter model and pay for every token.

One of our engineers sent me a link to a GitHub repository for a new format called TOON. I thought the name was amusing, but it turns out it's not a joke. TOON stands for Token-Oriented Object Notation, and it looks like YAML's cooler, younger sibling. It drops the quotes and braces and leans on indentation instead. The engineer suggested we could save 15% to 20% by switching our structured output variables from JSON to TOON. We decided to run an experiment.

The Experiment

Initially, we were skeptical. Since LLMs are trained on so much JSON, we assumed they would be more accurate with it, and that any cost savings would come at the expense of accuracy. For us, accuracy is more valuable than cost because our customers pay us for actions taken, not for the amount of data processed. For the experiment, we took 12 real test cases: customer support tickets, CRM objects, emails, calendar invites, and product catalogs from Shopify. We encoded each one in JSON and in TOON and, for good measure, in YAML. We sent each encoding to seven different LLMs, measured their accuracy, and counted their tokens.

Input tokens per 12 test cases (lower is better)

[Bar chart: input tokens (0 to 30k) for JSON, YAML, and TOON across Sonnet 4.6, Opus 4.7, GPT-5, Mistral Large 3, DeepSeek V3.1, MiniMax M2.5, and Kimi K2.5]

The results were interesting. Claude Sonnet used 24,759 input tokens for the JSON version, which dropped 35% to 16,089 with TOON. For Opus, there was a 31% reduction, from 28,923 to 19,935. GPT-5 saw a 29% drop. Across models, TOON's savings over JSON ranged from about 27% to 35%. YAML also saved tokens compared with JSON, but TOON beat it every time.
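For anyone who wants to check the percentages, the arithmetic is short. This sketch uses only the Sonnet and Opus token counts quoted above; the function name is ours.

```python
def pct_reduction(before: int, after: int) -> float:
    """Percentage drop going from `before` tokens to `after` tokens."""
    return (before - after) / before * 100


sonnet = pct_reduction(24_759, 16_089)
opus = pct_reduction(28_923, 19_935)
print(f"Sonnet: {sonnet:.1f}%  Opus: {opus:.1f}%")
# prints: Sonnet: 35.0%  Opus: 31.1%
```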

Accuracy

Then came the question: we've cut input tokens by up to 35%, but what happens to accuracy? We asked each model 12 comprehension questions about the data it was given and compared the responses to known answers.
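Comparing responses to known answers can be as simple as a normalized exact match. The sketch below is one way to do it, not the grader from our benchmark: the function names, the normalization rules, and the sample answers are all assumptions made up for illustration.

```python
def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(answer.lower().split()).rstrip(".")


def score(responses: list[str], expected: list[str]) -> int:
    """Count responses that match the known answer after normalization."""
    if len(responses) != len(expected):
        raise ValueError("responses and expected must be the same length")
    return sum(normalize(r) == normalize(e) for r, e in zip(responses, expected))


# Hypothetical model responses and known answers.
responses = ["Sarah Chen", "enterprise.", "4200"]
expected = ["sarah chen", "enterprise", "4,200"]
print(score(responses, expected))  # prints: 2
```

Exact match is strict on purpose. In this example "4200" and "4,200" count as a miss, which is the kind of case where you would want to decide up front whether formatting differences should be forgiven.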

Correct answers out of 12 (higher is better)

[Bar chart: correct answers (0 to 12) for JSON and TOON across Sonnet 4.6, Opus 4.7, GPT-5, Mistral Large 3, DeepSeek V3.1, MiniMax M2.5, and Kimi K2.5]

Those results were intriguing as well. Opus answered 9 out of 12 correctly on JSON and 11 on TOON. DeepSeek scored 5 on JSON and 7 on TOON. MiniMax and Kimi both went 12 for 12 on both formats. The key finding: not a single model performed worse on TOON than on JSON. If I had to hypothesize, the reason is the smaller context: with less punctuation to wade through, there is less room for the model to drift. I expected accuracy to be roughly equivalent, so the improvement genuinely surprised me.

What TOON Looks Like

Side by side, the two formats look very similar, except that TOON is missing one thing: most of the punctuation. Here's a customer record in JSON:

{
  "name": "Acme Corp",
  "plan": "enterprise",
  "mrr": 4200,
  "contacts": [
    {
      "name": "Sarah Chen",
      "role": "VP Sales",
      "email": "sarah@acme.com"
    },
    {
      "name": "Mike Ross",
      "role": "CRO",
      "email": "mike@acme.com"
    }
  ]
}

And the same thing in TOON:

name: Acme Corp
plan: enterprise
mrr: 4200
contacts:
  name: Sarah Chen
  role: VP Sales
  email: sarah@acme.com

  name: Mike Ross
  role: CRO
  email: mike@acme.com

There are no commas, brackets, or quotes. Indentation handles the nesting. At first glance, the difference may not seem significant, but across a million records it adds up to a considerable cost reduction.
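To make the comparison concrete, here is a small sketch that renders the customer record in the indentation-based layout shown above and compares the sizes. This is not the official TOON encoder, and `to_indented` is a name we made up. It assumes every list holds dicts and that no value needs escaping. It also counts characters, not tokens, so the real savings depend on the tokenizer.

```python
import json

record = {
    "name": "Acme Corp",
    "plan": "enterprise",
    "mrr": 4200,
    "contacts": [
        {"name": "Sarah Chen", "role": "VP Sales", "email": "sarah@acme.com"},
        {"name": "Mike Ross", "role": "CRO", "email": "mike@acme.com"},
    ],
}


def to_indented(obj: dict, depth: int = 0) -> str:
    """Render a dict in the quote-free, indentation-based layout shown above.

    Illustrative only: assumes lists contain dicts and values need no escaping.
    """
    pad = "  " * depth
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_indented(value, depth + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            # Records in a list are separated by a blank line.
            lines.append("\n\n".join(to_indented(item, depth + 1) for item in value))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


as_json = json.dumps(record, indent=2)
as_indented = to_indented(record)
print(as_indented)
print(f"JSON: {len(as_json)} chars, indented form: {len(as_indented)} chars")
```

For real workloads, use the encoder from the TOON repository rather than a hand-rolled function like this one, since it handles the escaping and edge cases this sketch ignores.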

The Money

On the financial side, Anthropic charges about three dollars per million input tokens on Claude Sonnet. Across the 12 test cases, TOON saved 8,670 Sonnet input tokens against JSON. Scale that from a single benchmark run to millions of agent calls and the savings become significant. By our back-of-the-envelope calculations, switching to TOON saves us close to what we pay a full-time junior implementation engineer.
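Here is what that back-of-the-envelope math looks like using only the benchmark numbers quoted above. The big assumption, flagged in the code, is that an average agent call carries about one test case's worth of payload. Real payloads and call volumes vary a lot, so this is an order-of-magnitude sketch and not our actual bill.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, Claude Sonnet, as quoted above
TOKENS_SAVED_PER_BENCHMARK = 8_670     # Sonnet, JSON vs TOON, all 12 test cases
TEST_CASES = 12

# Assumption: one agent call is roughly one test case's worth of payload.
tokens_saved_per_call = TOKENS_SAVED_PER_BENCHMARK / TEST_CASES  # 722.5


def savings_usd(calls: int) -> float:
    """Estimated input-token savings in USD for a given number of calls."""
    return calls * tokens_saved_per_call * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000


print(f"${savings_usd(1_000_000):,.2f} saved per million calls")
# prints: $2,167.50 saved per million calls
```

Plug in your own call volume and average payload size. The per-call number is small, which is exactly why it only matters at scale.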

To put the effort into perspective, we didn't have to refactor much; we simply switched the outputs from one style to another. We anticipate about a day's worth of adjustments to our custom code, with no significant impact on our operations.

Output tokens are a separate story. We also asked models to generate structured data in TOON rather than JSON. Most of them produced fewer tokens, although they don't always get the format exactly right, since structured output is still set up around JSON. Even so, the savings on the tool outputs we feed into the models were significant enough on their own to justify the investment.

We Shipped It

So, what are we changing in the business? Last week, we migrated all of our tools from JSON to TOON, so that everything any LLM on Cotera interacts with is formatted in TOON. I'm also planning to host a very large pizza party for the engineer who initiated this change.

The benchmark code is available on GitHub for anyone who wants to try it. It covers 12 comprehension cases and 10 generation cases across seven models. Running it takes about 15 minutes. If you have feedback or spot any gaps we missed, please let us know.
