Nested enumeration to decrease file size #216
Replies: 2 comments
-
|
if you wanted to save more space and add some complexity you could make the enumeration indexing rather then using base 10 you could use base 62 (https://en.wikipedia.org/wiki/Base62) that way after enumeration index 10 to 61 still only requires 1 space, and enumeration 62 to 3843 only requires 2 spaces. You could do this for all array indexing in this. |
Beta Was this translation helpful? Give feedback.
-
|
Thanks for taking the time to think about compression tricks! To make sure I understand your proposal:
Conceptually this is a form of schema + dictionary encoding built into the format. For TOON specifically, this runs into a couple of deliberate design constraints:
Where I absolutely agree with you is that dictionary‑style compression is useful, especially for very repetitive columns. I think the right place for that, though, is on top of TOON as an application convention, not inside the core syntax. For example, you can already do: and describe in your prompt or application logic that Similarly, if you really want base‑62 codes, you can use them as strings in your own convention: TOON doesn't need to know that Given the spec's current goals and constraints (JSON data model only, decimal numbers only, no schema layer), I don't plan to add inline enumerations or base‑62 indices to the core format. I'm going to close this as out of scope for the spec, but I do appreciate the idea very much! 🙌 If you end up building a "TOON + dictionary encoding" convention or pre‑processor and have benchmark numbers (tokens, accuracy) for that vs plain TOON, I'd be very interested in seeing those results – they'd be great to link from the docs as an advanced pattern. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
nested enumeration to decrease file size
On finite values this enables not reprinting longer duplicates over and over.
1,Alice,admin
2,Bob,user
3,Bob1,user
4,Bob2,user
5,Bob3,user
6,Bob4,user
7,Bob5,user
...
Beta Was this translation helpful? Give feedback.
All reactions