I remember an article about a company that kept hitting AWS quotas. Their JSON payload fit within the limits on its own, but it was placed in a string field inside another JSON object used for server-to-server communication. As a result, every quote and backslash was escaped a second time as \" and \\, which increased the payload size.
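A minimal sketch of that effect, using only Python's standard json module. The payload and its field names are made up for illustration; they are not taken from the article.

```python
import json

# Hypothetical inner payload with quotes and backslashes in its values.
inner = {"path": "C:\\temp\\report.txt", "note": 'say "hi"'}

inner_text = json.dumps(inner)

# Embedding the serialized text as a string field escapes it a second time.
outer_text = json.dumps({"payload": inner_text})

# Embedding the object itself avoids the extra escaping.
direct_text = json.dumps({"payload": inner})

print("inner JSON:       ", len(inner_text), "chars")
print("nested as object: ", len(direct_text), "chars")
print("nested as string: ", len(outer_text), "chars")
```

The more quotes and backslashes the inner document contains, the bigger the gap between the last two numbers gets.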
String encoding is also the reason why, for example, serde (a Rust serialization/deserialization library) can give you Cow<str> if you want to avoid extra allocations: when a string contains no escape sequences, it can be passed as a reference into the original input, but when replacements like \" -> " are needed, that part of the input has to be copied anyway.
I'm not saying this is bad; it's how text formats work in general, not just JSON. But if you need to put arbitrary data into objects, think twice: a binary format like MessagePack, BSON, or even a custom Protobuf schema will probably be much more efficient for your task.
Also, text formats like JSON are poorly suited to streaming with an ordinary parser, and loading a big object into RAM is a very bad idea. If the data is an array, you can separate the objects with newlines instead of wrapping them in JSON's [ ]. In other cases, look for a SAX-like library (search for something like "stream json") for your programming language.
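The newline-separated approach can be sketched like this. The iter_records helper and the sample data are hypothetical; the in-memory io.StringIO stands in for a large file or a socket.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, without loading the whole input."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for a big newline-delimited file: one JSON object per line.
data = io.StringIO('{"id": 1}\n{"id": 2}\n{"id": 3}\n')

total = sum(record["id"] for record in iter_records(data))
print(total)  # 6
```

Only one line is held in memory at a time, so the input size is limited by the longest single record rather than by the whole array.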
I won't give a specific example here, but I'm sure there are developers who encode a file with base64 to send it inside a JSON request. Please remember that base64 inflates the payload by about 1.33x [^1], so you should either send the file in a separate HTTP request or use multipart/form-data. Oh, or encode your objects with a binary format. The last two options are fine when you're working with small files and insist on doing everything in one request; otherwise, upload the data in separate requests in parallel.
[^1] The formula for the length of a (padded) base64 string is:
4 * ceil(original_length / 3)
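A quick way to check the formula against Python's standard base64 module. The b64_len helper name is made up for this sketch, and the sizes are arbitrary.

```python
import base64
import math


def b64_len(original_length: int) -> int:
    """Length of the padded base64 encoding of original_length bytes."""
    return 4 * math.ceil(original_length / 3)


# Compare the formula with the real encoder for a few input sizes.
for n in (0, 1, 2, 3, 4, 1000, 1_000_000):
    assert len(base64.b64encode(bytes(n))) == b64_len(n)

# Overhead ratio for a 1 MB file.
print(b64_len(1_000_000) / 1_000_000)
```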
Another example of how definitely NOT to do it is Piped (a privacy frontend for YouTube). Some of its API endpoints return a nextpage object containing session info used to request the next page of a channel, a playlist, search results, or comments. The problem is that it's a JSON object put inside a string, as explained above: "nextpage":"{\"url\":\"https…
Even funnier, there is a body field inside this nextpage object that contains another JSON object, encoded in base64, so there are 3 layers of text format encoding.
And when a client requests the next page, the object is sent in the GET query string, so it gets URL-encoded (percent-encoded), resulting in 4 layers!! I don't know why browsers don't reject such long, ugly URLs.
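To get a feel for how the layers add up, here is a rough sketch with the standard library. The session data, field values, and URL are invented; this is not the real Piped or YouTube payload, only the same kind of nesting.

```python
import base64
import json
from urllib.parse import quote, unquote

# Hypothetical session data standing in for the real context object.
body = {"continuation": "abc123", "context": {"client": "WEB", "hl": "en"}}

# JSON text of the inner object.
body_json = json.dumps(body)

# The inner JSON is base64-encoded and placed in a "body" field.
body_b64 = base64.b64encode(body_json.encode()).decode()
nextpage_json = json.dumps({"url": "https://example.invalid/next", "body": body_b64})

# The nextpage JSON is put inside a string field of the API response.
response_json = json.dumps({"nextpage": nextpage_json})

# The client sends the nextpage string back percent-encoded in the query string.
query_value = quote(nextpage_json, safe="")

for name, value in [
    ("inner JSON", body_json),
    ("base64 body", body_b64),
    ("nextpage JSON", nextpage_json),
    ("nextpage in response", response_json),
    ("percent-encoded", query_value),
]:
    print(f"{name:22} {len(value)} chars")
```

Decoding has to peel the layers off in reverse order: unquote, json.loads, base64.b64decode, json.loads again.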
Everything before the query string is excusable if the internal YouTube API itself requires such a format for a context/session object. If I understand it correctly, Invidious doesn't care about the context at all and sends a clean request.
And the most stupid JSON use case, I think, is JWT. It encodes an already-plaintext format with base64 (which is intended for converting binary data to ASCII text; Piped does the same, but we forgave it). Moreover, it does this to 2 objects (the header and the payload), and a token with that much overhead then often ends up stored in cookies.
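A minimal sketch of how an HS256-signed token is put together, using only the standard library, to show where the overhead comes from. The claims and the secret are made up, and real code should use a maintained JWT library rather than this.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """URL-safe base64 without padding, the variant JWT uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890", "name": "Example User", "admin": True}  # hypothetical claims
secret = b"demo-secret"  # hypothetical key, for illustration only

raw_header = json.dumps(header, separators=(",", ":")).encode()
raw_payload = json.dumps(payload, separators=(",", ":")).encode()

# Both JSON objects are base64url-encoded and joined with a dot.
signing_input = f"{b64url(raw_header)}.{b64url(raw_payload)}"
signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
token = f"{signing_input}.{b64url(signature)}"

json_bytes = len(raw_header) + len(raw_payload)
encoded_bytes = len(signing_input) - 1  # minus the dot separator

print("header + payload as JSON:     ", json_bytes, "bytes")
print("header + payload as base64url:", encoded_bytes, "bytes")
print("whole token:                  ", len(token), "bytes")
```

Only the signature is binary data that actually needs base64; the header and the payload were already text.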
By the way, want a JSON config in your software? Take a look at Hjson, which is much more convenient to write by hand.
#json #msgpack #bson
#web #performance #optimization
#advice