Mastodawn

A Faster Alternative to Jq

jsongrep is faster than {jq, jmespath, jsonpath-rust, jql}

An introduction to the jsongrep tool, a technical explanation of its DFA-based search engine, and performance results against popular JSON query tools.

Micah's Secret Blog

Show thread

Bigpet 2d ago

When initially opening the page it had broken colors in light mode. For anyone else encountering it: switch to dark mode and then back to light mode to fix it.

Show thread

keysersoze33 2d ago

I had the same problem (brave browser)

Show thread

jvdvegt 2d ago

Fine in Firefox on Android.
Note that the scales of the charts are all different, which makes them hard to compare.

Also, there are lots of charts without comparison so the numbers mean nothing...

Show thread

CodeCompost 2d ago

I suspect the website is vibe-coded, like the tool itself.

Show thread

shellac 2d ago

I think this has just been fixed. A bit of dark mode was leaking into light in the css.

Show thread

majewsky 1d ago

I still saw the same bug just now (Firefox on macOS).

Show thread

Kovah 2d ago

I wonder so often about many new CLI tools whose primary selling point is their speed over other tools. Yet I personally have not encountered any case where a tool like jq feels incredibly slow, and I would feel the urge to find something else.
What do people do all day that existing tools are no longer enough? Or is it that kind of "my new terminal opens 107ms faster now, and I don't notice it, but I simply feel better because I know"?

Show thread

n_e 2d ago

I process TB-size ndjson files. I want to use jq to do some simple transformations between stages of the processing pipeline (e.g. rename a field), but it so slow that I write a single-use node or rust script instead.

Show thread

nchmy 2d ago

This isn't for you then

> The query language is deliberately less expressive than jq's. jsongrep is a search tool, not a transformation tool-- it finds values but doesn't compute new ones. There are no filters, no arithmetic, no string interpolation.

Mind me asking what sorts of TB json files you work with? Seems excessively immense.

[dead]

> Uses jq for TB json files

> Hadoop: bro

> Spark: bro

> hive: bro

> data team: bro

Show thread

messe 2d ago

Now I'm really curious. What field are you in that ndjson files of that size are common?

I'm sure there are reasons against switching to something more efficient–we've all been there–I'm just surprised.

Show thread

eru 2d ago

This reminds me of someone who wrote a regex tool that matches by compiling regexes (at runtime of the tool) via LLVM to native code.

You could probably do something similar for a faster jq.

Show thread

InfinityByTen 2d ago

You don't know something is slow until you encounter a use case where the speed becomes noticeable. Then you see the slowness across the board. If you can notice that a command hasn't completed and you are able to fully process a thought about it, it's slow(er than your mind, ergo slow!).

Usually, a perceptive user/technical mind is able to tweak their usage of the tools around their limitations, but if you can find a tool that doesn't have those limitations, it feels far more superior.

The only place where ripgrep hasn't seeped into my workflow for example, is after the pipe and that's just out of (bad?) habit. So much so, sometimes I'll do this foolishly rg "<term>" | grep <second filter>; then proceed to do a metaphoric facepalm on my mind. Let's see if jg can make me go jg <term> | jq <transformation> :)

Show thread

password4321 1d ago

Optimization = good

Prioritizing SEO-ing speed over supporting the same features/syntax (especially without an immediately prominent disclosure of these deficiencies) = marketing bullshit

A faster jq except it can't do what jq does... maybe I can use this as a pre-filter when necessary.

Show thread

swiftcoder 1d ago

Deal with really big log files, mostly.

If you work at a hyperscaler, service log volume borders on the insane, and while there is a whole pile of tooling around logs, often there's no real substitute for pulling a couple of terabytes locally and going to town on them.

Show thread

xlii 1d ago

It's a simple loop:

- Someone likes tool X

- Figures, that they can vibe code alternative

- They take Rust for performance or FAVORITE_LANG for credentials

- Claude implements small subset of features

- Benchmark subset

- Claim win, profit on showcase

Note: this particular project doesn't have many visible tells, but there's pattern of overdocumentation (17% comment-to-code ratio, >1000 words in README, Claude-like comment patterns), so it might be a guided process.

I still think that the project follows the "subset is faster than set" trend.

Show thread

hackrmn 2d ago

Having used `jq` and `yq` (which followed from the former, in spirit), I have never had to complain about performance of the _latter_ which an order of magnitude (or several) _slower_ than the former. So if there's something faster than `jq`, it's laudable that the author of the faster tool accomplished such a goal, but in the broader context I'd say the performance benefit would be required by a niche slice of the userbase. People who analyse JSON-formatted logs, perhaps? Then again, newline-delimited JSON reigns supreme in that particular kind of scenario, making the point of a faster `jq` moot again.

However, as someone who always loved faster software and being an optimisation nerd, hat's off!

Show thread

bungle 2d ago

Integrating with server software, the performance is nice to have, as you can have say 100 kRPS requests coming in that need some jq-like logic. For CLI tool, like you said, the performance of any of them is ok, for most of the cases.

Show thread

mroche 2d ago

> Having used `jq` and `yq`

If you don't mind me asking, which yq? There's a Go variant and a Python pass-through variant, the latter also including xq and tomlq.

Show thread

jiehong 2d ago

First of all, congratulations! Nice tool!

Second, some comments on the presentation: the horizontal violin graphs are nice, but all tools have the same colours, and so it's just hard to even spot where jsongrep is. I'd recommend grouping by tool and colour coding it. Besides, jq itself isn't in the graphs at all (but the title of the post made me think it would be!).

Last, xLarge is a 190MiB file. I was surprised by that. It seems too low for xLarge. I daily check 400MiB json documents, and sometimes GiB ones.

Show thread

1a527dd5 2d ago

I appreciate performance as much as the next person; but I see this endless battle to measure things in ns/us/ms as performative.

Sure there are 0.000001% edge cases where that MIGHT be the next big bottleneck.

I see the same thing repeated in various front end tooling too. They all claim to be _much_ faster than their counterpart.

9/10 whatever tooling you are using now will be perfectly fine. Example; I use grep a lot in an ad hoc manner on really large files I switch to rg. But that is only in the handful of cases.

Show thread

dalvrosa 2d ago

Fair, but agentic tooling can benefit quite a lot from this

Opencode, ClaudeCode, etc, feel slow. Whatever make them faster is a win :)

Show thread

jamespo 1d ago

It's not running jq locally that's causing that

Show thread

httpsterio 1d ago

The 2ms it takes to run jq versus the 0.2ms to run an alternative is not why your coding agent feels slow.

Show thread

montroser 2d ago

Then this is for the handful of cases for you. When it matters it matters.

Show thread

mikojan 1d ago

> I see the same thing repeated in various front end tooling too. They all claim to be _much_ faster than their counterpart.

> 9/10 whatever tooling you are using now will be perfectly fine

Are you working in frontend? On non-trivial webapps? Because this is entirely wrong in my experience. Performance issues are the #1 complaint of everyone on the frontend team. Be that in compiling, testing or (to a lesser extend) the actual app.

Show thread

lemagedurage 1d ago

True. I feel like the main way a tool could differentiate from jq is having more intuitive syntax and many real world examples to show off the syntax.

Show thread

j1elo 1d ago

Whenever you have this kind of impressions on some development, here are my 2 cents: just think "I'm not the target audience". And that's fine.

The difference between 2ms and 0.2ms might sound unneeded, or even silly to you. But somebody, somewhere, is doing stream processing of TB-sized JSON objects, and they will care. These news are for them.