@navi @dalias @ska

I have been trying my hardest to keep Python out of the dependencies of gcc. Newer folks know Python better than awk or shell, it seems. Awk itself is a very powerful language, especially when it comes to text processing. Even POSIX awk has things like arrays indexed by strings.

Right now all of the files generated by Python are checked into the repo (a file which then gets processed by awk anyway). I am hoping to keep it that way.

@pinskia @navi @dalias @ska I write a lot of Awk. It's very good for text processing. It's also much, much faster than Python.

I once had large CSV files I had to process. They got updated regularly and were about 2GiB. I wrote the processor in Python and it worked, but took a while to run. It wasn't a long time, but it was something you could pull out that XKCD comic about compiling for.

The CSV was regular enough that Awk would work. So I rewrote it in Awk and a shell pipeline. The result? It was about 10x faster. It made basically everything about working with it easier.

That was back in 2014. Even today I will start something in Python, realize Awk works just as well and switch. And I have done some rewrites of small tools for myself and the speed difference is still there.

And I've even done binary file handling with GNU Awk. I used it to process GOFF files on IBM mainframes. It was much easier than anything else.

@gwozniak @navi @dalias @ska

Yes, the speed of awk is impressive. I suspect part (most) of it is because the regex gets compiled, while most folks using Python do not compile their regexes at all, and the handling of state just falls over.
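
For anyone who does stay in Python, here is a minimal sketch of compiling the regex once up front, roughly the way awk treats a pattern in a rule. The `Author:` line format, `AUTHOR_RE`, and `count_authors` are made up for illustration and are not taken from any gcc script. Python's `re` module also caches recently used patterns internally, so how much this buys you depends on the workload.

```python
import re

# Hypothetical pattern: match lines like "Author: Some Name <email>".
# Compiled once at import time instead of on every call.
AUTHOR_RE = re.compile(r"^Author:\s+(.*?)\s+<([^>]+)>$")


def count_authors(lines):
    """Count matching lines per email, using a dict keyed by string
    much like an awk associative array."""
    counts = {}
    for line in lines:
        m = AUTHOR_RE.match(line)
        if m:
            email = m.group(2)
            counts[email] = counts.get(email, 0) + 1
    return counts
```

The dict here plays the same role as `count[email]++` would in awk.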

Look at https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712421.html for example. The time it takes to process the `git log` output is negligible. Generating the git log is what takes almost all of the time.

[RFC] Add community health scripts