A thing that has always frustrated me about github/bitbucket, as a language designer, is that you can't teach the forge to syntax highlight files in your own custom formats.

Now the existence of Codeberg/git.gay means potentially I could create a PR to forgejo to add this feature and it would get added to the forges I actually use. Perhaps at some point I will do this.

Anybody know off the top of their heads what syntax-highlighter format Codeberg/Forgejo even uses?

Oh. it's… oh.

It's… custom Go code… on a per-language basis. They use something called Chroma, and the way Chroma works is they wrote custom lexers in Go for each language they want to support. Um. Hm.

This is actually the one single approach they could have attempted which prevents custom pluggable highlighters on a per-repo basis.

https://hey.hagelb.org/@technomancy/statuses/01KNQJ9H3R64BEHE1QWNBXZKVW

technomancy (@technomancy@hey.hagelb.org)

@mcc last I checked it was https://github.com/alecthomas/chroma; I remember sending a patch to support Fennel and it was handled pretty promptly

@mcc you would really think that "one widely-supported declarative non-executable grammar format for syntax highlighting" would be a solved problem by now but it kinda feels like tree sitter is sucking up all the oxygen in that space; don't love how that's going

@technomancy @mcc Treesitter is the biggest engineering and design trash fire that I have seen in a long time.

If I had to give people advice on how to tackle the problem of grammars and editor support, I'd point them to TreeSitter and tell them to *not* do that.

#treesitter

@technomancy @mcc It's as if they looked at the existing problems and requirements, and then tried coming up with the dumbest "solutions" just for shits and giggles.

Even if I tried, I wouldn't be able to come up with the sheer density of painfully wrong decisions they made.

It's all "I'd love to understand the state of their mind that led them to believing that shit to be a valid design/engineering option" the way down.

#treesitter

@soc @technomancy Are you aware of any formats that are actively good for this, or is textmate the one option for data-driven approaches?

@mcc @technomancy No, I don't think good formats exist right now:

There is a rather empty niche between these simplistic, easy-to-build regex grammars and fully-featured IDE plugins.

TextMate2 grammars (and by extension Chroma grammars ... any basic syntax highlighting likely has TM2 grammars as a common ancestor) are "good" largely because of the little effort you need to spend to get things working.
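
To make "little effort" concrete, here is a rough sketch of what a regex grammar boils down to: an ordered list of (token type, pattern) pairs that a generic loop walks over the source. The grammar, the token names and the `tokenize` helper are all made up for illustration; this is not TextMate's or Chroma's actual format or engine, just the general shape of the idea.

```python
import re

# Hypothetical grammar for a made-up language: plain data, no executable code.
# Earlier rules win when several could match at the same position.
GRAMMAR = [
    ("comment", r"#[^\n]*"),
    ("string", r'"[^"\n]*"'),
    ("keyword", r"\b(?:fun|let|if|else)\b"),
    ("number", r"\b\d+\b"),
]


def tokenize(source, grammar=GRAMMAR):
    """Return (token_type, text) pairs; unmatched text is tagged "text"."""
    rules = [(name, re.compile(pattern)) for name, pattern in grammar]
    tokens = []
    pos = 0
    while pos < len(source):
        for name, regex in rules:
            match = regex.match(source, pos)
            if match and match.end() > pos:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched here: fold one character into a "text" token.
            if tokens and tokens[-1][0] == "text":
                tokens[-1] = ("text", tokens[-1][1] + source[pos])
            else:
                tokens.append(("text", source[pos]))
            pos += 1
    return tokens
```

Everything language-specific lives in `GRAMMAR`, so the same loop could be fed a list loaded from JSON or XML. What a flat list like this cannot express is nesting or context, which is roughly where the "empty niche" above begins.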

@mcc @technomancy This was enough to get pretty nice syntax highlighting in IntelliJ (and other editors):

https://codeberg.org/core-lang/core/src/branch/main/tooling/core.tmbundle/Syntaxes/core.tmLanguage.json

The even more bare-bone Chroma grammar that's used on Codeberg looks like this:

https://github.com/alecthomas/chroma/blob/master/lexers/embedded/core.xml

If I wanted anything fancier, I would likely not invest further time into these grammars, but start implementing a language server or an IDE plugin.

@soc @technomancy hm, I'm confused. I glanced at the Chroma repo and it looked like Chroma was tree-sitter-like, with each language being handled by custom Go code. Was I missing something?

@soc @technomancy put a different way: Say Codeberg uses Chroma. Could Chroma be made to extract one of these XML files out of a directory in a Codeberg repo and syntax highlight based on the local XML file? Would Chroma be easy to patch to dynamically load such an XML file at runtime?
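
For illustration only, the "look in the repo for a grammar file" half of that question might be sketched like this. It is Python, not Chroma or Forgejo code. The `.highlight` directory name is hypothetical, and the XML element names are only loosely modelled on the linked core.xml, not checked against Chroma's real schema.

```python
import fnmatch
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical grammar file; element names are an assumption, not Chroma's schema.
SAMPLE_XML = r"""<lexer>
  <config>
    <name>Toy</name>
    <filename>*.toy</filename>
  </config>
  <rules>
    <state name="root">
      <rule pattern="#[^\n]*"><token type="Comment"/></rule>
      <rule pattern="\b(?:fun|let)\b"><token type="Keyword"/></rule>
    </state>
  </rules>
</lexer>
"""


def load_lexer(xml_path):
    """Parse one grammar file into (filename_globs, [(token_type, regex)])."""
    root = ET.parse(xml_path).getroot()
    globs = [el.text for el in root.findall("./config/filename")]
    rules = []
    for rule in root.findall("./rules/state[@name='root']/rule"):
        token = rule.find("token")
        if token is not None:
            rules.append((token.get("type"), re.compile(rule.get("pattern"))))
    return globs, rules


def find_repo_lexer(repo_dir, filename, subdir=".highlight"):
    """Return the rules of the first repo-local grammar that claims `filename`."""
    grammar_dir = Path(repo_dir, subdir)
    if not grammar_dir.is_dir():
        return None
    for xml_path in sorted(grammar_dir.glob("*.xml")):
        globs, rules = load_lexer(xml_path)
        if any(fnmatch.fnmatch(filename, g) for g in globs):
            return rules
    return None
```

A forge doing this for real would also have to treat the patterns as untrusted input: Python's `re` is a backtracking engine, so a hostile repo could ship a pattern that takes a very long time to match.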