JSON-LD can be nice to work with as a JSON object, for example:

https://kolektiva.social/@anarchivist/111905336837934109.json

But it can also be very difficult to work with, for example:

https://id.loc.gov/authorities/subjects/sh85079255.jsonld

Since you don't really know what you're going to get, you need heavy RDF processing tools just to work with some JSON data. I think that's why people don't like JSON-LD.
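To make that concrete, here's a minimal Python sketch (all URLs and keys invented, not the actual LoC or Mastodon payloads) of why a plain-JSON consumer can't use one code path: both documents below are valid JSON-LD for the "same" resource, but one is an ordinary object and the other is a flat `@graph` of nodes.

```python
import json

# Two valid JSON-LD shapes for the "same" resource. The first reads like
# ordinary JSON; the second is a flat @graph, roughly the shape that
# makes consumers reach for RDF tooling. All values are illustrative.
compact = json.loads("""
{
  "@id": "http://example.org/sh1",
  "label": "Poetry"
}
""")

graph_shaped = json.loads("""
{
  "@graph": [
    {"@id": "http://example.org/other", "label": "Something else"},
    {"@id": "http://example.org/sh1", "label": "Poetry"}
  ]
}
""")

def get_label(doc, subject):
    """Without RDF tooling, consumers need shape-specific code like this."""
    if doc.get("@id") == subject:
        return doc.get("label")
    for node in doc.get("@graph", []):
        if node.get("@id") == subject:
            return node.get("label")
    return None

print(get_label(compact, "http://example.org/sh1"))       # Poetry
print(get_label(graph_shaped, "http://example.org/sh1"))  # Poetry
```

The same lookup needs two branches even in this toy case; a real publisher can emit many more shapes, all equally "valid JSON-LD".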

@edsu This just depends on the tools loc.gov used to create the JSON-LD, right? It could just be a JSON file, pretty readable, with a context inline or known to exist. I'm not a big fan of JSON-LD, but this is not the reason. It is a compromise solution. As ugly and useful as XML.

@hochstenbach yes, they could have chosen to publish it differently, but they didn't, and it's still valid JSON-LD. Uncertainty about how it's going to be structured raises the bar for everyone who wants to use it.

But I guess different communities on the web could have norms of usage that take some of the guesswork out of parsing.

@edsu Indeed, it would have taken LOC 10 minutes of work to produce an output that is much easier to consume, using a JSON-LD frame on their side. E.g. https://gist.github.com/phochste/39562b6cf51585d983208eaab61af22f
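Framing is the operation being suggested here: it reshapes a flat graph into a predictable tree rooted at a chosen node type. Real code would use a library such as pyld's `jsonld.frame()`; this hand-rolled sketch (with invented node IDs and keys, not the actual LoC frame from the gist) just shows the idea.

```python
import json

# A flat @graph, the kind of shape id.loc.gov-style output has: the
# label lives in a separate node, linked by a blank-node reference.
flat = {
    "@graph": [
        {"@id": "_:b0", "@type": "Label", "value": "Poetry"},
        {"@id": "http://example.org/sh1",
         "@type": "Concept",
         "prefLabel": {"@id": "_:b0"}},
    ]
}

def simple_frame(doc, root_type):
    """Toy stand-in for JSON-LD framing: pick the node with the given
    @type and embed the nodes it references, producing a tree."""
    nodes = {n["@id"]: n for n in doc["@graph"]}

    def embed(node):
        out = {}
        for key, value in node.items():
            if isinstance(value, dict) and value.get("@id") in nodes:
                out[key] = embed(nodes[value["@id"]])
            else:
                out[key] = value
        return out

    root = next(n for n in doc["@graph"] if n.get("@type") == root_type)
    return embed(root)

framed = simple_frame(flat, "Concept")
print(json.dumps(framed, indent=2))
```

After framing, a consumer can write `framed["prefLabel"]["value"]` instead of chasing `@id` references across the graph.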
@hochstenbach Yeah, I guess that's what I'm trying to say, that using JSON-LD requires a stack of software to process it (in addition to the usual JSON support).
@edsu @hochstenbach JSON-LD is misunderstood. Data providers should start with a clear JSON format to be usable without any knowledge or interest in RDF. Then add JSON-LD context on top, to make it RDF as well. In practice, it's often done the other way round, without any benefit compared to other RDF serializations.
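The "clear JSON first, context on top" approach sketched in Python (the SKOS property mappings are illustrative, not what LoC actually publishes):

```python
# Start from plain, friendly JSON that anyone can consume without
# knowing RDF; the keys are short and the nesting is predictable.
record = {
    "id": "http://example.org/sh1",
    "label": "Poetry",
    "broader": ["http://example.org/sh0"],
}

# Then add an @context mapping the friendly keys to RDF terms, which
# turns the same document into JSON-LD. Vocabulary here is illustrative.
context = {
    "id": "@id",
    "label": "http://www.w3.org/2004/02/skos/core#prefLabel",
    "broader": {
        "@id": "http://www.w3.org/2004/02/skos/core#broader",
        "@type": "@id",
    },
}

jsonld_doc = {"@context": context, **record}

# Plain-JSON consumers can ignore @context entirely:
print(jsonld_doc["label"])  # Poetry
```

RDF-aware consumers expand the document via the context; everyone else just reads the JSON.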
@nichtich @edsu @hochstenbach Yes! That is an essential part of Linked Open Usable Data, and we try to do this with #lobid : creating nice nested JSON that is easily consumed and, indexed in Elasticsearch, queryable via HTTP in complex ways on every level (rather intuitively, once you make yourself a bit familiar with the JSON's fields and structure, though not as complex as SPARQL).
@edsu @hochstenbach I made this exact comment to them when I was doing a project on LoC data.

@thatandromeda @edsu @hochstenbach trying to figure out how much to say about what an absurd pain we found it to parse the LCNAF as JSON-LD as serialized by LC 😭

For actual processing, we output working files which had the data we needed and I think our script was still only at like -- 80% of getting things there, but we could throw out name-title authorities

@thatandromeda @hochstenbach what did they say?
@edsu @hochstenbach i do not recall (it was in a context where they were gathering a lot of feedback and synthesizing it later)

@thatandromeda @hochstenbach I got worked up enough to write a blog post, lol: https://inkdroid.org/2024/02/14/publishing-jsonld/

Hopefully it was ok to quote your post @acka47 ?


@edsu aha, THIS I will cite!
@edsu *emails graduate assistant with subject line: VINDICATION (JSON-LD)*

@edsu it took us a _month_ to work through

and she was like "I thought I was skilled enough to process JSON with Python" and I had to reassure her that this was like -- something nobody really used -- for REASONS.

@edsu @thatandromeda @acka47 that is the way! I really hope loc will pick this up and output some pretty JSON that is also LD.
@edsu @hochstenbach @acka47 “probably easier to use one of the XML representations instead” _ouch_
@edsu No problem, thanks! @thatandromeda @hochstenbach
@edsu At #elag2019, we already did a hands-on bootcamp on creating #LOUD from the #Bibframe works dataset, indexing it and using it e.g. with #Openrefine. Slides: https://hbz.github.io/elag2019-bootcamp/ repo: https://github.com/hbz/elag2019-bootcamp I haven't heard of any new Bibframe works bulk download since, though. @thatandromeda @hochstenbach

@edsu @thatandromeda @hochstenbach @acka47 "But really, in my opinion, it just means publishing things at URLs you intend to manage so people can link to them over time" Oh, yes! Kudos to Edsu, as always
@edsu @thatandromeda @hochstenbach @acka47 There are specific reasons the JSON-LD comes out like that, mostly due to the underlying system: MarkLogic does not have JSON-LD serialization natively. Kevin Ford wrote the conversion program over 10 years ago (https://github.com/kefo/rdfxq/tree/master/modules), so it was pretty new at the time it was created. But I agree, it would be great to update it to be more user friendly.

@matt @thatandromeda @hochstenbach @acka47 thanks for this Matt! I would have thought marklogic included pretty solid functionality for converting xml to usable json by now? https://docs.marklogic.com/guide/app-dev/json#id_55967

Is there an application layer between the marklogic db and the web? Or is id.loc.gov actually implemented completely in xquery, inside of marklogic?!


@edsu @thatandromeda @hochstenbach @acka47
To go from an XML doc to a JSON representation it probably can, but to go from doc + semantic triple store into a valid JSON-LD serialization there is no native way of doing it that I'm aware of.

Yep, MarkLogic is a doc DB / triple store with the application layer built in. It's all XQuery code running everything.

@matt @thatandromeda @hochstenbach @acka47 I guess what I'm suggesting is to go from XML to some kind of sane JSON, and then layer in whatever @context is needed for it to make sense as JSON-LD? The (probably offbase) assumption being that you are storing XML docs in MarkLogic?
@edsu
All the docs are in the DB, yes. I think the easiest solution is to modify the current existing conversion to produce "nicer" JSON-LD, which I think would be great, and I can definitely mention it to the team.
@matt ok great! Assuming that there are XML docs in the database, it seems like you could use existing MarkLogic support for generating JSON, and then add in whatever @context you need into that to make it JSON-LD?
@edsu yeah possibly, will need to look at the outputs and the current process.

@matt @edsu

LUX (https://lux.collections.yale.edu/) uses JSON-LD in MarkLogic with an automated extraction of the triples into the ML container, but that extraction doesn't happen natively in ML. That said, it's an easily countable number of lines of python to do it before loading, and could have been a trigger on document load within ML.
