#MarcXML encapsulated as a string in #JSON is a pain to work with, as I am finding out while trying to query the #HathiTrust APIs for digitised items, using a list of #OCLC numbers obtained via #SPARQL from #Wikidata.

Things I wish for:

1. direct access to the MARCXML from HathiTrust
2. a publicly accessible API for #Worldcat / OCLC to find holdings

#DigitalHumanities #libraries #metadata #bibliography

Over at Bluesky #HathiTrust responded:

> This may or not be helpful, but did you know you can add ".xml" to a catalog record? for example https://catalog.hathitrust.org/Record/000271011.xml

https://bsky.brid.gy/r/https://bsky.brid.gy/#bridgy-fed-dm-?-https://digitalcourage.social/users/tillgrallert-2025-12-10T19:30:02.313016+00:00

Hathi does indeed expose #MarcXML directly. This is great news! However, for this particular use case, where I have a list of OCLC numbers and am trying to find them in HathiTrust, I still need to parse the MarcXML out of the encapsulating JSON served via https://catalog.hathitrust.org/api/volumes/full/oclc/{ID}.json
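
For anyone hitting the same wall, here is a minimal Python sketch of that unwrapping step. It is not my actual pipeline, and it rests on assumptions worth checking against a real response: that the JSON has a top-level `records` object keyed by record ID, and that each record carries the MARCXML as a string under a `marc-xml` key. The function names `extract_marc_records` and `subfield_values` are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def extract_marc_records(payload, xml_key="marc-xml"):
    """Return {record_id: parsed XML root} for each record in the JSON payload.

    Assumption: the payload looks like {"records": {id: {xml_key: "<xml…>"}}}.
    Records without the key are skipped.
    """
    data = json.loads(payload)
    roots = {}
    for record_id, record in data.get("records", {}).items():
        xml_text = record.get(xml_key)
        if xml_text:
            # Encode to bytes so an XML declaration with an encoding is accepted.
            roots[record_id] = ET.fromstring(xml_text.encode("utf-8"))
    return roots


def subfield_values(root, tag, code):
    """Collect the text of all subfields `code` in datafields with the given `tag`.

    The "{*}" wildcard (Python 3.8+) matches elements with or without the
    MARC21 slim namespace, so the sketch does not depend on how it is declared.
    """
    values = []
    for field in root.findall(".//{*}datafield"):
        if field.get("tag") != tag:
            continue
        for sub in field.findall("{*}subfield"):
            if sub.get("code") == code and sub.text:
                values.append(sub.text)
    return values
```

To use it, fetch the response body for one OCLC number from the URL above (for example with `urllib.request.urlopen`), pass the decoded text to `extract_marc_records`, and then call something like `subfield_values(root, "245", "a")` on each root to pull out the title.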

2nd update: I also figured out how to apply the new-ish JSON transformations of #XSLT 3 and #XPath 3.1 to parse the encapsulated #MarcXML