Back after lunch with Peter Leonard & Lindsay King, Stanford University Library – Beyond ChatGPT: Transformers Models for Collections

Chat-based interfaces are too popular to ignore as ways of interacting with cultural material (even if we don't like the ChatGPT turn). So how can we make it better in our CHI context, and help people understand what it can and can't do? #FF24 #FF2024 #AI4LAM

While I remember, Grant's NFSA presentation #FF2024 on granular access controls for AV records and transcriptions reminded me of my current favourite example, https://museumdata.uk

My notes from their launch https://www.openobjects.org.uk/2024/09/notes-from-the-museum-data-service-launch/ #FF24 #AI4LAM

Museum Data Service

A free new service that aims to bring together and share all the object records of all UK museums.

Two of Lindsay's examples of 'evocative search' using CLIP - overcast, waves - gorgeous ways of accessing images via clustering #FF24 #FF2024 #AI4LAM

Now Peter's advocating for experiments with RAG - retrieval-augmented generation - slightly more accurate ChatGPT
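
For anyone new to RAG, here is a minimal sketch of the idea, not what Peter showed: retrieve the collection records most relevant to a question, then paste them into the prompt so the model answers from them. Everything in it is an assumption for illustration: the three catalogue records are invented, the function names (`retrieve`, `build_prompt`) are hypothetical, and simple word-overlap cosine similarity stands in for the embedding search a real system would likely use. The final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue records, invented for illustration only.
RECORDS = [
    "Oil painting of waves breaking on a rocky shore under an overcast sky.",
    "Photograph of a steam locomotive at a railway station.",
    "Watercolour sketch of a market square with fruit stalls.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, records, k=2):
    """Return the k records whose wording best overlaps the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        records,
        key=lambda r: cosine(q, Counter(tokenize(r))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, records, k=2):
    """Assemble the augmented prompt: retrieved records, then the question."""
    context = "\n".join(f"- {r}" for r in retrieve(question, records, k))
    return (
        "Answer using only the collection records below.\n"
        f"Records:\n{context}\n"
        f"Question: {question}"
    )


question = "Which works show waves under an overcast sky?"
prompt = build_prompt(question, RECORDS)
print(prompt)  # this text would then be sent to a language model
```

The point of the design is that the model is asked to answer from the retrieved records rather than from memory, which is where the 'slightly more accurate' comes from.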

Sharing some links for my #FF2024 talk on our survey of projects seeking to integrate enriched data with GLAM systems - an earlier poster with some results https://collectivewisdomproject.org.uk/dh2024-poster-treasures-on-an-island/ and background on our survey https://collectivewisdomproject.org.uk/survey-integrating-volunteer-and-ai-enriched-metadata-into-collections-systems/ #FF24 #AI4LAM

Lindsay and Peter sharing examples with multimodal models Florence-2; LLaVA, LLaVA-NEXT; mPLUG-Owl2

Lindsay notes, 'Some models pad their text without adding more information, like a kid writing a report'

#FF2024 #FF24 #AI4LAM

Peter gives an example of the ability of a model to describe visual cues - it's worth reading his prompt for a description of Apple executives #FF24 #FF2024 #AI4LAM

Now it's @joshuatj Joshua Ng, Archives New Zealand – Exploring machine learning to transform Archives New Zealand’s digital services for agencies #FF24 #FF2024 #AI4LAM

Got some funds for a proof of concept to streamline the archival appraisal process, identify material of importance to communities. Worked with two govt ministries and Microsoft and AWS

@joshuatj Proof of concept in 4 parts - planning including scoping, setting goals, working with partners; preparing the test environments (data sovereignty); preparing the data including identifying a dataset and pre-curation; solution building and testing by technology partners

75-88% accuracy identifying correct 'disposal' class; were also able to find records of interest to match Maori subject headings #FF2024 #FF24 #AI4LAM

@joshuatj closes by saying that Archives New Zealand are looking for partners to continue developing their approach - get in touch! #FF24 #FF2024 #AI4LAM

We have a (re)Peter as Peter L gives Peter Broadwell et al's paper on 'Applying Advances in Person Detection and Action Recognition AI to Enhance Indexing and Discovery of Video Cultural Heritage Collections' - Machine Intelligence for Motion Exegesis (MIME) - a visual corollary to 'distant reading'. Could be used to track changes during the rehearsal process and more #FF2024 #FF24 #AI4LAM

Now Sydney Shep, Kirsten Thorpe in conversation with Roxanne Missingham. Sydney references data sovereignty (image).

Asked about AI, Sydney says technology should not be the driver, it should start with community needs. Kirsten references conversations about distributed community archives; a moment of reckoning for GLAM #FF2024 #AI4LAM #FF24

Also a decentering of power. The reparative description process is important but also lets GLAMs ignore their culpability in colonial projects. Use technology to bring light (like Honiana Love's 'technology as a lens to see more richly') #FF2024 #AI4LAM #FF24

#FF2024 Benjamin Lee and Andrew Dean then Kath Bode on 'Reanimating and Reinterpreting the Archive with AI: Unifying Scholarship and Practice'

Ben and Andrew presented the computer poetry of J M Coetzee

Kath's attempt to use word embeddings to understand Irishness in Australian lit was frustrated by bad OCR (familiar!) and 'perseverations' in Llama-3 - adding text from other prompts into 'corrected' OCR

#FF24 #AI4LAM

#FF2024 Kath Bode didn't find that OCR quality improved as paper quality and typography improved in more modern newspapers (contra some @LivingWithMachines findings)

(Newspapers are so awesome and still *so* hard to work with!)

#FF24 #AI4LAM

@mia @LivingWithMachines My semi regular refrain, OCR is not a solved problem! #AI4LAM #FF2024
@mia I assume it's a typo, but I absolutely agree that everything should start with "community nerds" 🤓
@petrichor 🫣🤣😆
@mia thanks for the toots!
@joshuatj it seemed only fair to share what the other main conference poster was saying!