I released three new LLM plugins this morning:
- llm-gpt4all adds 17 models from the amazing https://gpt4all.io/ project - https://github.com/simonw/llm-gpt4all
- llm-mpt30b adds the MPT-30B model (a 19GB download) - https://github.com/simonw/llm-mpt30b
- llm-palm adds support for Google's PaLM 2 model, via their API - https://github.com/simonw/llm-palm
We're coming to the end of a major digitization project to image about 100 of our Judaica manuscripts. I took the time to write out what it takes to digitize lots of manuscripts (spoiler: a lot of time and effort on the part of a lot of people) and how this project led us to hire a short-term conservator.
Image from KTIV showing Columbia's digitized manuscripts
At Columbia University Libraries, and at the Norman E. Alexander Library in particular, one of our major goals is to provide access to materials to as broad a user base as possible. With one of the largest Judaica manuscript collections in
As @miriamkp once memorably put it, "It's just awful trying to find a humanities dataset." So here are some, available as an R package, which I used to teach Data and Culture with @mlmcgill last fall:
https://github.com/agoldst/dataculture/
Discussion, with intemperate remarks about various subjects: https://andrewgoldstone.com/blog/dataculture/
Course materials, including labs in R:
https://dc22.andrewgoldstone.com/slides
People whose data and work I pirated^H^H^H^H^H reverently built on: @kjhealy @TedUnderwood @riddella and others not here