On the Road
On Tuesday, as part of the @Textplus plenary at @unigoettingen, our FID showcased a poster and gave a presentation on digital age text corpora.
For Immediate Release, April 1, 2025: University of Michigan Press will publish all of the content on Meta platforms as a series of printed books.
https://www.linkedin.com/posts/charles-watkinson-7553a257_amphibians-and-reptiles-of-the-great-lakes-activity-7312775744932179968-sLSu
#MetaPlatforms #Instagram #Facebook #ThreadsApp #Copyright #BookPublishing #TextMining #TextCorpora #WebScraping #AIethics
University of Michigan Press, a medium-sized publisher of scholarly monographs and regional studies such as Amphibians and Reptiles of the Great Lakes Region (https://lnkd.in/gS69nebW), has embarked on an ambitious new project. Over the next 94 years, the Press will stop publishing award-winning books in the humanities and social sciences. It will instead devote all its efforts to republishing all the content on Facebook, Instagram, Threads, and WhatsApp in print.

"We initially thought of asking Meta if they minded us using their copyrighted content," said Charles Watkinson, director of the Press. "But there was someone on WhatsApp who said that it would probably take, like, 4 weeks for Meta to send us all their stuff in PDF format. Also we didn't know Meta's address so we thought it would be better to take the harvesting route."

The Press is collaborating with Prestige Nail and Spa in Singapore to acquire the content. Qinfan Banquan, director of operations, preemptively apologized for potential issues. "Our bots can go a bit wild sometimes. If you find that your Instagram is not working for a few days, it's probably a rogue crawler trying to download the same post about Pink Pony Club six million times a second."

When can customers expect the Press's "Meta" Series to be available? "To be honest, we're not quite sure," said Watkinson. "We've had a bit of a problem with Amazon where all the covers we designed have been replaced by pictures of underwear. But we hope that the first of the six billion volumes will appear in a year or so."

Watkinson also noted that the project was fairly expensive for the non-profit Press. He estimated that complying with new 2025 European requirements around accessibility, environmental stewardship, research integrity, user privacy, and capybara wellness would cost the Press around $500 per book. And building the nuclear power plants that will power the servers, as well as harvesting all the forests in Michigan, demanded some new skillsets from Press staff - even though he noted they were quick learners. "Thank goodness we didn't have to compensate Meta's authors, however. Having to bother with those meat slabs would really suck."
In the last #ise2023 lecture, we tackled POS-tagging with the help of Hidden Markov Models. Again, we are leveraging probabilities gained from large text corpora (Maximum Likelihood Estimation) and making some simplifications (Markov assumption, independence assumption) to come up with approximations of the emission and transition probabilities for tag sequences.
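To make the idea concrete, here is a minimal sketch (not the lecture's own code) of how transition and emission probabilities might be estimated by relative frequency from a tagged corpus and then used for Viterbi decoding. The three-sentence toy corpus, the `<s>` start marker, and the function names `estimate` and `viterbi` are all assumptions made up for illustration. No smoothing is applied, so unseen words or tag pairs get probability zero.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker

# Toy tagged corpus, invented for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]


def estimate(tagged_sentences):
    """Maximum likelihood estimates: relative frequencies of counts."""
    trans = Counter()       # (previous tag, tag) counts
    emit = Counter()        # (tag, word) counts
    prev_total = Counter()  # how often a tag occurs as predecessor
    tag_total = Counter()   # how often a tag occurs at all
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[(prev, tag)] += 1
            prev_total[prev] += 1
            emit[(tag, word)] += 1
            tag_total[tag] += 1
            prev = tag
    # Markov assumption: a tag depends only on the previous tag.
    p_trans = {k: c / prev_total[k[0]] for k, c in trans.items()}
    # Independence assumption: a word depends only on its own tag.
    p_emit = {k: c / tag_total[k[0]] for k, c in emit.items()}
    return p_trans, p_emit


def viterbi(words, p_trans, p_emit):
    """Return the most probable tag sequence, or [] if none has p > 0."""
    tags = sorted({tag for tag, _ in p_emit})
    best = {START: (1.0, [])}  # last tag -> (probability, tag path)
    for word in words:
        new_best = {}
        for tag in tags:
            e = p_emit.get((tag, word), 0.0)
            if e == 0.0:
                continue
            prob, path = max(
                ((p * p_trans.get((prev, tag), 0.0) * e, prev_path + [tag])
                 for prev, (p, prev_path) in best.items()),
                key=lambda cand: cand[0],
            )
            if prob > 0.0:
                new_best[tag] = (prob, path)
        best = new_best
        if not best:
            return []
    return max(best.values(), key=lambda cand: cand[0])[1]


p_trans, p_emit = estimate(corpus)
print(viterbi(["the", "cat", "barks"], p_trans, p_emit))
```

On a real corpus one would typically add smoothing and work with log probabilities to avoid numerical underflow on long sentences.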
I'm a #Histodon based in Berlin (Humboldt-University) doing #DigitalHumanities.
Toots and boosts in English and German.
#AntiFascist, #queer and pro #OpenData and #OpenAccess: “Sharing isn’t immoral — it’s a moral imperative”
Aaron Swartz, Guerrilla Open Access Manifesto, 2008.
https://openbehavioralscience.org/manifesto/
My interests include #GIS, #SemanticWeb / #LinkedData, #TextCorpora —
'We have nothing to lose but our inexperience!'
I also have a blog-ish website:
https://schoeneh.eu