#DH2025: Xiaoyan Yang & @fpianz present a #multilingual #CLS pipeline comparing #English & #Chinese #fanfiction. Using #BookNLP, #ARMCC & #LLMs, they address #CoreferenceResolution, dialogue speaker & character feature extraction 🔥 hot topics at this year's conference. 📚🌏
Mélanie-Becquet et al. introduce BookNLP-fr, a French adaptation of the #BookNLP pipeline by @dbamman et al., enhancing #genre classification with interpretable features and expanding tools for #distantreading in #CLS. https://doi.org/10.48694/jcls.3924 #CCLS24 #NLP #DigitalHumanities #DH
BookNLP-fr, the French Versant of BookNLP. A Tailored Pipeline for 19th and 20th Century French Literature
This paper presents BookNLP-fr: the adaptation to French of BookNLP, an existing NLP pipeline tailored for literary texts in English. We provide an overview of the challenges involved in the adaptation of such a pipeline to a new language: from the challenges related to data annotation up to the development of specialized modules of entity recognition and coreference. Moving beyond the technical aspects, we explore practical applications of BookNLP-fr with a canonical task for computational literary studies: subgenre classification. We show that BookNLP-fr provides more relevant and – even more importantly – more interpretable features to perform automatic subgenre classification than the traditional bag-of-words approach. BookNLP-fr makes NLP techniques available to a larger public and constitutes a new toolkit to process large numbers of digitized books in French. This allows the field to gain a deeper literary understanding through the practice of distant reading.
Journal of Computational Literary Studies
This week, we announce another article from JCLS 3 (1): Mélanie-Becquet et al. "BookNLP-fr, the French Versant of BookNLP. A Tailored Pipeline for 19th and 20th Century French Literature" (10.48694/jcls.3924). Check it out at:
https://jcls.io/issue/109/info/ #CLS #JCLS #DigitalHumanities #CCLS24 #BookNLP
Journal of Computational Literary Studies | Issue: 1(3) (2024)