Question for the digital humanities people:

Is there any good #OpenSource graphical tool for natural language processing that is both easy to use and performs a reasonable number of analyses?

I am looking for something that the average lawyer or student with a couple of weeks' training could operate.

Thanks!

#NLP #DigitalHumanities

@seanfobbe can you specify what kinds of analyses you are after? I suppose you're not referring to corpus linguistics workbenches like #voyanttools or #TXM , but more like the things you can make #spaCy do?

@seanfobbe still more on the (frequentist?) corpus linguistics side, there is an overview at https://corpus-analysis.com/.

Here is my list of usual suspects:

- http://textometrie.ens-lyon.fr/
- https://voyant-tools.org/
- https://notes.jan-oliver-ruediger.de/software/corpusexplorer-overview/ Win/Mac (German only)
- https://gate.ac.uk/ plugins for NER etc
- https://nopaque.uni-bielefeld.de/
- https://dh.chinese-empires.eu/markus/
- https://www.ims.uni-stuttgart.de/forschung/projekte/corpus-workbench/
- https://sifnos.iwm-tuebingen.de/ctap/
- https://www.butter.tools/ for Social Scientists (build pipelines with plugins)
- https://www.liwc.app/ focus on psychology with fixed but well-established metrics; academic licenses 17-140 € (1 month to 3 years); see https://www.liwc.app/demo
- https://wordseer.berkeley.edu/ shared server and running on Mac, accepts documents in XML format only (and quite old: latest commit from 2016).
- https://www.sketchengine.eu/ commercial
- https://nlp.fi.muni.cz/trac/noske limited Free and Open Fork (?) of SketchEngine
- https://www.laurenceanthony.net/software.html AntConc + Family
- https://weblicht.sfs.uni-tuebingen.de/weblichtwiki/Main_Page.html

edit: AntConc and WebLicht added

@seanfobbe Of these, Voyant Tools, Corpus Explorer, and LIWC seem (to me) to be the most user-friendly, possibly also TXM. LIWC leans toward semantics, the others toward frequentist functions and basic NLP tasks like PoS tagging, lemmatization, dependency/constituency parsing etc.

For the more NLP/semantic-y things like sentiment analysis, topic modelling and other tasks, I don't know a multi-purpose platform off the top of my head. Perhaps if you mention some of the functions you are interested in, it could ring a bell...

@anwagnerdreas @seanfobbe Wow, this is a great list! And it reminded me to check: Orange now also has quite a nice list of features for the high-level text analysis stuff like topic models and embeddings (less so PoS tagging and such): https://orangedatamining.com/widget-catalog/

@anwagnerdreas @seanfobbe Oh, I was wrong, it actually does support some basic text processing/tagging: https://orangedatamining.com/widget-catalog/text-mining/preprocesstext/

@anwagnerdreas Thank you so much, what a fantastic list! I was thinking of the more basic NLP methods like keyword-in-context, collocations, sentiment analysis, substring extraction, TF-IDF, perhaps some custom tokenization, n-grams, stemming and lemmatization, maybe some nice basic plots.

I'll go through the list to see what might fit. I vaguely remember I might have seen LIWC before and it prompted this question...
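
For anyone reading along who wants to see what two of these methods boil down to, here is a minimal Python sketch of keyword-in-context and n-gram counting using only the standard library. It is an illustration under simplifying assumptions, not how any of the tools above implement them: the function names (`tokenize`, `kwic`, `ngrams`), the regex tokenizer and the sample sentence are all made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately naive)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for every hit of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def ngrams(tokens, n=2):
    """Count all contiguous n-grams (as tuples) in the token list."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Hypothetical sample text, chosen only for illustration.
text = "The court held that the contract was void. The court dismissed the appeal."
tokens = tokenize(text)

# Print a simple concordance for the keyword "court".
for left, kw, right in kwic(tokens, "court", window=2):
    print(f"{left:>20} [{kw}] {right}")

# Print the most frequent bigram with its count.
print(ngrams(tokens, 2).most_common(1))
```

A real workflow would swap the naive tokenizer for a proper one and add stemming or lemmatization, which is exactly the part the graphical tools in this thread take off your hands.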

@seanfobbe @anwagnerdreas For that I usually teach AntConc (plus TagAnt), that's relatively straightforward.
@felwert and it's easy to install, does not need an internet connection and was explicitly designed as a free pedagogical tool (its creator made a career in teaching English in Japan, iirc). Not open source, though, if that should be important @seanfobbe @anwagnerdreas