#QualitativeData #textanalysis #software #research #academia #academicchatter #opensource
Discover google/langextract, a game-changing Python library for extracting insights from unstructured text with LLMs! #LLM #NLP #TextAnalysis
The google/langextract library leverages Large Language Models (LLMs) to extract structured information from unstructured text, enabling precise source grounding and interactive visualization capabilities. By utilizing Python, developers can integrate this...
#google/langextract #LargeLanguageModels #NaturalLanguageProcessing #PythonLibraries
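To make "source grounding" concrete: the idea is that every extracted item points back to the exact characters it came from. The sketch below illustrates that idea only and does not use the langextract API. The names `GroundedSpan` and `ground` are hypothetical, and the model output is hard-coded so the example runs offline.

```python
from dataclasses import dataclass


@dataclass
class GroundedSpan:
    label: str  # kind of item extracted, e.g. "medication"
    text: str   # extracted string, exactly as it appears in the source
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


def ground(source: str, extractions: list[tuple[str, str]]) -> list[GroundedSpan]:
    """Attach character offsets to (label, text) pairs.

    Pairs whose text does not occur verbatim in the source are dropped,
    since they cannot be traced back to it.
    """
    spans = []
    cursor = 0
    for label, text in extractions:
        pos = source.find(text, cursor)
        if pos == -1:
            pos = source.find(text)  # fall back to searching from the start
        if pos == -1:
            continue  # not found verbatim: treat as ungrounded
        spans.append(GroundedSpan(label, text, pos, pos + len(text)))
        cursor = pos + len(text)
    return spans


source = "The patient was given 250 mg of amoxicillin on Monday."
# Hypothetical model output (an assumption for illustration only).
model_output = [
    ("dosage", "250 mg"),
    ("medication", "amoxicillin"),
    ("medication", "ibuprofen"),  # never mentioned in the source
]
spans = ground(source, model_output)
for span in spans:
    print(span.label, repr(source[span.start:span.end]), span.start, span.end)
```

Dropping items that cannot be located verbatim is one simple way to flag output the model may have invented. A real tool would likely need fuzzier matching than `str.find`.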
#5WAnalysis #5W #textanalysis #phânTíchVanBản #5YếuTố
Need to identify the 5 elements (Who, What, Where, When, Why) when analyzing a text? Looking for a tool that analyzes logically, does not speculate, and automatically finds sources online for verification. What method do you usually use?
(None: the original post gives no specifics about a platform or real-world examples)
https://www.reddit.com/r/LocalLLaMA/comments/1p80l7x/how_to_analyse_text_for_5w/
📚 Qualitative analysis for large volumes of text with LLMs
Use #AI to power up your qualitative text analysis in #RStats through the Gemini API.
With Ismael Aguayo and Exequiel Trujillo
📅 Dec 2, 14:00–16:15 (UTC-3) – Online
💵 Students USD 5 · Academics USD 10 · Industry USD 15
🔗 https://www.eventbrite.com.ar/e/1962623150676
Wow! #QualiService could be a great resource!
It wasn't obvious to me how to find the transcripts for these doctor-patient interaction data from 4 countries, but if such transcripts are accessible, that's GREAT!
🇪🇺 Want to analyze text from the EU public consultations? These consultations are how the EU invites the broader public to comment on upcoming legislation.
📦 I just published a first version of a Python package {eu-consultations} to scrape and extract text from the EU website:
https://github.com/marioangst/eu_consultations
- download consultation data as displayed on the EU's frontend into a validated form
- download associated files (this is the hard part about analysing this data - lots of feedback is in .docx and .pdf files)
- extract text from the files using docling and attach to feedback
You get all data in validated form, optionally stored in huge (sorry for that) JSON files ;).
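The three steps in the list (fetch feedback, fetch attachments, attach the extracted text) can be sketched as a small data model. This is an illustration under stated assumptions and not the eu_consultations API: `Feedback`, `Attachment`, `attach_text` and `fake_extract` are hypothetical names, and the extractor is a stand-in for a real document converter such as docling.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class Attachment:
    filename: str
    text: str = ""  # filled in once text extraction has run


@dataclass
class Feedback:
    feedback_id: int
    body: str
    attachments: list[Attachment] = field(default_factory=list)

    def __post_init__(self):
        # Minimal validation; a real pipeline would check far more.
        if not isinstance(self.feedback_id, int):
            raise TypeError("feedback_id must be an int")
        if not isinstance(self.body, str):
            raise TypeError("body must be a str")


def attach_text(feedback: Feedback, extract: Callable[[str], str]) -> Feedback:
    """Run the extractor over each attachment and store the result on it."""
    for attachment in feedback.attachments:
        attachment.text = extract(attachment.filename)
    return feedback


def fake_extract(filename: str) -> str:
    # Stand-in for a real converter that reads .docx or .pdf files.
    return f"text extracted from {filename}"


item = Feedback(1, "We support the proposal.", [Attachment("position.pdf")])
attach_text(item, fake_extract)
serialized = json.dumps(asdict(item), indent=2)
print(serialized)
```

Passing the extractor in as a function keeps the data model independent of any particular conversion library, so the slow file-parsing step can be swapped or mocked.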
This package is part of an analysis project on the feedback the EU has received on digital policy through its public consultation process. We plan to present that project later this year, but I thought we might as well open-source some of our tools well before then.
Useful contribution to discussions in this area, for sure! The results highlight "whether an automated approach that would still require micromanaging and adjusting several variables by the human researcher would, in fact, be more efficient an approach compared to the same tasks performed manually by human labour"
Out of Context! Managing the Limitations of Context Windows in #ChatGPT-4o Text Analyses https://doi.org/10.46298/jdmdh.15090 #DigitalHumanities #TextAnalysis #LLM #ArtificialIntelligence #GLAMR
In recent years, large language model (LLM) applications have surged in popularity, and academia has followed suit. Researchers frequently seek to automate text annotation – often a tedious task – and, to some extent, text analysis. Notably, popular LLMs such as ChatGPT have been studied as both research assistants and analysis tools, revealing several concerns regarding transparency and the nature of AI-generated content. This study assesses ChatGPT’s usability and reliability for text analysis – specifically keyword extraction and topic classification – within an “out-of-the-box” zero-shot or few-shot context, emphasizing how the size of the context window and varied text types influence the resulting analyses. Our findings indicate that text type and the order in which texts are presented both significantly affect ChatGPT’s analysis. At the same time, context-building tends to be less problematic when analyzing similar texts. However, lengthy texts and documents pose serious challenges: once the context window is exceeded, “hallucinated” results often emerge. While some of these issues stem from the core functioning of LLMs, some can be mitigated through transparent research planning.
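One common way to keep a long document inside a context window is to split it into overlapping chunks and analyze each separately. The abstract does not say the study used this method, so the sketch below is only an illustration. `chunk_words` is a hypothetical helper, and counting words is a crude stand-in for counting tokens, which depends on the model's tokenizer.

```python
def chunk_words(text: str, max_words: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that context at the
    boundaries is not lost entirely.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
for chunk in chunk_words(sample, max_words=4, overlap=1):
    print(chunk)
```

Each chunk would then be sent in a fresh conversation, which also sidesteps the order effects the abstract reports, at the cost of losing any context the model builds across texts.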