Paolo Papotti

@Papotti
65 Followers
87 Following
34 Posts
Professor in data science at EURECOM
Homepage: https://www.eurecom.fr/~papotti/index.html
Research Topics: data management, NLP, misinformation
Countries: Italy, France
Papers: https://scholar.google.fr/citations?user=YwoezYX7JVgJ
#ChatGPT does not expose a #SQL query interface, but if you ask very kindly, it returns some nice (real) structured data

RT @matthiasboehm7
There are two more TU Berlin / @bifoldberlin openings for data management professorships on data integration/cleaning and data science processes. Apply by Jan 15, and join the large and growing data systems team at TU Berlin:

https://tub.stellenticket.de/en/offers/157328/
https://tub.stellenticket.de/en/offers/157330/

Universitätsprofessur (university professorship, salary grade W3) - BesGr. W3 (ID: 157328, en) - Offer List - Stellenticket Technische Universität Berlin


#ChatGPT from #OpenAI can verify simple factual claims (with great explanations!), but fails on claims from the FEVEROUS (#FEVERworkshop) dataset. The two failing claims can be answered with information available in Wikipedia

It also does not answer questions about claims on some (political) topics.
#factchecking

Looking for the right venue for your work on #integrity in social networks and media?
Integrity 2023 will be co-located with #WSDM2023 (March 3rd in Singapore) and the #CFP for technical manuscripts and talk proposals is out:
https://sites.google.com/view/integrity-workshop-2023

Paper submission: 15 Jan '23

Integrity 2023: Integrity in Social Networks and Media

In our paper on crowdsourced #contentmoderation and #factchecking at #twitter, we show that regular users can be very effective and fast. More details in the #CIKM22 paper:
https://arxiv.org/pdf/2208.09214.pdf

Indeed, the #birdwatch program keeps going despite the recent changes. I wonder if #mastodon has plans for a similar initiative to fight #misinformation and #disinformation on this platform
@Gargron @Mastodon

Work led by M. Saeed during his PhD at #EURECOM on “Employing #Transformers
and Humans for Textual-Claim Verification”. He is defending today (2pm CET)! Ping me if you’d like to attend it remotely #nlproc #nlp #ML #factchecking #crowdsourcing 6/6
Finally, we steer text generation with general concepts, e.g., Affection. We generate a vector from words such as love and cheerful. Adding such a vector to prompts that generate toxic text leads to non-toxic output. 5/6
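Under illustrative assumptions (toy random vectors standing in for real PLM embeddings, and adding the concept vector to every prompt position, which is one plausible reading of the injection point), steering with a general concept like Affection reduces to simple vector arithmetic:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy embedding table for a few seed words.
emb = {w: rng.normal(size=8) for w in ["love", "cheerful", "warm"]}

def concept_vector(seed_words, emb):
    # A general concept (e.g., Affection) approximated as the mean
    # of its seed-word embeddings.
    return np.mean([emb[w] for w in seed_words], axis=0)

def steer(prompt_embs, concept, alpha=1.0):
    # Add the concept vector to the prompt embeddings before generation.
    return prompt_embs + alpha * concept

affection = concept_vector(["love", "cheerful"], emb)
prompt = rng.normal(size=(5, 8))  # 5 toy prompt tokens, 8-dim each
steered = steer(prompt, affection)
```

The `alpha` knob (an assumption, not from the thread) would let one trade off steering strength against fluency.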
(original group name in the fig. has been replaced with RelG)
We also report promising results on text generation. Adding the Year or Country vectors to a #GPT 2 prompt leads to text containing entities for such types. 4/6
Given the type embedding (TE), we simply add it to the mask token in the prompt for #factretrieval. The TE steers the inference of the target #token towards the desired type. No training, no #finetuning, just add the new embedding to the input. 3/6
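The injection step above is just an addition at the mask position. A minimal numpy sketch (names, shapes, and values are illustrative, not the paper's code):

```python
import numpy as np

def add_type_to_mask(prompt_embs, mask_idx, type_emb):
    # No training, no fine-tuning: copy the input embeddings and add
    # the type embedding (TE) to the [MASK] token's row only.
    out = prompt_embs.copy()
    out[mask_idx] += type_emb
    return out

# Toy prompt of 4 tokens with 8-dim embeddings; token 3 is [MASK].
prompt = np.ones((4, 8))
te = np.full(8, 0.5)
steered = add_type_to_mask(prompt, 3, te)
```

With a real PLM, `steered` would be fed to the model as its input embeddings in place of the usual token lookups.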
Our solution extends prompting in pre-trained #languageModels (#PLMs) to obtain a "typed" output. First, we propose to define types by example. Given "Rome, Paris, New York", we learn the #embeddings for their latent, shared concept in the PLM. In this case, the City type. 2/6
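The type-by-example step can be sketched with a toy embedding table; a simple average stands in for the learning step described in the thread, and the vocabulary, dimensions, and values here are hypothetical placeholders for a real PLM's input-embedding matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy embedding table; in practice this would be the
# pre-trained language model's input-embedding matrix.
vocab = ["Rome", "Paris", "London", "banana", "run"]
emb = {w: rng.normal(size=8) for w in vocab}

def type_embedding(example_words, emb):
    # The latent, shared concept (here: City) approximated as the
    # mean of the example embeddings -- no training involved.
    return np.mean([emb[w] for w in example_words], axis=0)

city = type_embedding(["Rome", "Paris", "London"], emb)
print(city.shape)  # (8,)
```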