Paolo Papotti

@Papotti
65 Followers
87 Following
34 Posts
Professor in data science at EURECOM
Homepage: https://www.eurecom.fr/~papotti/index.html
Research Topics: data management, NLP, misinformation
Countries: Italy, France
Papers: https://scholar.google.fr/citations?user=YwoezYX7JVgJ
#ChatGPT does not expose a #SQL query interface, but if you ask very kindly, it returns some nice (real) structured data

RT @matthiasboehm7
There are two more TU Berlin / @bifoldberlin openings for data management professorships on data integration/cleaning and data science processes. Apply by Jan 15, and join the large and growing data systems team at TU Berlin:

https://tub.stellenticket.de/en/offers/157328/
https://tub.stellenticket.de/en/offers/157330/

Universitätsprofessur (university professorship, salary grade W3) - BesGr. W3 (ID: 157328, en) - Offer List - Stellenticket Technische Universität Berlin


#ChatGPT from #OpenAI can verify simple factual claims (with great explanations!), but fails on claims from the FEVEROUS (#FEVERworkshop) dataset. The two failing claims can be answered with information available in Wikipedia

It also does not answer questions about claims on some (political) topics.
#factchecking

Looking for the right venue for your work on #integrity in social networks and media?
Integrity 2023 will be co-located with #WSDM2023 (March 3rd in Singapore) and the #CFP for technical manuscripts and talk proposals is out:
https://sites.google.com/view/integrity-workshop-2023

Paper submission: 15 Jan '23

Integrity 2023: Integrity in Social Networks and Media

In our paper on crowdsourced #contentmoderation and #factchecking at #twitter, we show that regular users can be very effective and fast. More details in the #CIKM22 paper:
https://arxiv.org/pdf/2208.09214.pdf

Indeed, the #birdwatch program keeps going despite the recent changes. I wonder if #mastodon has plans for a similar initiative to fight #misinformation and #disinformation on this platform
@Gargron @Mastodon

Work led by M. Saeed during his PhD at #EURECOM on “Employing #Transformers
and Humans for Textual-Claim Verification”. He is defending today (2pm CET)! Ping me if you’d like to attend it remotely #nlproc #nlp #ML #factchecking #crowdsourcing 6/6
Finally, we steer text generation with general concepts, e.g., Affection. We generate a vector from words such as love and cheerful. Adding such a vector to prompts that generate toxic text leads to non-toxic output. 5/6
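Under illustrative assumptions (toy random vectors standing in for real PLM embeddings, and adding the concept vector to every prompt position, which is one plausible reading of the injection point), steering with a general concept like Affection reduces to simple vector arithmetic:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy embedding table for a few seed words.
emb = {w: rng.normal(size=8) for w in ["love", "cheerful", "warm"]}

def concept_vector(seed_words, emb):
    # A general concept (e.g., Affection) approximated as the mean
    # of its seed-word embeddings.
    return np.mean([emb[w] for w in seed_words], axis=0)

def steer(prompt_embs, concept, alpha=1.0):
    # Add the concept vector to the prompt embeddings before generation.
    return prompt_embs + alpha * concept

affection = concept_vector(["love", "cheerful"], emb)
prompt = rng.normal(size=(5, 8))  # 5 toy prompt tokens, 8-dim each
steered = steer(prompt, affection)
```

The `alpha` knob (an assumption, not from the thread) would let one trade off steering strength against fluency.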
(original group name in the fig. has been replaced with RelG)
We also report promising results on text generation. Adding the Year or Country vectors to a #GPT 2 prompt leads to text containing entities for such types. 4/6
Given the type embedding (TE), we simply add it to the mask token in the prompt for #factretrieval. The TE steers the inference of the target #token towards the desired type. No training, no #finetuning, just add the new embedding to the input. 3/6
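The injection step above is just an addition at the mask position. A minimal numpy sketch (names, shapes, and values are illustrative, not the paper's code):

```python
import numpy as np

def add_type_to_mask(prompt_embs, mask_idx, type_emb):
    # No training, no fine-tuning: copy the input embeddings and add
    # the type embedding (TE) to the [MASK] token's row only.
    out = prompt_embs.copy()
    out[mask_idx] += type_emb
    return out

# Toy prompt of 4 tokens with 8-dim embeddings; token 3 is [MASK].
prompt = np.ones((4, 8))
te = np.full(8, 0.5)
steered = add_type_to_mask(prompt, 3, te)
```

With a real PLM, `steered` would be fed to the model as its input embeddings in place of the usual token lookups.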
Our solution extends prompting in pre-trained #languageModels (#PLMs) to obtain a "typed" output. First, we propose to define types by example. Given "Rome, Paris, New York", we learn the #embeddings for their latent, shared concept in the PLM. In this case, the City type. 2/6
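The type-by-example step can be sketched with a toy embedding table; a simple average stands in for the learning step described in the thread, and the vocabulary, dimensions, and values here are hypothetical placeholders for a real PLM's input-embedding matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy embedding table; in practice this would be the
# pre-trained language model's input-embedding matrix.
vocab = ["Rome", "Paris", "London", "banana", "run"]
emb = {w: rng.normal(size=8) for w in vocab}

def type_embedding(example_words, emb):
    # The latent, shared concept (here: City) approximated as the
    # mean of the example embeddings -- no training involved.
    return np.mean([emb[w] for w in example_words], axis=0)

city = type_embedding(["Rome", "Paris", "London"], emb)
print(city.shape)  # (8,)
```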