In the very last #ISE2024 lecture, we were discussing the limits of machine learning and #AI. The "Paperclip Maximizer Problem" is one of the dystopian scenarios that might occur if final goals and instrumental goals diverge...

#dystopian #generativeAI #llms @fizise @fiz_karlsruhe #aiart

In our last #ISE2024 lecture before the summer break, we were focusing on artificial neural networks and deep learning. Their foundations already date back to the 1940s, when Warren McCulloch and Walter Pitts presented a mathematical model of the neuron.

W. McCulloch, W. Pitts (1943). A Logical Calculus of Ideas Immanent in Nervous Activity. Bull. of Math. Biophysics. 5 (4): 115–133. https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCulloch.and.Pitts.pdf

#neuralnetwork #AI @fizise @fiz_karlsruhe @sourisnumerique @enorouzi
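The McCulloch-Pitts model can be sketched in a few lines: a binary threshold unit that fires iff the sum of its excitatory inputs reaches a threshold and no inhibitory input is active (a simplified reading of the 1943 paper, not their original notation).

```python
# Minimal sketch of a McCulloch-Pitts neuron: binary inputs, a firing
# threshold, and absolute inhibition (any inhibitory signal vetoes firing).
def mcculloch_pitts(inputs, inhibitory, threshold):
    """inputs/inhibitory: lists of 0/1 values; threshold: int."""
    if any(inhibitory):                      # absolute inhibition
        return 0
    return 1 if sum(inputs) >= threshold else 0

# With threshold 2, the unit computes logical AND of two inputs;
# with threshold 1, logical OR:
assert mcculloch_pitts([1, 1], [], 2) == 1
assert mcculloch_pitts([1, 0], [], 2) == 0
assert mcculloch_pitts([0, 1], [], 1) == 1
```

Such threshold units can be wired into networks computing any Boolean function, which is what made the model foundational for artificial neural networks.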

Summarizing our very brief #HistoryOfAI, which was published here for several weeks in a series of toots, let's have a look at the popularity dynamics of symbolic vs. subsymbolic AI, put into perspective with historical AI heydays and winters via the Google Ngram Viewer.
https://books.google.com/ngrams/graph?content=ontology%2Cneural+network%2Cmachine+learning%2Cexpert+system&year_start=1955&year_end=2022&corpus=en&smoothing=3&case_insensitive=false

#ISE2024 #AI #ontologies #machinelearning #neuralnetworks #llms @fizise @sourisnumerique @enorouzi #semanticweb #knowledgegraphs


Google Ngrams: ontology, neural network, machine learning, expert system, 1955-2022

We are recruiting for the position of a PhD/Junior Researcher or PostDoc/Senior Researcher with focus on knowledge graphs and large language models connected to applications in the domains of cultural heritage & digital humanities.

More info: https://www.fiz-karlsruhe.de/en/stellenanzeigen/phdjunior-researcher-oder-postdocsenior-researcher-wmx-0

Join our @fizise research team at @fiz_karlsruhe
@tabea @sashabruns @MahsaVafaie @GenAsefa @enorouzi @sourisnumerique @heikef #knowledgegraphs #llms #generativeAI #culturalHeritage #dh #joboffer #AI #ISE2024 #PhD #ISWS2024


In 2022, with the advent of ChatGPT, large language models and AI in general gained unprecedented popularity. ChatGPT combined InstructGPT, a GPT-3 model complemented and fine-tuned with reinforcement learning from human feedback, Codex's text2code capabilities, and a massive engineering effort.

N. Lambert, et al. (2022). Illustrating Reinforcement Learning from Human Feedback (RLHF). https://huggingface.co/blog/rlhf

#HistoryOfAI #AI #ISE2024 @fizise @sourisnumerique @enorouzi #llm #gpt #llms
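The reward-modeling step at the heart of RLHF (as described in the linked blog post) can be illustrated with its common pairwise objective; this is a toy sketch with scalar rewards, not OpenAI's implementation:

```python
import math

# RLHF reward modeling: fit a scalar reward so that the human-preferred
# response scores higher. A common pairwise loss is
#   -log(sigmoid(r_chosen - r_rejected)).
def pairwise_rm_loss(r_chosen: float, r_rejected: float) -> float:
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the preferred response's reward pulls ahead:
assert pairwise_rm_loss(2.0, 0.0) < pairwise_rm_loss(0.5, 0.0)
```

The trained reward model then scores candidate responses during the policy-optimization stage (e.g., with PPO).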


Higher, faster, farther... in 2021, generative AI gained momentum with the advent of DALL-E, a GPT-3-based zero-shot text2image model, and other major milestones, e.g., GitHub Copilot, OpenAI Codex, WebGPT, and Google LaMDA.

Codex: Chen, M., et al. (2021). Evaluating Large Language Models Trained on Code, https://arxiv.org/abs/2107.03374
DALL-E: Ramesh, A., et al. (2021). Zero-Shot Text-to-Image Generation, https://arxiv.org/abs/2102.12092

#HistoryOfAI #AI #ISE2024 @fizise @sourisnumerique @enorouzi #llm #gpt

Evaluating Large Language Models Trained on Code

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.


In 2020, GPT-3 was released by OpenAI, trained on 45TB of data crawled from the web. A "data quality" predictor was trained to boil the training data down to 570GB of "high quality" data. In-context learning from the prompt (few-shot learning) was also introduced.

T. B. Brown et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020, pp. 1877–1901. https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf

#HistoryOfAI #AI #ISE2024 #llms #gpt #lecture @enorouzi @sourisnumerique @fizise
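"Learning from the prompt" means conditioning the model on a few input/output demonstrations instead of fine-tuning its weights; the task is inferred in-context. A hypothetical illustration of such a few-shot prompt (the translation task and format are made up for this example):

```python
# Few-shot prompting: demonstrations followed by the query; no weight
# updates happen, the model infers the task from the prompt alone.
examples = [("cheese", "fromage"), ("dog", "chien")]
query = "cat"

prompt = "Translate English to French.\n"
prompt += "".join(f"{en} => {fr}\n" for en, fr in examples)
prompt += f"{query} =>"

print(prompt)
```

The GPT-3 paper compares zero-, one-, and few-shot variants of exactly this kind of prompt across many tasks.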

In 2019, OpenAI released GPT-2 as a direct scale-up of GPT, comprising 1.5B parameters and trained on 8M web pages.

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners.
https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
OpenAI blog post: https://openai.com/index/better-language-models/
GPT-2 on HuggingFace: https://huggingface.co/openai-community/gpt2

#HistoryOfAI #AI #llm #ISE2024 @fizise @enorouzi @sourisnumerique #gpt

In 2018, Generative Pre-trained Transformers (GPT, by OpenAI) and Bidirectional Encoder Representations from Transformers (BERT, by Google) were introduced.

Radford, A., et al. (2018). Improving Language Understanding by Generative Pre-Training, https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

J. Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL-HLT 2019, https://aclanthology.org/N19-1423

#HistoryOfAI #ISE2024 #AI #llm @fizise @enorouzi @sourisnumerique

In 2014, attention mechanisms were introduced by Bahdanau, Cho, and Bengio, allowing models to selectively focus on specific parts of the input. In 2017, the Transformer model, introduced by Ashish Vaswani et al., followed; it learns to encode and decode sequential information and is especially effective for tasks like machine translation and #NLP.

Attention: https://arxiv.org/pdf/1409.0473
Transformers: https://arxiv.org/pdf/1706.03762

#HistoryOfAI #AI #ISE2024 @fizise @sourisnumerique @enorouzi #transformers
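The core of the Transformer, scaled dot-product attention, fits in a few lines of NumPy; a minimal sketch (single head, no masking or projections):

```python
import numpy as np

# Scaled dot-product attention (Vaswani et al., 2017): each query is
# compared to all keys, the softmax weights decide which parts of the
# input to focus on, and the output is the weighted sum of the values.
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over keys
    return weights @ V, weights

Q = np.array([[1.0, 0.0]])                                 # one query
K = np.array([[1.0, 0.0], [0.0, 1.0]])                     # two keys
V = np.array([[10.0, 0.0], [0.0, 10.0]])                   # their values
out, w = attention(Q, K, V)
# The query matches the first key best, so it receives the larger weight.
assert w[0, 0] > w[0, 1]
```

Bahdanau-style attention computes the relevance scores with a small feed-forward network over an RNN's hidden states instead, but the focus-via-weighted-sum idea is the same.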