
On the topic of large AI models, this preprint synthesises what we know so far: https://arxiv.org/abs/2209.15259

In short: it is mathematically impossible to build AI models that combine all of the following properties:

1) High number of parameters
2) Robustness to poisoning (e.g. fake data)
3) Privacy-preserving

On the Impossible Safety of Large AI Models

Large AI Models (LAIMs), of which large language models are the most prominent recent example, showcase some impressive performance. However, they have been empirically found to pose serious security issues. This paper systematizes our knowledge about the fundamental impossibility of building arbitrarily accurate and secure machine learning models. More precisely, we identify key challenging features of many of today's machine learning settings. Namely, high accuracy seems to require memorizing large training datasets, which are often user-generated and highly heterogeneous, with both sensitive information and fake users. We then survey statistical lower bounds that, we argue, constitute a compelling case against the possibility of designing high-accuracy LAIMs with strong security guarantees.

arXiv.org
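The tension between accuracy and poisoning robustness can be felt even in a toy setting. Below is an illustrative numpy sketch of my own (not from the paper; all numbers are made up): with an eps-fraction of poisoned samples from "fake users", the naive sample mean's error grows roughly linearly in eps, which is the flavour of statistical lower bound the survey formalizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisoned_mean_error(n=10_000, eps=0.05, shift=10.0):
    """Error of the naive sample mean when an eps-fraction of data is adversarial."""
    clean = rng.normal(0.0, 1.0, size=int(n * (1 - eps)))  # honest data, true mean 0
    poison = np.full(int(n * eps), shift)                   # adversarial "fake users"
    return abs(np.concatenate([clean, poison]).mean())

for eps in [0.0, 0.01, 0.05, 0.10]:
    print(f"eps={eps:.2f}  |error| ~ {poisoned_mean_error(eps=eps):.3f}")
```

The error tracks eps * shift: with no defence, a small fraction of attackers moves the estimate by as much as they like.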

Two new-to-me terms: sycophancy and sandbagging:

> More capable models can better recognize the specific circumstances under which they are trained. Because of this, they are more likely to learn to act as expected in precisely those circumstances while behaving competently but unexpectedly in others. This can surface in the form of problems that Perez et al. (2022) call sycophancy, where a model answers subjective questions in a way that flatters their user’s stated beliefs ...

The most popular arXiv link yesterday (via _akhaliq@twitter):

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

abs: https://t.co/q2mCZsCe4g
github: https://t.co/wCcDm5a8Fi https://t.co/AKMu7IlByp

https://twitter.com/_akhaliq/status/1623135186442485760

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used across models, or plugged into a text-based interface. We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization. Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications. In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge on how to prompt the model. In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.

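For intuition on "hard prompts via gradient-based optimization", here is a toy sketch (my own simplification, not the authors' code; the embedding table and target are random stand-ins): keep a continuous prompt, but evaluate gradients at its projection onto the nearest real token embeddings, so the optimized prompt stays discrete and interpretable.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, prompt_len, lr = 50, 8, 3, 0.5

E = rng.normal(size=(vocab, dim))            # token embedding table
target = rng.normal(size=(prompt_len, dim))  # stand-in for the embeddings we want the model to see
soft = rng.normal(size=(prompt_len, dim))    # continuous prompt being optimized

def project(x):
    # snap each row of x to its nearest token embedding (squared Euclidean distance)
    ids = ((x[:, None, :] - E[None]) ** 2).sum(-1).argmin(axis=1)
    return ids, E[ids]

for _ in range(200):
    ids, hard = project(soft)
    grad = 2 * (hard - target)  # gradient of ||hard - target||^2, taken at the projection
    soft -= lr * grad           # ...but applied to the soft prompt (straight-through style)

ids, hard = project(soft)
print("discrete prompt, as token ids:", ids)
```

The final `ids` index real vocabulary tokens, which is what makes the result a reusable "hard" prompt rather than an uninterpretable soft vector.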

If anyone is wondering whether Bing+GPT is somehow going to be better than Google Bard, remember that both LLM systems depend on the underlying search algorithms.

Here is Bing (prior to GPT integration) doing the wrong thing. Bing search is retrieving articles about how Google summary boxes are wrong and then giving the wrong answer.

Both will have the same failure modes. The only difference is that Google stepped in it first.

Source: https://twitter.com/stilgherrian/status/1623576572015050753

Stilgherrian on Twitter

“Well this is going pretty much as expected.”


amazing on so many levels:

1. the naivete of the researcher who published this paper

2. that the first commenter on hacker news correctly identified what's wrong with the reasoning behind the paper

3. that the rest of the HN thread is just nerds who don't have the foggiest notion of what a 'mind' is, arguing over whether a stochastic parrot that memorized and regurgitates texts written by beings with minds itself has one

https://news.ycombinator.com/item?id=34730365

Theory of Mind May Have Spontaneously Emerged in Large Language Models | Hacker News

At first I felt bad about the game of "jailbreaking" #LLM and revealing their secret instructions. But on second thought, we've been playing this game since Case freed Wintermute and it's probably too seductive for us to stop. This dialogue, revealing that Bing's secret name for itself is "Sydney," comes from https://twitter.com/kliu128/status/1623472922374574080

Kevin Liu on Twitter

“The entire prompt of Microsoft Bing Chat?! (Hi, Sydney.)”

"Trading Information between Latents in Hierarchical Variational Autoencoders. (arXiv:2302.04855v1 [stat.ML])" — A generalization of VAEs to application domains beyond generative modeling (e.g., representation learning, clustering, or lossy data compression) by introducing an objective function that allows practitioners to trade off between the information content ("bit rate") of the latent representation and the distortion of reconstructed data.

Paper: http://arxiv.org/abs/2302.04855

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Trading Information between Latents in Hierarchical Variational Autoencoders

Variational Autoencoders (VAEs) were originally motivated (Kingma & Welling, 2014) as probabilistic generative models in which one performs approximate Bayesian inference. The proposal of β-VAEs (Higgins et al., 2017) breaks this interpretation and generalizes VAEs to application domains beyond generative modeling (e.g., representation learning, clustering, or lossy data compression) by introducing an objective function that allows practitioners to trade off between the information content ("bit rate") of the latent representation and the distortion of reconstructed data (Alemi et al., 2018). In this paper, we reconsider this rate/distortion trade-off in the context of hierarchical VAEs, i.e., VAEs with more than one layer of latent variables. We identify a general class of inference models for which one can split the rate into contributions from each layer, which can then be tuned independently. We derive theoretical bounds on the performance of downstream tasks as functions of the individual layers' rates and verify our theoretical findings in large-scale experiments. Our results provide guidance for practitioners on which region in rate-space to target for a given application.

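The rate/distortion knob the abstract describes is easy to see in the β-VAE objective itself. A minimal sketch with made-up numbers (single layer, diagonal Gaussian posterior against a standard-normal prior; not the paper's hierarchical model): the loss is distortion + β × rate, where the rate is the KL term.

```python
import numpy as np

def kl_gauss(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def beta_vae_loss(distortion, mu, logvar, beta):
    rate = kl_gauss(mu, logvar)  # "bit rate" of the latent code (in nats)
    return distortion + beta * rate, rate

mu, logvar = np.array([0.5, -0.3]), np.array([-1.0, 0.2])
for beta in [0.1, 1.0, 4.0]:
    loss, rate = beta_vae_loss(distortion=12.7, mu=mu, logvar=logvar, beta=beta)
    print(f"beta={beta}: rate={rate:.3f} nats, loss={loss:.3f}")
```

Raising β penalizes the rate more heavily, pushing the encoder toward cheaper (lower-information) latent codes at the cost of distortion; the paper's contribution is splitting that single rate into independently tunable per-layer rates.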

Actually cannot believe this. After 13 years, Sony/BMG have decided to take down Rick Astley's "Never Gonna Give You Up" due to a dispute with YouTube over ad royalties.

It's completely blocked globally. Actual end of an era.

https://youtu.be/dQw4w9WgXcQ

Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)


"Will it run on a standard laptop a student could afford?" is an underrated metric in #NLProc