Sai Prasanna

@saiborg@sigmoid.social
78 Followers
162 Following
108 Posts

🤖 = 42

M.Sc. CS @ Uni Freiburg | Previously ML Engineer @ Agara, Zoho

Interests: #robotics #reinforcementlearning #causality #AI #generalization #philosophy #anime #meditation #vegan

Blog: saiprasanna.in
Twitter: @sai_prasanna
Github: https://github.com/sai-prasanna
Scholar: https://scholar.google.co.in/citations?user=ZiB7SdEAAAAJ&hl=en

MotherNet: A Foundational Hypernetwork for Tabular Classification

by Andreas Müller, Carlo Curino, Raghu Ramakrishnan
https://arxiv.org/abs/2312.08598

Crazy paper: it introduces a meta-trained transformer that performs in-context learning of the weights of a small MLP from a numerical tabular training set passed as the 'prompt' to the big transformer.

A kind of TabPFN, but with very fast inference.

MotherNet: Fast Training and Inference via Hyper-Network Transformers

Foundation models are transforming machine learning across many modalities, with in-context learning replacing classical model training. Recent work on tabular data hints at a similar opportunity to build foundation models for classification of numerical data. However, existing meta-learning approaches cannot compete with tree-based methods in terms of inference time. In this paper, we propose MotherNet, a hypernetwork architecture trained on synthetic classification tasks that, once prompted with a never-seen-before training set, generates the weights of a trained "child" neural network by in-context learning using a single forward pass. In contrast to most existing hypernetworks, which are usually trained for relatively constrained multi-task settings, MotherNet can create models for multiclass classification on arbitrary tabular datasets without any dataset-specific gradient descent. The child network generated by MotherNet outperforms neural networks trained using gradient descent on small datasets, and is comparable to predictions by TabPFN and standard ML methods like Gradient Boosting. Unlike a direct application of TabPFN, MotherNet-generated networks are highly efficient at inference time. We also demonstrate that HyperFast is unable to perform effective in-context learning on small datasets, and heavily relies on dataset-specific fine-tuning and hyper-parameter tuning, while MotherNet requires no fine-tuning or per-dataset hyper-parameters.

arXiv.org
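
To make the mechanism concrete, here's a minimal PyTorch sketch of the idea (my own toy illustration, not the authors' architecture; all layer sizes and names are invented): a transformer encodes the (x, y) training pairs as tokens and emits one flat weight vector, which is reshaped into a small "child" MLP that does the actual predicting.

```python
import torch
import torch.nn as nn

class MotherNetSketch(nn.Module):
    """Toy hypernetwork: in-context training set -> weights of a small child MLP.
    A rough sketch of the MotherNet idea, not the paper's architecture."""

    def __init__(self, n_features, n_classes, hidden=16, d_model=128):
        super().__init__()
        self.n_features, self.n_classes, self.hidden = n_features, n_classes, hidden
        # Embed each (x, one-hot y) training example as one token.
        self.embed = nn.Linear(n_features + n_classes, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        # Total parameter count of the child MLP: (in->hidden) + (hidden->out).
        n_child = (n_features + 1) * hidden + (hidden + 1) * n_classes
        self.to_weights = nn.Linear(d_model, n_child)

    def forward(self, X_train, y_train):
        y_onehot = nn.functional.one_hot(y_train, self.n_classes).float()
        tokens = self.embed(torch.cat([X_train, y_onehot], dim=-1)).unsqueeze(0)
        pooled = self.encoder(tokens).mean(dim=1).squeeze(0)  # aggregate the set
        return self.to_weights(pooled)  # flat weight vector of the child network

    def child_predict(self, w, X):
        """Run the generated child MLP: pure tensor ops, no gradient descent."""
        f, h, c = self.n_features, self.hidden, self.n_classes
        W1, b1 = w[:f * h].view(h, f), w[f * h:f * h + h]
        o = f * h + h
        W2, b2 = w[o:o + h * c].view(c, h), w[o + h * c:]
        return torch.relu(X @ W1.T + b1) @ W2.T + b2  # class logits
```

"Training" on a new dataset is then a single forward pass, w = mother(X_tr, y_tr), and prediction only touches the tiny child MLP via mother.child_predict(w, X_te). That the child is a plain small MLP is where the inference speedup over running the full TabPFN transformer would come from.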

A smart take on why people are calling Google’s Web Environment Integrity proposal “DRM for the web”. It is modeled after the Play Integrity API, which lets apps like Netflix or banking apps block access on jailbroken phones.

Google has tried to argue that there are only positive uses of this technology, but consumers have no guarantee of that. One obvious use case: websites blocking access from browsers running ad-block plugins.

https://www.neelc.org/posts/google-webauth-palladium/

Google Web Environment Integrity is the new Microsoft Trusted Computing

Disclaimer: I work at Microsoft but not on Windows or Edge. I also don’t have a full understanding of Web Environment Integrity, but am basing this off what I understand. This article states my opinions, as opposed to those of my employer. If you haven’t been under a rock, you may have heard about Google’s evil Web Environment Integrity “proposal”. Supposedly, this is to make sure a browser environment can be “trusted”, but it seems Google wants this so they can kill ad blockers.

Neel Chauhan
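
Mechanically, the proposal boils down to remote attestation: the site issues a challenge, the browser returns a token vouched for by a platform attester, and the server gates content on that verdict rather than trusting the user. A hypothetical Python sketch of the server side (every name, field, and the token format here is invented for illustration; the real proposal envisions attester-signed tokens, not a shared secret):

```python
import hmac, hashlib, json, secrets

# Hypothetical sketch of a server gating content on an integrity verdict.
# Token format, fields, and key are all invented; a real attester would use
# public-key signatures, not this shared-secret HMAC stand-in.
ATTESTER_KEY = b"demo-shared-secret"

def issue_challenge() -> str:
    return secrets.token_hex(16)  # nonce binding the verdict to this session

def verify_integrity_token(token: dict, challenge: str) -> bool:
    payload = json.dumps(token["payload"], sort_keys=True).encode()
    expected = hmac.new(ATTESTER_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["signature"]):
        return False  # not vouched for by the attester
    p = token["payload"]
    return p.get("challenge") == challenge and p.get("verdict") == "trusted"

def serve(request: dict, challenge: str) -> str:
    if not verify_integrity_token(request["integrity_token"], challenge):
        return "403: environment not attested"
    return "200: content"
```

The rub is entirely in what "trusted" means; nothing stops it from meaning "an unmodified browser with no ad blocker installed".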

Really cool paper, with theoretical justification and empirical evidence for why learning a value function from trajectories generated by a reasonably accurate learned model beats plain experience replay.

https://arxiv.org/abs/2211.02222v3

The Benefits of Model-Based Generalization in Reinforcement Learning

Model-Based Reinforcement Learning (RL) is widely believed to have the potential to improve sample efficiency by allowing an agent to synthesize large amounts of imagined experience. Experience Replay (ER) can be considered a simple kind of model, which has proved effective at improving the stability and efficiency of deep RL. In principle, a learned parametric model could improve on ER by generalizing from real experience to augment the dataset with additional plausible experience. However, given that learned value functions can also generalize, it is not immediately obvious why model generalization should be better. Here, we provide theoretical and empirical insight into when, and how, we can expect data generated by a learned model to be useful. First, we provide a simple theorem motivating how learning a model as an intermediate step can narrow down the set of possible value functions more than learning a value function directly from data using the Bellman equation. Second, we provide an illustrative example showing empirically how a similar effect occurs in a more concrete setting with neural network function approximation. Finally, we provide extensive experiments showing the benefit of model-based learning for online RL in environments with combinatorial complexity, but factored structure that allows a learned model to generalize. In these experiments, we take care to control for other factors in order to isolate, insofar as possible, the benefit of using experience generated by a learned model relative to ER alone.

arXiv.org
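
For intuition, here's a toy Dyna-style sketch (my own illustration, not the paper's setup) of the two sources of updates: real transitions go into both a replay buffer and a learned model, and extra "planning" Q-updates are drawn from the model. With the tabular model below, model samples and replay are nearly equivalent; the paper's point is that a parametric model can generalize to plausible transitions the buffer never contained. env_step(s, a) -> (reward, next_state, done) is an assumed environment interface.

```python
import random
from collections import defaultdict

def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.99):
    # One-step Q-learning update toward the Bellman target.
    target = r + gamma * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def dyna_q(env_step, actions, start_state, episodes=100, planning_steps=10):
    Q = defaultdict(float)
    model = {}   # learned deterministic model: (s, a) -> (r, s')
    buffer = []  # experience replay buffer of real transitions
    for _ in range(episodes):
        s, done = start_state, False
        while not done:
            a = random.choice(actions)         # exploratory policy, for brevity
            r, s2, done = env_step(s, a)
            buffer.append((s, a, r, s2))
            model[(s, a)] = (r, s2)            # fit the model to experience
            q_update(Q, s, a, r, s2, actions)  # learn from the real step
            # An ER baseline would replay `buffer` here; Dyna instead queries
            # the model, which (if parametric) can generalize past the buffer.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                q_update(Q, ps, pa, pr, ps2, actions)
            s = s2
    return Q
```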

Metal Dashavatar
https://open.spotify.com/album/0RM4EN10mmhUzVCJQw9mgt

Mythology of the ten avatars of Lord Vishnu serialized as metal music! 🤘

Dashavatar

Demonic Resurrection · Album · 2017 · 10 songs.

Spotify
Speaking of enshittification, @pluralistic, how about this: if you make the mistake of clicking on a Google search result that takes you to Facebook, you're trapped! All of your browser tab's surfing history that led you to Google and then to the FB page is lost; you can't click the Back button. FB has you like a Venus flytrap. It's one of the most violative acts I've ever seen on the web, just blatantly subverting the very essence of the web. An act of pure evil by FB.

i'm gonna say it again: if you are not using https://elk.zone as your mastodon client, you're missing out!
it replicates the nice amenities of the twitter interface (like grouping threads, automatic translation, etc.) and more. my fav extra feature is that toots under a CW take up only the space of the CW.

kudos for the amazing job, @elk!

Elk

A nimble Mastodon web client

@jamesbridle Thanks @pluralistic for your summary. It helped me find this brilliant book, which deeply intertwines the niche topics I love to ponder.

Check out the short summary here:
https://pluralistic.net/2022/06/07/more-than-human/#umwelt

Pluralistic: 07 Jun 2022 – Pluralistic: Daily links from Cory Doctorow

"Ways of Being" by @jamesbridle

= argmax {books} (Goosebumps)

James pulls off a brilliant move: using examples from technology as a tool to help us rethink our place in nature & reconnect with the (beyond-human) world.

He weaves a fascinating tale from myriad threads: cybernetics, neural nets, the Internet, random numbers, analog computers, slime molds, sortition vs voting, mycelium, mysticism in animals, Turing machines, the personhood of non-humans, and more.