Tim G. J. Rudner

372 Followers
174 Following
65 Posts
Assistant Professor & Faculty Fellow at NYU.
Probabilistic Machine Learning & RL.
Prev: Oxford + Yale. He/him.
Website: https://timrudner.com
Twitter: https://twitter.com/timrudner

Named Tensor Notation (TMLR version, https://arxiv.org/abs/2102.13196)

A rigorous description, opinionated style guide, and gentle polemic for named tensors in math notation.

* Macros: https://ctan.org/tex-archive/macros/latex/contrib/namedtensor

Named Tensor Notation is an attempt to define a mathematical notation with named axes. The central conceit is that deep learning is not linear algebra, and that by sticking to linear-algebra notation we leave many technical details, such as which axis an operation acts over, ambiguous to readers.

Named Tensor Notation

We propose a notation for tensors with named axes, which relieves the author, reader, and future implementers of machine learning models from the burden of keeping track of the order of axes and the purpose of each. The notation makes it easy to lift operations on low-order tensors to higher order ones, for example, from images to minibatches of images, or from an attention mechanism to multiple attention heads. After a brief overview and formal definition of the notation, we illustrate it through several examples from modern machine learning, from building blocks like attention and convolution to full models like Transformers and LeNet. We then discuss differential calculus in our notation and compare with some alternative notations. Our proposals build on ideas from many previous papers and software libraries. We hope that our notation will encourage more authors to use named tensors, resulting in clearer papers and more precise implementations.
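The core idea of the paper, identifying axes by name rather than by position, can be sketched in code as well. The following is a minimal illustration (not the paper's notation or its official macros): a small wrapper class, with the class name, method names, and example axis names all chosen here for illustration.

```python
import numpy as np

class NamedTensor:
    """Minimal sketch of a tensor whose axes are identified by name, not position."""

    def __init__(self, data, axes):
        self.data = np.asarray(data)
        assert self.data.ndim == len(axes), "one name per axis"
        self.axes = tuple(axes)

    def dot(self, other, axis):
        """Contract two named tensors along a shared named axis.

        Because the axis is referred to by name, neither caller nor reader
        needs to track axis order -- the point the paper argues for."""
        i, j = self.axes.index(axis), other.axes.index(axis)
        result = np.tensordot(self.data, other.data, axes=(i, j))
        new_axes = [a for a in self.axes if a != axis] + \
                   [a for a in other.axes if a != axis]
        return NamedTensor(result, new_axes)

# A toy 2x3 array with named axes, contracted with a per-channel weight vector:
x = NamedTensor(np.ones((2, 3)), ["height", "channel"])
w = NamedTensor(np.array([1.0, 2.0, 3.0]), ["channel"])
y = x.dot(w, axis="channel")
print(y.axes)  # ('height',)
print(y.data)  # [6. 6.]
```

The same named-axis call works unchanged if `x` gains a leading batch axis, which is exactly the "lifting" behavior the abstract describes.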

arXiv.org

Aalto University is looking for Assistant Professors in Computer Science. Excellent place, with opportunities to work in
@FCAI and with ELLIS. "We welcome applications in all areas of computer science", which also means #ml. https://www.aalto.fi/en/open-positions/assistant-professors-computer-science Deadline: Jan 15, 2023

Feel free to ask for more information from all faculty, including me.

Assistant Professors, Computer Science | Aalto University

May God grant me the confidence of a large language model.

📣 You can now find *V-D4RL*, a benchmarking suite for offline RL from pixels, on #huggingface:
https://huggingface.co/datasets/conglu/vd4rl 🚀

Highlights:
💥 New D4RL-style visual datasets!
💥 Competitive baselines based on Dreamer and DrQ!
💥 A set of exciting open problems!

This is joint work with @conglu, Phil Ball, @jparkerholder, @maosbot, and @yeewhye !

conglu/vd4rl · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.
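"D4RL-style" here refers to offline RL datasets stored as aligned transition arrays. The sketch below shows what that structure typically looks like and how a training batch is drawn from it; the shapes and field names follow the common D4RL convention and are assumptions here, not the exact keys used in the V-D4RL files, and `sample_batch` is a hypothetical helper written for this example.

```python
import numpy as np

# Hypothetical shapes: 84x84 RGB pixel observations, 6-dim continuous actions.
N, H, W, C, A = 100, 84, 84, 3, 6

# A D4RL-style offline dataset is, in essence, a dict of aligned transition arrays:
# entry i of every array belongs to the same (s, a, r, s', done) transition.
dataset = {
    "observations":      np.zeros((N, H, W, C), dtype=np.uint8),
    "actions":           np.zeros((N, A), dtype=np.float32),
    "rewards":           np.zeros((N,), dtype=np.float32),
    "next_observations": np.zeros((N, H, W, C), dtype=np.uint8),
    "terminals":         np.zeros((N,), dtype=bool),
}

def sample_batch(data, batch_size, rng):
    """Sample a minibatch of transitions uniformly, as offline RL training does."""
    idx = rng.integers(0, len(data["rewards"]), size=batch_size)
    return {k: v[idx] for k, v in data.items()}

batch = sample_batch(dataset, batch_size=32, rng=np.random.default_rng(0))
print(batch["observations"].shape)  # (32, 84, 84, 3)
```

Because the data is fixed in advance, the agent (e.g. a Dreamer- or DrQ-based baseline) only ever sees such sampled batches and never interacts with the environment.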

Now time for a first research post...

No better time to start on offline RL from pixels! V-D4RL is now on #huggingface at https://huggingface.co/datasets/conglu/vd4rl

💥 New D4RL-style visual datasets!
💥 Competitive baselines based on Dreamer and DrQ!
💥 A set of exciting open problems!


#introduction

Time for a very late introduction: I'm a 4th-year PhD student at the University of Oxford interested in deep reinforcement learning, generative modelling, and Bayesian methods!

Most recently, I've been thinking about effective ways to automate reinforcement learning (PBT, HPO) and how to extend the use cases of offline reinforcement learning (learning from pixels, generalizing to unseen environments)!

Always v. v. happy to chat :)

Seasonal reminder: Alcohol is a completely legal drug that destroys lives, relationships and families. Don't make it hard for people who are trying to stay sober to enjoy festivities and functions.

(The author of the image is https://www.madsahara.com/illustration/)

Illustration - Madsahara Art & Design

Illustration projects: personal commissions, editorial images, wedding invitations and many more whimsical art projects!

Madsahara Art & Design
Want to do a PhD in ML? At OATML Oxford we work both on core ML methodology, as well as on applications of ML in lots of interesting domains. Deadline for fully funded studentships is Friday 9 December 2022
https://www.ox.ac.uk/admissions/graduate/courses/dphil-computer-science
DPhil in Computer Science | University of Oxford

About the course: The DPhil in Computer Science is an advanced research degree, awarded for a significant (new) contribution to the existing body of knowledge in the field of computer science. You will work with world-class experts in their field. The DPhil normally takes three to four years of full-time study to complete.

[13/N] Thank you to my co-first author Tim G. J. Rudner (@timrudner), co-authors from OATML and @google and the many other collaborators who made this work possible!
[12/N] For example, in “Plex: Towards Reliability using Pretrained Large Model Extensions” (@dustinvtran et al.), we evaluate the performance of pretrained models on RETINA. (https://arxiv.org/abs/2207.07411)
Plex: Towards Reliability using Pretrained Large Model Extensions

A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve reliability, we developed ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively. Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol as it improves the out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex's capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.
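One of the decision-making tasks the abstract names, selective prediction, is simple to state concretely: the model may abstain on its least confident inputs, and we measure accuracy on the fraction it keeps. The sketch below is a generic illustration of that metric under these assumptions, not code from the Plex paper, and `selective_accuracy` is a name chosen here.

```python
import numpy as np

def selective_accuracy(confidence, correct, coverage):
    """Accuracy on the `coverage` fraction of most-confident predictions,
    abstaining on the rest (the selective-prediction setting)."""
    n_keep = int(np.ceil(coverage * len(confidence)))
    keep = np.argsort(-confidence)[:n_keep]  # indices of the most confident
    return float(np.mean(correct[keep]))

# Toy example: the model's confident predictions happen to be the correct ones,
# so accuracy improves as coverage shrinks -- the behavior a reliable model should show.
conf    = np.array([0.99, 0.95, 0.60, 0.55])
correct = np.array([1.0, 1.0, 0.0, 1.0])
print(selective_accuracy(conf, correct, coverage=0.5))  # 1.0
print(selective_accuracy(conf, correct, coverage=1.0))  # 0.75
```

A model with well-calibrated uncertainty yields a risk-coverage curve where accuracy rises monotonically as coverage drops; sweeping `coverage` traces out that curve.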

arXiv.org