Pierluca D'Oro


#Introduction v2

I'm generally interested in how the intersection of multi-agent learning, open-endedness, and human data can help us build agents with emergent capabilities and reshape validation for autonomous systems. I love batched simulators, RL, autonomous vehicles, and transportation systems (and sci-fi).

I'm also a professor at NYU Tandon CUE and a research scientist at Apple SPG.

It's "applications!" time of the year, so let me link to these insightful and oh-so-helpful (and witty) slides by Rocco Servedio on how to write a research statement for academic positions and postdocs, from the 2021 Learning Theory Alliance mentoring workshop:
https://let-all.com/assets/slides/How-to-COLT-Rocco.pdf

Lots of good stuff, meaningful advice, and Herman Melville.
#academia #academicjobmarket #researchstatement #hermanmelville

@antirez He who plays alone wins.
@fabian @tmlrpub @tmlrcert Thank you so much for creating these, they help a lot in setting the right vibe!

I've now created @tmlrpub for published papers and @tmlrcert for certifications at TMLR.

This place starts feeling like home 🏡

@psc @jhamrick Thanks for the comments, Pablo! There is indeed a close relationship between value-aware models and bisimulation, and this is an interesting perspective on it!

@psc I'd say the huge ones are various forms of ad placement and online bidding, through multi-armed bandits (MABs) and related algorithms.

The interesting bit is that these might be among the most lucrative applications right now by far, despite not using any deep networks!
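To make the ad-placement example concrete, here's a minimal sketch of UCB1, a classic MAB algorithm of the kind used in that setting. The ads, their click-through rates, and the function name are all hypothetical, just for illustration:

```python
import math
import random

def ucb1(true_ctrs, horizon=10000, seed=0):
    """Minimal UCB1 bandit: each arm is an ad with an unknown click-through rate.

    true_ctrs: hypothetical per-ad click probabilities (unknown to the learner).
    Returns the empirical click-rate estimates and show counts per ad.
    """
    rng = random.Random(seed)
    n_arms = len(true_ctrs)
    counts = [0] * n_arms    # times each ad was shown
    values = [0.0] * n_arms  # running mean reward (clicks) per ad

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # show each ad once to initialize
        else:
            # pick the ad maximizing empirical mean + exploration bonus
            arm = max(range(n_arms),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < true_ctrs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

    return values, counts
```

No deep network anywhere: the whole method is a table of means plus a confidence bonus, which is exactly why it scales so cheaply to billions of ad impressions.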

Asking for #rl opinions (#2).

What is deep reinforcement learning for you? Is it just RL with neural networks?

If so, should we call previous work from the 80s/90s deep RL? If not, what are the peculiar features of deep RL?

@saiborg Yes! But shouldn't a value function have a sense of the evolution of a system before making a prediction about the return?

@jhamrick Yes! I was implicitly referring to value-equivalent/value-aware models.

Since they are not constrained to be similar to the actual transition model, I sometimes wonder if it is more natural to think of them simply as inducing particular inductive biases (or, more precisely, learning architectures) for value-based RL, rather than as part of model-based methods.