Mastodawn

Ashish Gaurav Nov 7, 2022

Tagging @thegradient (I think they are currently running this, but please correct me if I am wrong)

Ashish Gaurav Nov 7, 2022

Do you think sigmoid.social should have LaTeX rendering in the interface like Mathstodon? Would be useful to some of us, right?

Ashish Gaurav Nov 7, 2022

#introduction

Hi everyone, I am a CS PhD candidate at the University of Waterloo studying reinforcement learning and decision making.

More concretely, I currently work on inverse reinforcement learning to learn constraint functions. I am also interested in reward refinement strategies (e.g., reward redistribution / credit assignment). I think these are two ways to specify human preferences better, and better specification of human preferences can help RL get adopted as a practical tool.