Ashish Gaurav

Tagging @thegradient (I think they are currently running this, but please correct me if I am wrong)
Do you think sigmoid.social should have LaTeX rendering in the interface like Mathstodon? Would be useful to some of us, right?

#introduction

Hi everyone, I am a CS PhD candidate at the University of Waterloo studying reinforcement learning and decision making.

More concretely, I currently work on inverse reinforcement learning to learn constraint functions. I am also interested in reward refinement strategies (e.g., reward redistribution / credit assignment). I think of these as two ways to specify human preferences better, and better specification of human preferences can help RL gain adoption as a practical tool.