Just want to point out that one of the coolest things about our BINDI effort is that we are actively looking at biases encoded in the data used to train LMs.
For more details, check out this AACL paper (https://aclanthology.org/2022.aacl-main.55/), joint work with Shachi Dave, Vinodkumar Prabhakaran, Sunipa Dev, and Partha Talukdar.
Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, Vinodkumar Prabhakaran. Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2022.
For those not wanting to head to the birdsite, here is the text from the tweet linked above that announced our project:
As India’s digital journey becomes powered by AI, it's crucial that we consciously invest in efforts to ensure that real-life societal biases don’t manifest in tech.
Introducing Project BINDI:
🔴Bias
🔵Interventions for
🟢NLP and
🟠Data in the
🟣Indian context
Also read about the exciting research that other members of my team are pursuing to develop better language technologies for India.
https://twitter.com/partha_p_t/status/1604908136267419649?t=4wIdGyyL5YmdjfNNzxg6CQ&s=19
“Super excited about 3 projects my group at Google Research has been working on getting announced at #GoogleForIndia #1 Vaani: collaboration with @iiscbangalore to capture India’s speech landscape by collecting and open sourcing speech from 773 districts https://t.co/fXx94xwgfV”
The project I work on got announced in the #GoogleForIndia event yesterday! I am so proud of being a part of this super important work on creating fair and inclusive language technologies with special attention to the socio-cultural context of India.
https://twitter.com/GoogleIndia/status/1604737600035880961?t=CJo9tKQTVLnDwNn1OnAcPA&s=19
“As India’s digital journey becomes powered by AI, it's crucial that we consciously invest in efforts to ensure that real-life societal biases don’t manifest in tech. Introducing Project BINDI: 🔴Bias 🔵Interventions for 🟢NLP and 🟠Data in the 🟣Indian context #GoogleForIndia”