Thanks #RetractionWatch for another fascinating rabbit hole!
"A Kaggle user broached these same concerns in a comment, and #Piosenka responded that he had not violated privacy restrictions because all images were publicly available, adding “how can one be more ethical than to try to foster early detection and treatment of Autism in children. You sir are way off base.” Piosenka did not respond to The Transmitter’s requests for comment by email and phone."
This dude pulled down photographs of kids' faces from various websites and applied labels (methods here: https://drive.google.com/file/d/1zMQgyQvYiYyxx9J5jw3jrLGTS0p19Rep/view?usp=drivesdk ) based on whatever the website said about the kids. No consent, no confirmation, no standardization. Then folks used the images and labels to train neural networks and published the results. You could walk across the ocean of red flags here!
His rationale:
"I contacted numerous organizations associated with autism to try to get assistance in developing a higher quality dataset but was unsuccessful. Also note that there is a possibility that images in the non_autistic folders may have some percentage of autistic images. This is because you cannot search the internet for 'facial images of non autistic children' so I was forced to just gather facial images of children. Some percentage of these children (probably a small percentage) could have autism."
My dude, could it be that you were "unsuccessful" because the approach you used was "unethical"?
Also, that dataset was ingested, with labels, into Google AI. There's no way that didn't happen.
Again: thanks, Retraction Watch! In my opinion every scientist should read their reports every day. Morning coffee breakfast reading!

Exclusive: Springer Nature retracts, removes nearly 40 publications that trained neural networks on ‘bonkers’ dataset
The dataset contains images of children’s faces downloaded from websites about autism, which sparked concerns at Springer Nature about consent and reliability.

