The @osi released the #OpenSource #AI definition this week, and stopped short of requiring that #datasets used for training AI models also be openly available.

Side-stepping that debate, I dig into the onus the OSI's decision now places on having better #DatasetDocumentation approaches - without which the data used for training cannot be adequately described.

https://blog.kathyreid.id.au/2024/11/02/open-source-ai-definition-dataset-documentation?utm_source=mastodon&utm_medium=social

Earlier this month, I attended and presented at the #ALTA2023 #NLP conference in #Melbourne - here are my #notes from the workshop.

https://blog.kathyreid.id.au/2023/12/10/alta2023/

📷 Gabriela Ferraro, used with permission

#academic #alta2023 #data #DatasetDocumentation #linguistics #NLP #RightTheDocs

ALTA2023: The 21st Australasian Language Technology Association Workshop - Kathy Reid

My notes from the 21st Australasian Language Technology Association Workshop - ALTA2023

Kathy Reid

Help me with my PhD research on voice dataset documentation by completing this survey

Do you work with…

#data #dataset #datasetdocumentation #documentation #phd #research #survey #writethedocs

https://blog.kathyreid.id.au/2023/04/25/help-me-with-my-phd-research-on-voice-dataset-documentation-by-completing-this-survey/

Help me with my PhD research on voice dataset documentation by completing this survey - Kathy Reid

Do you work with voice or speech data? You might contribute data, write data specifications for collection, perform filtering or pre-processing, train ASR or TTS models, or design or perform evaluations on ML speech models. If so, I’d love your help to understand current dataset documentation

Kathy Reid