A Strategic Community #Roadmap for an #Australian #FAIR #Vocabulary Ecosystem
https://doi.org/10.25911/N6K8-F540
Three years ago, I participated in a very engaged workshop at #ANU on #vocabularies for FAIR #data management. It sharpened how I think about vocabularies. I now see them primarily as a #KnowledgeTransfer tool for representing domain expertise in an actionable form. And I think we do a terrible job both at highlighting how critical they are (particularly in an age where trusted expertise is harder to find) and also at making them easier for others to find and reuse.
I picture this scenario. A student is about to start collecting data for their thesis. They need to make choices about what variables to observe or what questions to ask participants, and they need to think about how they want to represent the results to support their analysis. In the ideal case, the actual data collection effort is about populating an imagined but initially empty data matrix. If they could be assisted to find the best-structured and most widely used (in their domain) vocabularies for any categorical values in their data, it would be possible to generate that template matrix with built-in validation tools, etc. The data they finally collect would have most of its metadata already defined and would be properly interoperable with data collected by others in their domain. Meta-analysis would be much simpler.
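To make that scenario a bit more concrete, here is a rough Python sketch of what such a tool might do at its simplest. Everything in it is an assumption for illustration: the vocabulary names and terms, the column names, and the helper functions `make_template` and `validate_row` are invented, not taken from any real vocabulary service or from the report. A real tool would presumably pull terms from a published vocabulary rather than hard-code them.

```python
import csv
import io

# Hypothetical controlled vocabularies, invented for illustration only.
# A real tool would fetch these from a published vocabulary service.
VOCABULARIES = {
    "habitat": {"forest", "grassland", "wetland"},
    "life_stage": {"juvenile", "adult"},
}

# Columns of the imagined data matrix. Each column maps to the name of the
# vocabulary that constrains it, or to None for a free-value column.
COLUMNS = {
    "site_id": None,
    "habitat": "habitat",
    "life_stage": "life_stage",
}


def make_template():
    """Return the empty data matrix as CSV text (header row only)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(list(COLUMNS))
    return buffer.getvalue()


def validate_row(row):
    """Return error messages for values that fall outside their vocabulary."""
    errors = []
    for column, vocab_name in COLUMNS.items():
        if vocab_name is None:
            continue  # free-value column, nothing to check
        value = row.get(column)
        if value not in VOCABULARIES[vocab_name]:
            errors.append(
                f"{column}: {value!r} is not in the '{vocab_name}' vocabulary"
            )
    return errors
```

The student would fill in rows under the header produced by `make_template()` and run each row through `validate_row` as they go; an empty list means every categorical value came from the agreed vocabulary. Because the column-to-vocabulary mapping is declared up front, it doubles as a first draft of the dataset's metadata.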
I am interested in why tools like this don't really exist, or at least why they are not mainstream. I think it's because vocabularies are seen as such an ultra-nerdy subset of the nerdy topic of #metadata rather than presented as an opportunity to stand on the shoulders of others. What can be done to make them more friendly and intuitive for such purposes?
Finally, after way too many struggles, we have a report and recommendations from that meeting in 2022. I tried to add some of these ideas to the final product as best I could.
A Strategic Community Roadmap for an Australian FAIR Vocabulary Ecosystem
Vocabularies serve as essential anchor points for both humans and machines in effective and efficient data processing. Vocabularies include controlled vocabularies, taxonomies, thesauri, ontologies, and metadata schemas, each of which contributes to an ecosystem that encompasses the people, resources, standards, tools, platforms, policies, and practices that make them accessible and useful for researchers. Currently, Australia’s vocabulary ecosystem is fragmented and lacks effective coordination. To address these challenges, a Vocabulary Workshop was held in 2022, sponsored by the Australian Data Archive, the Australian Research Data Commons, and CODATA. From this workshop, a proposal for a strategic roadmap emerged, followed by numerous community consultations conducted between 2022 and 2024. The resulting Strategic Community Roadmap outlines a pathway for Australia to transition from its fragmented landscape to a cohesive and dynamic FAIR Vocabulary Ecosystem. It presents a Vision, Mission, and 57 recommendations categorised into seven key topics, organised around four Strategic Themes. Each recommendation is prioritised by its importance and urgency for implementation. The goal is to promote wider adoption and greater community engagement with machine-actionable vocabularies, emphasising the social and technical support needed to address current data interoperability challenges. This serves as a call to action to maximise the societal, economic, and environmental benefits that can be derived from our national research and data initiatives.