Mastodawn

Launching today: The K-12 AI Infrastructure Platform! https://platform.k12-ai-infrastructure.org/ Making AI better for students and educators.

Learn more about the launch and our plans in the blog post here:
https://drivendata.co/blog/k12-ai-infrastructure-platform-launch

Peter Bull Apr 1

I’m excited to be speaking at the Good Tech Summit in DC on April 7 with Digital Promise about building AI infrastructure for social impact. https://www.goodtechtogether.org/summit

We’ll share a new program focused on K-12 education and talk about how to invest in the foundations of AI systems: data, models, and benchmarks. We'll explore how these can shape AI development in a field. Please join us!

Peter Bull Feb 4

🎉 Excited to launch this challenge! 🎉 We spent over a year on data collection, curation, and annotation to produce a first-of-its-kind dataset.

Help us build speech models that understand kids ages 2-5. This gap bottlenecks literacy assessments, language acquisition testing, speech pathology screenings, and any kind of tool that interacts with early learners' speech (which is a lot, since they are not writing yet!). $120k in prizes and huge impact!

https://kidsasr.drivendata.org/

Peter Bull Oct 27, 2025

Great set of events for #SeattleAIWeek this week! Definitely join some if you are in town and let me know if you want to catch up https://luma.com/Seattle-AI-Week-2025

Peter Bull Oct 14, 2025

🚀 New release: cloudpathlib v0.23.0

🥧 Now with Python 3.14 (π) support!
📁 New copy & move methods mean you can reduce usage of shutil 🎉

Check out the full release and docs here:
👉 https://cloudpathlib.drivendata.org/stable/

Peter Bull Oct 13, 2025

Super interesting work on a newly proposed columnar data file format called F3, with an embedded Wasm binary to decode the data 🤯 (which obviates the need for third-party library support). It compares favorably to existing formats on compression, throughput, and random reads.

https://db.cs.cmu.edu/papers/2025/zeng-sigmod2025.pdf

Peter Bull Oct 9, 2025

Very cool to see Wikimedia embracing LLM tools and launching a hybrid similarity search API and open source embeddings for Wikipedia! Also supports Q&A style queries.
https://www.wikidata.org/wiki/Wikidata:Embedding_Project

Peter Bull Oct 6, 2025

Interesting to see empirical research coming out on LLMs as education aids. In this study, active use of LLMs helped CS students debug compiler errors. Once LLM access was removed, though, students showed no lasting learning benefit from having had it...

https://learninganalytics.upenn.edu/ryanbaker/ICCE2025_paper_28.pdf

Peter Bull Sep 19, 2025

Great opportunity to work on AI in conservation and biodiversity with Roland Kays! In-person in NC, check it out now since it is only open for a week:
https://www.governmentjobs.com/careers/%7B0%7Dnorthcarolina/jobs/newprint/5021239

Peter Bull Sep 18, 2025

We just shipped two major features for cloudpathlib ✨📦 ✨ ! First, HTTP support: treat a URL like any other path in Python code (open, read_text, join). Second, compatibility with the Python built-ins open and os, for a seamless transition of legacy code and support for third-party libraries.

https://cloudpathlib.drivendata.org