Michael Chavinda

@mchav
7 Followers
7 Following
75 Posts
I enjoy conversations about local politics, languages, pan Africanism, and software development.
Threads: https://www.threads.net/@insta.chav?xmt=AQGzcOdTYpLLpI7Mb0NiisCrqSqOFiQ4Wqgix8LY7UXJU3w
The comparison is not favorable.

Perhaps you saw the post series "Python is not a great language for data science"... well, here's

Haskell IS a Great Language for Data Science

https://jcarroll.com.au/2025/12/05/haskell-is-a-great-language-for-data-science/

#haskell  
#rstats 

I’ve been learning Haskell for a few years now and I am really liking a lot of the features, not least the strong typing and functional approach. I thought it was lacking some of the things I missed from R until I found the dataHaskell project. In this post I’ll demonstrate some of the features and explain why I think it makes for a good (great?) data science language.

Irregularly Scheduled Programming
Welcome to dataHaskell (revived)!

We’re rebooting dataHaskell! We’ve collected learnings from the previous dataHaskell effort and decided to revive the effort with a simple promise: make doin...

@jonocarroll would appreciate if you joined as an advisor or tastemaker of sorts.

https://datahaskell.org/blog/2025/11/11/welcome-to-datahaskell.html


Debugging skill level:

🟢 Beginner: print statements
🟡 Intermediate: debugger
🔵 Expert: taking a shower

Wrote a new article where I checkpoint the work we’ve done so far enabling Kaggle-style EDA-to-model workflows in Haskell.

https://mchav.github.io/iris-classification-in-haskell/

Progress towards Kaggle-style workflows in Haskell

There’s been a lot of work in the Haskell ecosystem that has made it easier to write interactive Kaggle-like scripts. I’d like to showcase the synergy between three such tools: dataframe (my own creation), hasktorch, and IHaskell.
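The shape of such an EDA-to-model workflow can be sketched in plain Haskell. The example below does NOT use the dataframe or hasktorch APIs from the article; it's a self-contained illustration of the idea, classifying made-up iris-like samples by distance to per-class feature centroids.

```haskell
-- Minimal EDA-to-model sketch: nearest-centroid classification.
-- Data values and feature choices are invented for the example.
import Data.List (minimumBy, nub)
import Data.Ord (comparing)

type Sample = ([Double], String)  -- (features, label)

-- Tiny hand-made training set: (sepal length, petal length).
train :: [Sample]
train =
  [ ([5.1, 1.4], "setosa")
  , ([4.9, 1.5], "setosa")
  , ([6.3, 4.9], "versicolor")
  , ([5.9, 4.2], "versicolor")
  ]

-- Mean feature vector for one class.
centroid :: [[Double]] -> [Double]
centroid xs = map (/ n) (foldr1 (zipWith (+)) xs)
  where n = fromIntegral (length xs)

-- Squared Euclidean distance between two feature vectors.
dist2 :: [Double] -> [Double] -> Double
dist2 a b = sum (zipWith (\x y -> (x - y) ^ (2 :: Int)) a b)

-- Predict by nearest class centroid.
predict :: [Sample] -> [Double] -> String
predict ts x = fst (minimumBy (comparing snd) scored)
  where
    labels       = nub (map snd ts)
    centroidFor l = centroid [f | (f, l') <- ts, l' == l]
    scored       = [(l, dist2 x (centroidFor l)) | l <- labels]

main :: IO ()
main = putStrLn (predict train [5.0, 1.4])  -- prints "setosa"
```

In the article itself, the loading/EDA half would be handled by dataframe inside IHaskell, and the model half by hasktorch; this sketch just shows the pipeline in miniature.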

Oh, no! My R package {safespace} is in a broken state - won't someone (new to PRs) help me fix it???

I'm renewing my offer to guide newbies through the R package building / fixing / reviewing process during Hacktoberfest - see this post

https://jcarroll.com.au/2024/10/01/a-safe-space-for-learning-how-to-make-pull-requests/

Open a pull request on https://github.com/jonocarroll/safespace to get a mentored review of your submitted changes with zero risk of breaking anything valuable if you mess it up completely.

Please boost for visibility!

#rstats #hacktoberfest

A Safe Space for Learning How to Make Pull Requests

As October rolls around once more, the term Hacktoberfest might pop across your feeds; an effort aiming to encourage people to contribute to open-source software, particularly if they’re new to that. In this post I’ll describe what I’m offering towards that goal.


Have some pretty cool examples of feature engineering using program synthesis on Haskell data frames.

Given a function space, we run a breadth first search to find what functions (and their compositions) have the highest correlation with a target variable.

https://github.com/mchav/dataframe/blob/feature_engineering/app/Main.hs#L32

dataframe/app/Main.hs at feature_engineering · mchav/dataframe

A fast, safe, and intuitive DataFrame library. Contribute to mchav/dataframe development by creating an account on GitHub.

GitHub
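The breadth-first search over a function space described above can be sketched in a self-contained way. The function space, data, and helper names below are made up for illustration; the real implementation over data frames is in the linked file.

```haskell
-- Sketch: enumerate compositions of a small function space
-- breadth-first (by composition depth) and keep the derived feature
-- whose output correlates best with a target column.
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Named unary functions over a numeric feature column.
funcSpace :: [(String, Double -> Double)]
funcSpace =
  [ ("square", \x -> x * x)
  , ("negate", negate)
  , ("halve",  (/ 2))
  ]

-- Pearson correlation between two equal-length columns.
pearson :: [Double] -> [Double] -> Double
pearson xs ys = cov / (sd xs * sd ys)
  where
    n      = fromIntegral (length xs)
    mean v = sum v / n
    cov    = sum (zipWith (*) (map (subtract (mean xs)) xs)
                              (map (subtract (mean ys)) ys))
    sd v   = sqrt (sum (map (\x -> (x - mean v) ^ (2 :: Int)) v))

-- All compositions up to the given depth, shallowest first.
compositions :: Int -> [(String, Double -> Double)]
compositions depth = concat (take depth (iterate extend funcSpace))
  where
    extend gs = [ (fn ++ "." ++ gn, f . g)
                | (fn, f) <- funcSpace, (gn, g) <- gs ]

-- Derived feature with the highest absolute correlation to the target.
bestFeature :: Int -> [Double] -> [Double] -> (String, Double)
bestFeature depth col target =
  maximumBy (comparing snd)
    [ (name, abs (pearson (map f col) target))
    | (name, f) <- compositions depth ]

main :: IO ()
main = print (bestFeature 2 [1, 2, 3, 4] [1, 4, 9, 16])
```

With the target being the square of the column, the search recovers a squaring feature with correlation 1 (several compositions tie, so which name wins depends on enumeration order).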

Earlier this year, the second #AIMO (artificial intelligence mathematical olympiad) concluded, with the winning team solving 34/50 in the final set of math problems (that had been selected to be harder for AI than the first AIMO).

The competition was restricted to open-source models and run with a limited amount of compute. The AIMO has now retested these problems with the top two teams from that competition (NemoSkills and imagination research) as well as OpenAI's o3 model, both at comparable levels of compute and with high resources. Unsurprisingly, the high-resource runs did better, with o3 scoring as high as 47/50, or even 50/50 if given two tries at each question. On the other hand, the gap between the open-source models and the commercial models for a fixed amount of compute was relatively slight.

More details of this experiment are available at https://aimoprize.com/updates/2025-09-05-the-gap-is-shrinking

The gap between commercial and open-source LLMs for Olympiad-level math is shrinking | AIMO Prize


Starting a series on program synthesis

https://mchav.github.io/an-introduction-to-program-synthesis/

An introduction to program synthesis

This post kicks off a hands-on series about program synthesis—the art of generating small programs from examples. We’ll build a tiny, FlashFill-style synthesiser that learns to turn strings like “Joshua Nkomo” into “J. Nkomo” from input/output pairs. We’ll see how to define a tiny string-manipulation language, write an interpreter, and search the space of programs to find one that solves our toy problem.
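The three pieces the series names (a tiny DSL, an interpreter, a search) fit in a few dozen lines. The DSL below is an assumption for illustration, not the one the series actually builds: three primitives whose outputs are concatenated, and a brute-force search over programs up to a length bound.

```haskell
-- FlashFill-style toy synthesiser: find a program consistent
-- with input/output examples by enumerating programs by length.
import Data.List (find)

-- A program is a sequence of primitive string operations;
-- running it concatenates each primitive's output on the input.
data Op
  = FirstInitial  -- first letter of the input, plus ". "
  | LastWord      -- everything after the first space
  | Whole         -- the untouched input
  deriving (Show, Eq)

eval :: Op -> String -> String
eval FirstInitial s = take 1 s ++ ". "
eval LastWord     s = drop 1 (dropWhile (/= ' ') s)
eval Whole        s = s

run :: [Op] -> String -> String
run prog s = concatMap (`eval` s) prog

-- All programs up to length n, shortest first.
programs :: Int -> [[Op]]
programs n = concatMap go [1 .. n]
  where
    go 1 = map (: []) ops
    go k = [op : rest | op <- ops, rest <- go (k - 1)]
    ops  = [FirstInitial, LastWord, Whole]

-- First program consistent with every example, if any.
synthesise :: Int -> [(String, String)] -> Maybe [Op]
synthesise n exs =
  find (\p -> all (\(i, o) -> run p i == o) exs) (programs n)

main :: IO ()
main = print (synthesise 3 [("Joshua Nkomo", "J. Nkomo")])
-- finds [FirstInitial, LastWord]
```

Searching shortest-first means the synthesiser prefers the simplest program consistent with the examples, which is the usual bias in this style of synthesis.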