Michael Chavinda

@mchav
I enjoy conversations about local politics, languages, Pan-Africanism, and software development.
Threads: https://www.threads.net/@insta.chav?xmt=AQGzcOdTYpLLpI7Mb0NiisCrqSqOFiQ4Wqgix8LY7UXJU3w

I've been looking into dbt for data engineering and wanted to flesh out how it is similar to, and different from, just using {targets} - I learned lots about both of them!

I wrote up my findings in this post:

https://jcarroll.com.au/2026/05/04/comparing-r-s-targets-and-dbt-for-data-engineering/

#rstats #dbt #dataeng

Comparing R's {targets} and dbt for Data Engineering

I’m getting more and more into data engineering these days and having used R for a long time, I’m seeing a lot of problems that look nail-shaped to my R-shaped hammer. The available tools to solve those problems exist for (presumably) very good reasons, so I wanted to take some time to dig into how to use them and compare their workflows to what I would otherwise naively do in R.

Irregularly Scheduled Programming

I’ve been thinking a lot about how to scale symbolic regression: this seems like a plausible direction.

https://mchav.github.io/grow-and-mow/

Grow and mow: interpretable models with boosting, symbolic regression and e-graphs

This post is the convergence of two ideas that have been floating in my head for about a year. Can we learn messy stochastic models and use algorithmic/algebraic tools to rein in model complexity to make models interpretable?

The comparison is not favorable.

Perhaps you saw the post series "Python is not a great language for data science"... well, here's

Haskell IS a Great Language for Data Science

https://jcarroll.com.au/2025/12/05/haskell-is-a-great-language-for-data-science/

#haskell #rstats

I’ve been learning Haskell for a few years now and I am really liking a lot of the features, not least the strong typing and functional approach. I thought it was lacking some of the things I missed from R until I found the dataHaskell project. In this post I’ll demonstrate some of the features and explain why I think it makes for a good (great?) data science language.

Welcome to dataHaskell (revived)!

We’re rebooting dataHaskell! We’ve collected learnings from the previous dataHaskell effort and decided to revive the effort with a simple promise: make doin...

@jonocarroll I would appreciate it if you joined as an advisor or tastemaker of sorts.

https://datahaskell.org/blog/2025/11/11/welcome-to-datahaskell.html

Debugging skill level:

🟢 Beginner: print statements
🟡 Intermediate: debugger
🔵 Expert: taking a shower

Wrote a new article where I checkpoint the work we’ve done so far to enable Kaggle-style EDA-to-model workflows in Haskell.

https://mchav.github.io/iris-classification-in-haskell/

Progress towards Kaggle-style workflows in Haskell

There’s been a lot of work in the Haskell ecosystem that has made it easier to write interactive Kaggle-like scripts. I’d like to showcase the synergy between 3 such tools: dataframe (my own creation), hasktorch, and IHaskell.

Oh, no! My R package {safespace} is in a broken state - won't someone (new to PRs) help me fix it???

I'm renewing my offer to guide newbies through the R package building / fixing / reviewing process during Hacktoberfest - see this post

https://jcarroll.com.au/2024/10/01/a-safe-space-for-learning-how-to-make-pull-requests/

Open a pull request on https://github.com/jonocarroll/safespace to get a mentored review of your submitted changes with zero risk of breaking anything valuable if you mess it up completely.

Please boost for visibility!

#rstats #hacktoberfest

A Safe Space for Learning How to Make Pull Requests

As October rolls around once more, the term Hacktoberfest might pop up across your feeds: an effort aiming to encourage people to contribute to open-source software, particularly if they’re new to that. In this post I’ll describe what I’m offering towards that goal.


I have some pretty cool examples of feature engineering using program synthesis on Haskell data frames.

Given a function space, we run a breadth-first search to find which functions (and their compositions) have the highest correlation with a target variable.

https://github.com/mchav/dataframe/blob/feature_engineering/app/Main.hs#L32
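
For readers who want a feel for the idea before opening the link, here is a minimal sketch of a breadth-first search over a small function space, scored by absolute Pearson correlation with a target column. It is not the code from the linked file and does not use the dataframe library's API: columns are plain lists, and the names `Feature`, `operators`, `expand`, `search`, and `best` are all hypothetical, as is the choice of three arithmetic operators as the function space.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- A candidate feature: a printable expression plus its column of values.
-- (Hypothetical representation; a real data frame library would differ.)
type Feature = (String, [Double])

-- Pearson correlation between two equal-length columns.
-- Returns 0 when either column is constant, to avoid dividing by zero.
pearson :: [Double] -> [Double] -> Double
pearson xs ys
  | denom == 0 = 0
  | otherwise = cov / denom
  where
    n = fromIntegral (length xs)
    mean zs = sum zs / n
    mx = mean xs
    my = mean ys
    cov = sum (zipWith (\x y -> (x - mx) * (y - my)) xs ys)
    varX = sum [(x - mx) * (x - mx) | x <- xs]
    varY = sum [(y - my) * (y - my) | y <- ys]
    denom = sqrt (varX * varY)

-- Assumed function space: binary operators used to combine columns.
operators :: [(String, Double -> Double -> Double)]
operators = [("+", (+)), ("-", (-)), ("*", (*))]

-- One breadth-first step: combine every expression in the current
-- frontier with every base column, using every operator.
expand :: [Feature] -> [Feature] -> [Feature]
expand base frontier =
  [ ("(" ++ n1 ++ " " ++ sym ++ " " ++ n2 ++ ")", zipWith f c1 c2)
  | (n1, c1) <- frontier
  , (sym, f) <- operators
  , (n2, c2) <- base
  ]

-- All expressions up to the given depth, generated level by level.
search :: Int -> [Feature] -> [Feature]
search depth base = concat (take (depth + 1) (iterate (expand base) base))

-- The candidate with the highest absolute correlation with the target.
best :: Int -> [Feature] -> [Double] -> (String, Double)
best depth base target =
  maximumBy (comparing snd)
    [ (name, abs (pearson col target)) | (name, col) <- search depth base ]

main :: IO ()
main = do
  let xs = [1, 2, 3, 4, 5]
      ys = [2, 1, 4, 3, 6]
      target = zipWith (*) xs ys -- toy target: the product of the two columns
  print (best 1 [("x", xs), ("y", ys)] target)
```

With the toy data in `main`, a depth of 1 is enough for the search to reach a product of `x` and `y`. The number of candidates grows geometrically with depth, so a practical version would need pruning or deduplication of equivalent expressions.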

dataframe/app/Main.hs at feature_engineering · mchav/dataframe

A fast, safe, and intuitive DataFrame library.