Sofie Van Landeghem

175 Followers
87 Following
83 Posts

I'm an NLP expert, software engineer and open-source maintainer. I love data, coding, testing and just getting sh*t done.

Through my one-woman company OxyKodit, I implement tailored solutions for any domain. Specifically, I develop Natural Language Processing (NLP) techniques to process unstructured texts to unlock the information in them and turn them into actionable insights and data that can be further processed & integrated in your downstream applications.

GitHub: https://github.com/svlandeg
LinkedIn: https://www.linkedin.com/in/sofievanlandeghem/
Bluesky: https://bsky.app/profile/oxykodit.bsky.social
Personal site: https://oxykodit.com/

🔥 New case study: How GitLab built scalable spaCy pipelines to process a year's worth of support tickets and create actionable insights to better support their community.

https://explosion.ai/blog/gitlab-support-insights

The ultimate guide to optimizing annotation workflows · Explosion

This blog post collects tips and advice for how to build efficient human-in-the-loop data development workflows, break down business problems into actionable annotation steps and make the most of automation and model assistance.

🎉🇧🇪 noyb win: Following our complaints from 2023, the Belgian data protection authority has ordered four major Belgian news sites
▶️ to bring their cookie banners into GDPR compliance and
▶️ imposes potential penalty of €50,000 per day per website
Read more: https://noyb.eu/en/noyb-win-belgian-dpa-settlement-turned-proper-legal-orders-deceptive-cookie-banners
noyb WIN: Belgian DPA “settlement” turned into proper legal orders on deceptive cookie banners

We won a case against Belgian news company Mediahuis. Four major news websites must adapt their currently deceptive cookie banners

noyb.eu
I'm working with someone who uses github copilot and a lot of my feedback boils down to "that's something that people used to do in #python but it's obsolete now" because *of course* that's the sort of feedback I'd give. Of the code that github stole, most of it is always going to be old because, like, more things happened in the past than in the present. So now we live in a very weird present-future where allegedly-cutting-edge "AI" is telling us to do Python 2.7 idioms like class ClassName(object)
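For readers who have not met the idiom, here is a minimal sketch of what the post is pointing at. In Python 3 every class implicitly inherits from `object`, so the explicit base is redundant. `ClassName` comes from the post; `ModernClassName` and the `greet` method are hypothetical names added only for illustration.

```python
# Python 2.7-era spelling: the explicit base was needed there to get a
# "new-style" class. In Python 3 it is harmless but redundant.
class ClassName(object):
    def greet(self):
        return "hello"


# Python 3 spelling: object is the implicit base class.
# (ModernClassName is a hypothetical name for this sketch.)
class ModernClassName:
    def greet(self):
        return "hello"


# Both classes end up with exactly the same base.
print(ClassName.__bases__)
print(ModernClassName.__bases__)
```

Under Python 3 the two definitions behave the same, which is why the first form tends to be flagged as dated in review.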
There are two types of blog posts on #Rust:
“I don't like Rust because it's different from the only other programming languages I know.” and
“Om nom nom, Rust so speed, and memory safe.”

@joxean

Not necessarily. Overfitting happens when the model tries to replicate little artifacts in the training dataset (or dev), making it generalize less well on unseen data.

Usually, a bigger training dataset should help to AVOID overfitting, as the model would be incentivized (by the loss function) to focus on the overall patterns instead of the small details.

In your case, is the bigger dataset of the same quality as the smaller one? Because that might be another explanation...
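The point about dataset size can be illustrated with a small, standard-library-only sketch. Everything in it is an assumption made for illustration and not taken from the thread: the target function `y = x`, the noise level, the 20-bin piecewise-constant model, and the sample sizes. It fits the same model on a small and a large training set and compares the gap between test error and training error, which is one rough way to see how much the model has latched onto artifacts of its training data.

```python
import random

NUM_BINS = 20      # model capacity: one fitted value per bin (assumed)
NOISE_STD = 0.3    # label noise level (assumed)


def make_data(n, rng):
    """Sample n points from the hypothetical target y = x plus Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    return [(x, x + rng.gauss(0.0, NOISE_STD)) for x in xs]


def bin_index(x):
    """Map x in [0, 1) to one of NUM_BINS equal-width bins."""
    return min(int(x * NUM_BINS), NUM_BINS - 1)


def fit(train):
    """Fit a piecewise-constant model: the mean label per bin.

    Empty bins fall back to the global mean of the training labels.
    """
    sums = [0.0] * NUM_BINS
    counts = [0] * NUM_BINS
    for x, y in train:
        b = bin_index(x)
        sums[b] += y
        counts[b] += 1
    global_mean = sum(y for _, y in train) / len(train)
    return [sums[b] / counts[b] if counts[b] else global_mean
            for b in range(NUM_BINS)]


def mse(model, data):
    """Mean squared error of the binned model on a dataset."""
    return sum((model[bin_index(x)] - y) ** 2 for x, y in data) / len(data)


def generalization_gap(n_train, test, rng):
    """Test error minus training error for a model fit on n_train points."""
    train = make_data(n_train, rng)
    model = fit(train)
    return mse(model, test) - mse(model, train)


rng = random.Random(0)
test_set = make_data(2000, rng)
gap_small = generalization_gap(40, test_set, rng)
gap_large = generalization_gap(4000, test_set, rng)
print(f"gap with 40 training points:   {gap_small:.4f}")
print(f"gap with 4000 training points: {gap_large:.4f}")
```

With only a couple of points per bin, each bin mean mostly reflects noise, so the training error looks better than the test error. With many points per bin the noise averages out and the two errors move closer together. This only covers the case where the extra data has the same quality as the original data.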

@jbaert "gold nuggets of the sea" wouldn't help either, of course

People say I shouldn’t shame others for not voting, but fuck that. We have this one lever we get to pull. This one task. If you can’t make that bare minimum requirement to participate in a democracy, I have no goddamn respect for you. You’re not more morally pure for opting out, you’re just a petulant child.

Voting isn’t everything, but it’s the first thing. If you have this right that so many fought and bled for and you don’t use it, shame on you.

Northern Lights Update: Here’s Where You Could See The Aurora Borealis Tonight

The Northern Lights are expected to be the most visible in Canada and Alaska, though states like Wisconsin, North Dakota, Michigan, Montana and Minnesota may also get a chance to see the aurora borealis tonight.

Forbes
It's so interesting how when you're a female leader you're never technical enough until you are and then you're too rigorous and inflexible. Love this labyrinth with no exit for us.