Manuel Chevalier

@ManuelChevalier
Ex-academic data coach helping people who work with data build confident, independent skills in R, using AI to assist rather than replace you. Founder of DataSharp Academy. Here to make data analysis feel natural, doable, and shareable.
DataSharp Academy: https://datasharpacademy.com
Newsletter: https://datasharpacademy.com/newsletter?utm_source=mastodon&utm_campaign=newsletter
GitHub: https://github.com/mchevalier2
ORCID: https://orcid.org/0000-0002-8183-9881

Most people approach data analysis backwards.

They first ask:
“What method should I use?”

But strong analysis starts earlier:
- understanding the dataset,
- spotting patterns,
- questioning assumptions,
- figuring out what the data can realistically answer.

EDA is not a checkbox before modelling.

It is the foundation of the analysis itself.

https://datasharpacademy.com

3/ Sometimes, a linear model or a PCA already tells you enough to guide the next step.

Simpler approaches often make patterns easier to see.

Complexity has its place.
But using it too early often masks problems instead of solving them.

2/ But a fancy model won’t magically compensate for poor understanding of your dataset.

People spend hours tuning models before understanding:
- the structure of the data
- the variables
- the assumptions
- or what the dataset can realistically answer

A fancy model won’t magically compensate for poor understanding of your data.

A VERY common mistake in data analysis is jumping to complex solutions too early. AI has made this even easier by making sophisticated tools extremely accessible.

Sometimes, a linear model or a PCA already tells you most of the story.

Complexity has its place.
But using it too early often masks problems instead of solving them.

Often, the real bottleneck is not the model.
It’s how little we understand our data.

Most people think exploratory data analysis means loading a dataset and making a couple of plots.

But EDA is not about producing outputs.

It is about learning how your data behave:
- what looks suspicious,
- what moves together,
- what is missing,
- what your data can actually answer.

Otherwise, you risk solving the wrong problem with the wrong method.

You load a dataset.

Sometimes you don’t know where to start.
Other times you think you do… until nothing makes sense.

Your problem is a lack of structure.

Start simple to make things click:
• What are you trying to answer?
• Can your data support it?
• What are the key steps?

https://datasharpacademy.com

Thinking “I’ve seen this before, I know how to handle this” is often where the problem starts.

You stop looking at the data, and start fitting it into your expectations.

And just like that, the analysis is already biased.

Never forget: each project demands its own tools and care.

Have you ever found yourself just "doing stuff" with your data?

You were asked to analyse them, so here we are.
RStudio is open. Some fancy graphs.
It looks like work.

But what are you _really_ trying to do?
How should you analyse them?
And most importantly, why?

That's the most overlooked part of data analysis.

The "why" is the question you are trying to answer.
It’s what separates "looking at your data" from "analysing your data".

Find your why.
Then you can start analysing.

When you mix everything together, nothing makes sense.

R, RStudio, packages, functions, scripts…
Different layers. Different roles.

Without a mental map:

* your work feels random
* hard to reproduce
* hard to trust

We fix that at DataSharp Academy.

Next newsletter breaks it down ↓
https://datasharpacademy.com/newsletter?utm_source=mastodon&utm_campaign=newsletter
