#recrutement #tech #humour #cloud #digital #numérique #data #dataengineer #dataengineering | Gabriel.. C.

Votre offre d’emploi ressemble à un *dictionnaire d’anglicismes* ? - *"Digital transformation"* → *Transformation numérique*. - *"Data scientist"* → *Scientifique des données*. - *"Cloud computing"* → *Informatique en nuage*. *Moralité* : Les candidats *francophones* vous remercieront. Besoin de *désangliciser* vos offres ? Je vous aide. PS : indiquez le salaire et décrivez la mission ! #Recrutement #Tech #Humour #Cloud #Digital #Numérique #Data #DataEngineer #DataEngineering

LinkedIn
#recrutement #tech #précision #stratégie #java #javascript #shell #powershell #numérique #digital #dataengineer #dataengineering #cloud #onpremise | Gabriel.. C.

Voici les 5 confusions *les plus coûteuses* en recrutement tech : 1. *Java* vs *JavaScript* → Langages différents. 2. *Shell* vs *PowerShell* → Environnements différents (Linux vs Windows). 3. *Digital* vs *Numérique* → Le premier est un anglicisme inutile. 4. *Big Data* vs *Mégadonnées* → Le terme français existe. 5. *Cloud* vs *On-premise* → Deux modèles d’hébergement opposés. Exemple : Un client a *évité 3 mauvaises embauches* en clarifiant ces termes. Vous voulez *recruter juste* ? *Évitez ces confusions*. #Recrutement #Tech #Précision #Stratégie #Java #JavaScript #Shell #PowerShell #Numérique #Digital #DataEngineer #DataEngineering #Cloud #OnPremise

LinkedIn

In most tiering systems, cold data is read-only. A GDPR deletion on archived rows means restore-delete-rearchive: a half-day job.

ColdFront: UPDATE or DELETE archived rows with one SQL statement.
HFS Research analyst Ashish Chaturvedi, in Anirban Ghoshal's InfoWorld piece today. ColdFront appears alongside Databricks, Snowflake & EDB on the OLTP/OLAP divide. The only 100% #OpenSource option, with #Postgres as the interface.

📖 https://hubs.la/Q04mQNSw0

#PostgreSQL #ApacheIceberg #DataEngineering

SQLGlot v30 dropped on the day we recorded. They used Claude to refactor Python into MyPyC-compiled code, got rid of Rust, and hit 5x speed improvements. New episode with Toby Mao. https://www.youtube.com/watch?v=cRECaz2PANg
#AIAgents #PythonPerformance #DataEngineering
From Netflix Side Project to the Fastest SQL Parser | Toby Mao (Fivetran)

YouTube
#dataengineering #stratégie #roi #leadership #data | Gabriel.. C.

Le Data Engineering, c’est comme les *fondations d’un bâtiment* : - *Invisible*, mais *indispensable*. - *Silencieux*, mais *puissant*. - *Complexe*, mais *stratégique*. Sans lui, vos données sont *inutilisables*, vos analyses *faussées*, vos décisions *erronées*. Exemple : Une entreprise a *évité une perte de 1M€* en fiabilisant ses pipelines. Vous voulez une *data solide* ? Investissez dans le Data Engineering. #DataEngineering #Stratégie #ROI #Leadership #Data

LinkedIn
#dataengineering #pipelines #humour #tech #nobullshit | Gabriel.. C.

Votre pipeline data ressemble à une *assiette de spaghettis* ? - Des *connexions* dans tous les sens. - Des *traitements* qui s’emmêlent. - Des *erreurs* qui surgissent comme des champignons. Ma recette pour un pipeline *lisse comme des pâtes al dente* : - *Cartographiez* vos flux de données. - *Simplifiez* les étapes inutiles. - *Automatisez* les tâches répétitives. Exemple : Un client a *réduit ses bugs de 70 %* en structurant son pipeline. Besoin d’un *démêleur de spaghettis* ? Je suis là. #DataEngineering #Pipelines #Humour #Tech #NoBullshit

LinkedIn
#dataengineering #qualitédesdonnées #expertise #tech #luxe | Gabriel.. C.

Vos données sont *sèches*, *dispersées*, *inutilisables* ? C’est comme un désert : il manque l’*eau* (la qualité), le *sol fertile* (l’architecture), et les *graines* (les cas d’usage). Ma méthode pour transformer votre désert en *jardin luxuriant* : - *Nettoyage* : Éliminez les données inutiles ou erronées. - *Structuration* : Organisez vos données en pipelines clairs. - *Valorisation* : Extrayez des indicateurs actionnables. Exemple : Un client a *doublé son ROI data* en appliquant ces principes. Vous voulez des *données fertiles* ? Parlons-en. #DataEngineering #QualitéDesDonnées #Expertise #Tech #Luxe

LinkedIn

Module 1 of LLM Zoomcamp is done! 🎉

I turned my original RAG pipeline into an Agent!

I spent these last few days diving deep into Agentic RAG. It's been fascinating to build it step by step. Every time I ask the LLM to learn about something new, I see how it naturally figures out which tools to use, when to search, and how many times to gather info before giving me a solid answer.

What exactly is Agentic RAG?
It’s like giving the AI a brain that can actually act. Instead of just retrieving from a fixed knowledge base, the model decides whether it needs external tools first, gathers what it needs, and then answers. It’s pretty interesting to understand how it actually works behind the scenes!

Why does this matter?
A few days ago I asked for a detailed guide on using the OpenAI Python library with the chat.completion API. The Local LLM called web search multiple times until it had enough context and built something useful from those pieces. Now that I am building these systems, I can finally understand why it does what it does.

💡 Insights from this week:
- Building a static pipeline is a great start, but to make something truly flexible, you need function or tool calling. It lets the LLM look at the question first and decide whether it needs to search a knowledge base before answering.
- I used to think "chunking" was just about breaking up text. Turns out it can reduce token input by 3x! 🤯
- You have to learn how to walk before you run. Starting small, understanding each component manually, and seeing how the pieces fit together… it felt slow at first but worth it. Now I’m able to accelerate with agent frameworks like toyaikit, LangChain, PydanticAI, or OpenAI Agents.
- There is definitely a learning curve with the API syntax. Between the new response API and chat completions, tool responses are structured differently and you have to adjust your code accordingly. Frustrating at times, but also a great way to learn!

Quick takeaway:
It is best to start simple, then add complexity only when needed. Sometimes an agent can burn tokens unnecessarily, so only add that layer if your problem really needs it!

Had a lot of fun with this module and I’m already curious about what’s next. If you’re interested in learning along, this is the full free course Alexey at the Data Talks Club: https://github.com/DataTalksClub/llm-zoomcamp/

Anyone else tinkering with LLM agents lately? What kind of projects are you exploring or trying out? Would love to hear where your journey is heading!

#ai #localai #llm #mastodon #fediverse #buildinpublic #linux #github #aiengineering #DataEngineering

BoxLang 1.14.0 : Query Transformers - Take Full Control of Your Query Results - foojay

BoxLang 1.14.0 ships a lot of exciting features - Dynamic Sets, Ranges, Inner Classes, JSONPath navigation - but one quietly powerful addition will - by Cristobal Escobar

foojay

Meet #OpenAI’s Kepler - an internal AI data analyst that operates across 600+ petabytes of data and 70,000+ datasets daily.

Learn how OpenAI combines MCP, RAG & vector search over platform metadata to power an autonomous agent that can discover datasets, generate complex queries, investigate anomalies, and deliver insights in natural language.

🎬 Watch now: https://bit.ly/4vtmVGF

#AIAgents #DataEngineering #AI