Paolo Maldini: "If I have to make a tackle then I have already made a mistake."
Three tackles in one pipeline. All three exist because pandas can't carry what #BigQuery produces: nested types, nullable integers, timezone-aware timestamps.
The TypeError surfaced on line 40. The mistake was made on line 3. The error couldn't point further back.
#Arrow removed all three. Twelve lines replaced a hundred and fifty, and are more correct.
https://paolobietolini.com/development/a-nan-where-a-long-should-be/

#dataengineering #AnalyticsEngineer

A NaN where a Long should be | Paolo Bietolini

A PySpark TypeError that looked like a schema bug was actually three steps upstream. Pandas can't represent what BigQuery hands it (nested structs, nullable integers, timezone-aware timestamps, arbitrary-precision numerics), so every downstream line is a patch against a loss that happened at the moment pandas entered the pipeline. A walk to a twelve-line Arrow replacement, and the rule it points at.

Paolo Bietolini
ICYMI: Data Studio is back: Google kills Looker Studio name for good: Google reversed its 2022 Looker Studio rebrand on April 11, 2026, restoring the Data Studio name and expanding the platform with BigQuery agents and Colab apps. https://ppc.land/data-studio-is-back-google-kills-looker-studio-name-for-good/ #DataStudio #Google #LookerStudio #BigQuery #Colab
Data Studio is back: Google kills Looker Studio name for good

Google reversed its 2022 Looker Studio rebrand on April 11, 2026, restoring the Data Studio name and expanding the platform with BigQuery agents and Colab apps.

PPC Land
Data Studio is back: Google kills Looker Studio name for good: Google reversed its 2022 Looker Studio rebrand on April 11, 2026, restoring the Data Studio name and expanding the platform with BigQuery agents and Colab apps. https://ppc.land/data-studio-is-back-google-kills-looker-studio-name-for-good/ #DataStudio #Google #BigQuery #LookerStudio #DataAnalytics

I'm hiring an Analytics Engineer (GCP) to join my team at RHR International, reporting directly to me.

What you'd actually be doing: building and owning our analytics foundation in a GCP-first environment — BigQuery, Looker Studio, Python, SQL, GitHub, Docker. Real production work, version-controlled and documented, not throwaway queries.

RHR is a leadership consulting firm that's been around 80+ years. We're cloud-first, SaaS-only, no on-prem. Small IT team, which means your work matters immediately.

What I'm looking for beyond the technical skills: curiosity, self-direction, and the ability to explain what you built and why to people who don't write code. Bonus points if you've fixed something nobody asked you to fix.

Hybrid in Chicago preferred, remote considered.

Link to apply: https://www.linkedin.com/jobs/view/4399748962/

If you know someone who fits, I'd appreciate the tag or share.

#Hiring #AnalyticsEngineer #GCP #BigQuery #DataEngineering #Chicago #RHRInternational #Google

geoparquet-io: Fast #GeoParquet tool: geoparquet-io is an open-source #CLI tool and #Python library for converting, inspecting, optimizing, and partitioning #GeoParquet files, automatically applying GeoParquet performance best practices along the way. Its extract command can pull geodata from sources such as #WFS, #Esri ArcGIS Feature Services, or #BigQuery into GeoParquet.
https://spatialists.ch/posts/2026/04/06-geoparquet-io-fast-geoparquet-tool/ #GIS #GISchat #geospatial #SwissGIS
geoparquet-io: Fast GeoParquet tool – Spatialists – geospatial news

Built an end-to-end data pipeline using GCP, Airflow, PySpark, and BigQuery to analyze thermal anomaly data (India 🇮🇳 vs USA 🇺🇸).
Uncovered patterns in fire frequency, intensity, and seasonality through interactive dashboards.
#DataEngineering #GCP #Airflow #BigQuery #PySpark

🌐 In our new blog post, Juan Pablo Bascur introduces the ORION Dashboard: a tool that makes #openresearchdata accessible, interactive, and easy to explore. It enables easy exploration of CWTS #OpenAlex data on #BigQuery, letting users analyse institutions, funders, and research topics via interactive visualisations and reproducible SQL queries, without any coding skills required 👩‍💻.

🔎 Discover the blog post here 👉 https://www.leidenmadtrics.nl/articles/orion-dashboard-bringing-open-research-data-within-reach

ORION dashboard: Bringing open research data within reach

The ORION dashboard enables easy exploration of CWTS OpenAlex data on BigQuery, letting users analyse institutions, funders, and research topics via interactive visualisations and reproducible SQL queries without any coding skills required.

TCO, or Total Cost of Ownership, of modern ETL approaches for MPP databases

What this article is about: in this article I want to compare the TCO of good old ETL tools such as Informatica, ODI, MarkitEDM and the like against DBT + AirFlow and similar stacks. It is very easy to analyze the cost of licenses, or of compute and storage in the case of a cloud database, but very hard to analyze TCO: the cost of developing a single feature, the cost of support, the cost of maintenance, the cost of changes. It is very tempting to count only license and compute costs and assume that all other costs are the same, although they are not. By default, cloud MPP databases are usually cheaper for storage and compute and carry no license fee, and it is tempting to use the same license-free approach in ETL, but there are drawbacks:

https://habr.com/ru/articles/1014362/

#mppбазы #informatica #dbt #etl #airflow #oracle #bigquery


Secretly convinced BigQuery's main use case is pulling pypi stats

#Python #BigQuery #pypi

Claude Code + BigQuery → an analytics agent that works on your data 24/7

No copying queries. No intermediaries. No switching between tools.

All of this thanks to connecting Claude directly to BigQuery via MCP.

#iToSięLiczy
#AI #BigQuery #GoogleCloud #GA4 #DataDrivenMarketing #Automatyzacja #MarketingAnalytics