Dominique de Villepin targeted by...
BFM: Dominique de #Villepin targeted by an investigation by the financial prosecutor's office (#parquet #financier) concerning #statuettes #received as a #gift when he was at the #Quai d' #Orsay
The #guy receives a gift in kind worth €125,000 and it doesn't occur to him that there might be an #ethical and #moral #problem, or even a #legal one...
When does #Iceberg beat #Parquet+projection on #AWSGlue, and when doesn't it?
An end-to-end #ETL PoC on #AWS to find out: producer, #Kinesis, two #Firehose paths, two #Glue jobs, #Athena.
🔮 Spoiler: how the data is read is the key to the choice.
In the article: every choice with its why, plus a few gems from some Glue experience 😄
In the notebook, I tried something new by using #duckdb with #parquet files
https://observablehq.com/@pac02/does-the-country-of-birth-affect-the-likelihood-of

A Cross-Country and Cross-Wikipedia Comparison
We compare the number of biographies by birth year and country against demographic birth data to estimate the probability of appearing in different Wikipedia language editions, depending on one's country of birth.
Research Questions: What is the probability of having a biography in different Wikipedia language editions? Does country of birth affect this probability? If so, which birth countries are most overrepresented on Wikipedia? Do different Wikipedia editio
Pavisuelos Granada has established itself as a leading company in technical and decorative paving in the Andalusian region. Its work specializes in applying solutions based on stamped and polished concrete, adapting each project to the specific strength and design requirements of the client. #Construcciónyreformas #Parquet/Suelosdemadera
🚨 New feature alert!
You can now download BOLD data packages in Parquet format, making it easier to work with large datasets in R and for better interoperability with downstream analytics.
Visit boldsystems.org/data/data-packages to get started.
#Parquet #Interoperability #BOLDSystems
https://bsky.app/profile/boldsystems.bsky.social/post/3mi4s7zvfhr2k

Parquet gave data lakes a common language: columnar layout, good compression, and fast scans. That still works well for classic analytics. But workloads have changed. We now mix wide scans with point lookups, handle embeddings and images, and run on S3-first stacks. On NVMe you want lots of tiny random reads. On S3 you want fewer, larger range requests. A format tuned for one world can feel chatty or slow in the other.
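To make the NVMe-versus-S3 point concrete, here is a minimal sketch of range coalescing: merging nearby byte ranges into fewer, larger requests. The function name `coalesce_ranges`, the `max_gap` parameter, and the byte offsets are all hypothetical, chosen for illustration only; real readers use their own heuristics and thresholds.

```python
from typing import List, Tuple

# A byte range is (offset, length). All values below are made up.
Range = Tuple[int, int]


def coalesce_ranges(ranges: List[Range], max_gap: int) -> List[Range]:
    """Merge ranges whose gap to the previous range is at most max_gap bytes.

    A small max_gap keeps many small reads (cheap on NVMe); a large
    max_gap trades some wasted bytes for fewer requests (cheaper on S3).
    """
    merged: List[List[int]] = []  # each entry is [start, end)
    for start, length in sorted(ranges):
        end = start + length
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


# Hypothetical column-chunk locations inside one file.
column_chunks = [(0, 100), (150, 50), (10_000, 200), (10_250, 100)]

nvme_plan = coalesce_ranges(column_chunks, max_gap=0)    # keep reads separate
s3_plan = coalesce_ranges(column_chunks, max_gap=1024)   # merge nearby reads

print(len(nvme_plan), "requests on the NVMe-style plan")
print(len(s3_plan), "requests on the S3-style plan")
```

With these made-up offsets, the zero-gap plan keeps four separate reads, while the 1 KiB gap plan issues two larger requests and fetches 100 bytes nobody asked for.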
Guide: How to work with the PARQUET format
Last year we started publishing data in the «Если быть точным» ("To Be Precise") catalog in Parquet format. It was created by engineers at Twitter and Cloudera in 2013, and today it has become the standard for storing analytical data: it is used by Google, Amazon, Netflix, and most modern data platforms. In this guide, we explain how to work efficiently with Parquet data using Python.