Spring 2026 @CMUDB Seminar Series: PostgreSQL vs. The World → https://db.cs.cmu.edu/seminars/spring2026/

Starting Mon Feb 2nd @ 4:30pm EST over Zoom. We will alternate between a speaker from either a Postgres DBMS or a non-Postgres DBMS. Open to the public. Videos available on YouTube afterwards.

SCHEDULE
‣ Feb 2: Redpanda Oxla
‣ Feb 9: Amazon Aurora DSQL
‣ Feb 16: TopK
‣ Feb 23: Microsoft Azure HorizonDB
‣ Mar 9: turbopuffer
‣ Mar 16: YugabyteDB
‣ Mar 23: TonicDB
‣ Mar 30: PixelTable
‣ Apr 6: SpacetimeDB
‣ Apr 13: Supabase Multigres
‣ Apr 20: VillageSQL

Congratulations to the #1 ranked @CMUDB PhD student Wan Shen Lim for successfully passing his doctoral defense. Wan has been working on hard AF database research with me for the last *nine* years (undergrad+grad) at CMU. He also hates chickens.

Next Monday Sept 22nd is the start of @CMUDB's latest seminar series: Future Data Systems
We are hosting speakers from leading systems in the data lake / lakehouse space.

Mondays @ 4:30pm ET via Zoom. Open to the public. Videos posted to YouTube: https://db.cs.cmu.edu/seminars/fall2025/

🗓️ Schedule:
Sep 22: Apache Iceberg
Sep 29: Apache Hudi
Oct 06: MotherDuck DuckLake
Oct 13: SpiralDB's Vortex
Oct 27: SingleStore
Nov 03: Delta Lake
Nov 10: Mooncake
Nov 17: Firebolt
Nov 24: XTDB
Dec 01: Apache Polaris

Today is the start of the new semester for @CMUDB's Intro to Database Systems! We're going harder into material than ever before. Projects are more challenging but you can use LLMs to help. We also have 10min talks each Wed from leading DB companies. Follow from home/prison on YouTube: https://15445.courses.cs.cmu.edu/fall2025

Everything is available for free to non-CMU students:
• Lectures on YouTube: https://www.youtube.com/playlist?list=PLSE8ODhjZXjYMAgsGH-GtY5rJYZ6zjsd5
• Slides + Notes + Homeworks on course website.
• Project source code on GitHub: https://github.com/cmu-db/bustub
• Grading with Gradescope (see FAQ ➡️ https://15445.courses.cs.cmu.edu/fall2025/faq.html#q7)

Special thank you to our Affiliate companies for their support this academic year:
• ClickHouse
• DataStax
• dbt Labs
• Firebolt
• MotherDuck
• RelationalAI
• SingleStore
• SpiralDB
• PingCAP / TiDB
• Yellowbrick
• Yugabyte

Do you hate SQL and wish it would die & burn in hell? Or do you love SQL and wish it ran faster? If you answered 'yes' to either question then join our Spring 2025 @CMUDB Seminar Series: SQL or Death?
Mondays @ 4:30pm via Zoom.
Videos posted to YouTube: https://db.cs.cmu.edu/seminar2025/

Seminar Schedule:
Feb 10: Convex
Feb 17: The Germans (TUM)
Feb 24: Apache Pinot
Mar 03: Malloy
Mar 10: Google SQL Pipes
Mar 24: PRQL
Mar 31: StarRocks
Apr 07: Oxide OxQL
Apr 14: MariaDB
Apr 21: EdgeDB

VLDB'24 Paper #2: This is our next generation ML tuning algorithm for databases. Instead of tuning a single part of the DBMS (knobs, indexes) one-at-a-time, Proto-X tunes *everything* all at the same time! Tuning takes a little longer but achieves beyond human performance.
• Code: https://github.com/17zhangw/protox
• Paper: https://www.vldb.org/pvldb/vol17/p3373-zhang.pdf

Proto-X learns similarities between tuning options and exploits them. For example, INDEX (A,B,C) will have similar pros/cons as INDEX (A,C,B). Proto-X uses a transformer to encode a DBMS's config into an embedding and finds similar embeddings when exploring tuning choices.
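
To make the "similar indexes land close together" intuition concrete, here is a toy sketch in Python. It is not Proto-X's encoder: instead of a transformer it uses a hand-made vector (each column weighted by its position in the index) and cosine similarity. The function names and the weighting scheme are assumptions for illustration only.

```python
import math

def index_features(columns, all_columns):
    """Toy encoding (an assumption, not Proto-X's transformer):
    weight each indexed column by 1/(position+1), so leading columns count more."""
    vec = [0.0] * len(all_columns)
    for pos, col in enumerate(columns):
        vec[all_columns.index(col)] = 1.0 / (pos + 1)
    return vec

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

ALL_COLUMNS = ["A", "B", "C", "D"]
abc = index_features(["A", "B", "C"], ALL_COLUMNS)
acb = index_features(["A", "C", "B"], ALL_COLUMNS)
d_only = index_features(["D"], ALL_COLUMNS)

sim_reordered = cosine(abc, acb)   # same columns, different order: close to 1
sim_unrelated = cosine(abc, d_only)  # no shared columns: 0
```

With an encoding like this, a tuner that learns INDEX (A,B,C) helps can guess that INDEX (A,C,B) is worth a look without measuring it from scratch.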

Proto-X encodes the config and maps it to high-dim latent space. Then the actor/critic tuner algo selects the next config to try out, learns whether it helps, and refines the selection of the next config in the latent space.

We support tuning nearly everything in PostgreSQL:
• System/table/index/query knobs
• Indexes with types (btree, hash, brin) + INCLUDE (CREATE only, via HypoPG)
• Query hints (via pg_hint_plan)
We don't support destructive actions yet (DROP index, table partitioning).

VLDB'24 Paper #1: Collecting training data for ML models with DBs is $$$/slow. @capybara's Boot framework uses PostgreSQL extensions to cut off redundant queries.
• Code: https://github.com/lmwnshn/boot
• Paper: https://www.vldb.org/pvldb/vol17/p3680-lim.pdf

Macro-Accelerator: Skip entire queries and send back cached result if plan is similar to past queries. It identifies similar queries based on parameterized query plans. Happens automatically in Postgres without changing application code.
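
A rough Python sketch of the macro-accelerator idea, under loud assumptions: Boot works inside Postgres on parameterized query plans, while this toy runs outside the database and parameterizes the SQL text with regexes as a stand-in. `parameterize` and `MacroCache` are hypothetical names, not part of Boot.

```python
import re

def parameterize(sql):
    """Replace string and numeric literals with '?'.
    Toy stand-in for a parameterized query plan (assumption)."""
    sql = re.sub(r"'[^']*'", "?", sql)
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)
    return " ".join(sql.lower().split())

class MacroCache:
    """Skip a query entirely if a similar one has already been executed."""

    def __init__(self, execute):
        self.execute = execute  # callable that actually runs the SQL
        self.cache = {}
        self.skipped = 0

    def run(self, sql):
        key = parameterize(sql)
        if key in self.cache:
            self.skipped += 1
            return self.cache[key]  # result of the earlier, similar query
        result = self.execute(sql)
        self.cache[key] = result
        return result

# Fake executor so the sketch runs without a database.
executed = []
def fake_execute(sql):
    executed.append(sql)
    return f"result #{len(executed)}"

cache = MacroCache(fake_execute)
first = cache.run("SELECT * FROM orders WHERE price > 5")
second = cache.run("select * from orders where price > 42")
```

The second call never reaches the executor and gets the first query's result back. That is only a reasonable trade when the goal is training data rather than exact answers.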

Micro-Accelerator: Watch tuples moving between plan operators to identify redundant behavior. It hijacks Postgres' query cancellation feature to cut off portions of the query plan without killing the entire query. Also performs operator-level tuple sampling.
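
A toy Python illustration of cutting off one part of a plan while the rest of the query still finishes. Generators stand in for plan operators, and a fixed tuple budget stands in for Boot's redundancy detection, which the posts above do not spell out. All names and the budget rule are assumptions.

```python
from itertools import islice

stats = {"scanned": 0}

def scan(n):
    """Leaf operator: produces n rows and counts how many it emitted."""
    for i in range(n):
        stats["scanned"] += 1
        yield {"id": i, "val": i % 3}

def cutoff(child, budget):
    """Pass tuples through, but stop pulling from the child after `budget` rows.
    The fixed budget is a placeholder for a real redundancy check (assumption)."""
    yield from islice(child, budget)

def count_by_val(rows):
    """Parent operator: a simple group-by count that runs to completion."""
    counts = {}
    for row in rows:
        counts[row["val"]] = counts.get(row["val"], 0) + 1
    return counts

# The scan could produce a million rows, but the cutoff stops it after 300.
result = count_by_val(cutoff(scan(1_000_000), budget=300))
```

The aggregate still returns a result, it just sees a prefix of the scan instead of all of it.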

The results are stunning! Running DSB (scalefactor=10) on PostgreSQL v15 goes from 57hrs to 15min! Other workloads go from weeks to hours! Experiments show using the two accelerators together achieves the best results. Model accuracy degradation is negligible (~10%).
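
For scale, the DSB numbers above work out as follows (plain arithmetic on the two figures quoted, nothing else assumed):

```python
before_minutes = 57 * 60  # 57 hours
after_minutes = 15

speedup = before_minutes / after_minutes
print(f"{speedup:.0f}x faster")
```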

We are pleased to announce the @CMUDB Fall 2024 schedule for the "Database Building Blocks" seminar series! It will feature speakers from leading DBMSs built from open-source components: https://db.cs.cmu.edu/seminar2024/

Mondays @ 4:30pm ET via Zoom (open to public).
Videos posted to YouTube

Sep 23: Apache DataFusion
Sep 30: Apache DataFusion Comet
Oct 07: ParadeDB
Oct 21: VoltronData's Theseus
Oct 28: WhereTrueTech's Exon
Nov 04: Synnada
Nov 11: InfluxDB
Nov 18: GlareDB
Nov 25: GreptimeDB
Dec 02: Databend's OpenDAL

After three years of writing, our follow-up to the classic 2006 paper is finally out! In "What Goes Around Comes Around… And Around…", Stonebraker and I examine the last 20 years in databases and talk about why relational DBs are going to reign supreme.

https://db.cs.cmu.edu/papers/2024/whatgoesaround-sigmodrec2024.pdf

I'm sad to announce that @OtterTune is officially dead. Our service is shut down and we let everyone go today (1mo notice). I can't go into details of what happened but we got screwed over by a PE-backed Postgres company on an acquisition offer.

We saw huge improvements for customers through ML-based tuning. OtterTune worked better in the real world than in the lab. But we struggled with on-boarding and making the product sticky. There were also rumblings in the last year about whether using LLMs would be better for tuning...

On behalf of my co-founders Dana + Bohan, we thank our hardworking team over the last 4yrs. We also appreciate the guidance of our investors IntelCapital (Nick Washburn + Assaf Araki) & RaceCapital (Alfred Chuang). I look forward to working with them again.

https://ottertune.com

OtterTune R.I.P.

OtterTune was an automated database tuning service start-up out of Carnegie Mellon University. It is dead.