so, here's a weird request:

I'm really interested in how data is stored on disk for massive databases like MariaDB, PostgreSQL, Redis, etc. unfortunately, nearly all the resources I can find on designing databases are about using these existing tools, rather than creating one of your own.

essentially, I'm looking for resources to help understand stuff like:

  • what kinds of data structures are most applicable for storing data on a disk, and when should you separate out data into different files?
  • how do you optimise databases to work well on different filesystems?
  • how do you make the decision on when to keep data in memory versus store it on disk? what kind of caching strategies are best?

if you happen to know anywhere that would be good to look, or know someone who might be able to spare some time pointing out what things might be useful to study, let me know!

    @clarfonthey dug around and found this for you: https://wiki.postgresql.org/wiki/Developer_FAQ#What_books_are_good_for_developers.3F

    this seems like it might help if you can get those books

    @KitRedgrave @clarfonthey it has been some time, but I took a database systems course that touched on storage management and partitioning data on disk, and iirc the book we used was "Database Systems: The Complete Book" (2nd edition). I may have a copy at my house; if I do, I'll peek and see how much it covers.

    @clarfonthey So it turns out databases and filesystems share a lot of concepts. This book about the BeOS filesystem is quite old, but very accessible, and gives a nice intro to how filesystems work. Concepts like journaling, logging, B-trees, etc. are similar between DBs and FSes, and knowing FS concepts will help answer some of your questions about optimization: http://static.stevereads.com/papers_to_read/practical_file_system_design_with_the_be_file_system.pdf