jauntywundrkind 13 hours ago

Nice having this backstory (fantastic production value too, impressive start to this podcast). Disaggregating the responsibilities of the DB into multiple pieces just feels so logical, and helps make sure each piece can scale. Deterministic Simulation Testing gets mentioned in the video & was way ahead of its time here. https://apple.github.io/foundationdb/testing.html

Hacker News is here too! From July 2012 (78 points, 72 comments): https://news.ycombinator.com/item?id=4294719

For a general introduction, I enjoyed the recent submission How FoundationDB works and why it works: https://news.ycombinator.com/item?id=37552085 https://uvdn7.github.io/notes-on-the-foundationdb-paper/

romanhn 10 hours ago

Posted about this in the past, but what really got FoundationDB on my radar was a demo at a developer conference, back in 2014-ish. They had the database running across a bunch of machines, with a visual showing their health and data distribution. One team member would then be turning machines on and off (or maybe unplugging them from the network) and you could see FDB effortlessly rebalancing the data across the available nodes. It was a very striking, impressive presentation (especially as we were dealing with the challenges of distributed Cassandra at the time).

The beginning of this video has some of that: https://youtu.be/Nrb3LN7X1Pg

  • AtlasBarfed 8 hours ago

    So ... Cassandra does that? I get the FDB demo probably made it look better and easier.

    But data doesn't teleport except in demos. Rebalancing means streaming data across a network, consuming total network I/O, regardless of the distributed database.

    Did you actually implement FDB, and was it better?

    • jwr an hour ago

      Comparing Cassandra to FoundationDB is like comparing a spreadsheet in Google Sheets to PostgreSQL.

      I mean, both kind of store data, and multiple users can change the data that is being stored. The story of what you'll get back and when (if ever), however, is rather different.

      I would respectfully suggest that anyone that wants to comment in distributed database discussions should be familiar with https://jepsen.io/consistency/models and https://antithesis.com/resources/reliability_glossary/ and use the wording found there.

      If your eyes glaze over, because there is a lot of complex stuff there, it is likely that your comments will not have much value.

    • rapsey 6 hours ago

      Cassandra only got reliable many years later. FDB was the gold standard that set the bar. They did not need Jepsen tests to implement it properly.

vlovich123 11 hours ago

What a great story, and it took real courage to double down on improving the testing even after a critical flaw surfaced that testing should have found. I wish they had lasted long enough for Snowflake to keep them alive, but then we wouldn't have Antithesis as a service, so there's a silver lining.

  • tptacek 10 hours ago

    By "keep them alive", you mean the team, right? People are definitely still using FDB!

    • vlovich123 9 hours ago

      The team pushing forward the vision. Using FDB is a fraction of the vision if you listen to them.

      • tptacek 7 hours ago

        Makes sense, thanks! Antithesis is pretty neat, though (we use it for a distributed system thing here).

pjmlp 3 hours ago

Nowadays being rewritten into Swift.

"Swift as C++ Successor in FoundationDB" by Konrad Malawski (Strange Loop 2023)

https://www.youtube.com/watch?v=ZQc9-seU-5k

  • jen20 3 hours ago

    That was an experiment the team didn’t end up committing to - it’s been backed out. That said, it was a fascinating dive into the flexibility of Swift, and Konrad’s talk is excellent and worth watching.

    https://github.com/apple/foundationdb/commit/e52fc3621fd5e41...

    • pjmlp 3 hours ago

      Interesting, thanks for sharing, is there a rationale somewhere?

      As someone that enjoys using C++ despite all its warts, I can imagine a few reasons, but it would nonetheless be an interesting read, in case that is public.

      I guess that experience might also have had an impact on ongoing Swift 6+ features.

msy 9 hours ago

Does anyone know how widely FoundationDB is now being used at Apple? I know they run a huge Cassandra cluster, does this serve a different use case?

  • ethan_smith 2 hours ago

    Apple uses FoundationDB extensively for iCloud services including CloudKit, with public documentation confirming it handles billions of operations per second across their infrastructure.

  • ntqz 4 hours ago

    My understanding is that CloudKit runs on it.

gregoriol 2 hours ago

So if it has been acquired by Apple, it's a failure, isn't it? Most things acquired by Apple get unmaintained or change completely, or disappear. Being "open-source" here doesn't bring any guarantees to any third-party user about maintenance or long-term life. It should be a serious no-go indicator for anyone willing to build something with it.

Nican 10 hours ago

FoundationDB has been growing on me as my favorite database lately, even though it is only a key-value store.

Out of curiosity: what are the scale limits of FoundationDB? What kind of issues would it start to have? For example, would it be able to store all of Discord's messages?

I see blog posts of Discord moving to Scylla and ElasticSearch, but I wonder if there would be any difficulties here.

piokoch 3 hours ago

I've looked at FoundationDB, and on paper it looks great. But it never gained momentum like, say, MongoDB. Is this just a matter of hype, or is it not as great as advertised?

  • jwr an hour ago

    It is difficult to use by itself: the "foundation" in the name describes it quite well. It is a foundation that you build a database on. It fits my use case very well, for example, because I know my data model and usage patterns very well and I can integrate deeply with the database, but it's not a good match by itself for quick-and-dirty apps.

    It provides fantastic (strict serializable) consistency guarantees in a distributed database, which is extremely rare. It is a huge advantage, but sadly most people do not understand how badly most distributed databases are broken and don't even understand the concepts (https://antithesis.com/resources/reliability_glossary/) well enough to talk about the issues involved. See every discussion where someone mentions ACID.

    It's hard to compete for mindshare when the concepts are difficult and every other database has a warm-and-fuzzy-feeling website saying that everything will be great (it usually won't).

    Personally, I hope more people will start using it, and I hope to see more easy-to-use databases built on top of it (that's what it was designed for, really). In my experience with it, working with a fast distributed database that gives you strict serializable semantics right in your code is fantastic.

  • chrischen 3 hours ago

    I think it wasn't as easy to use or get started with. There was a MongoDB compatibility layer but it wasn't maintained.

  • qcnguy 2 hours ago

    Many reasons.

    FoundationDB started development in the same year MongoDB launched but took nearly four years to reach the market. It's the rarely discussed dark side of great testing - you can end up with robust code nobody cares about because it arrives years after people decided they wanted it. Everyone went with what existed and learned to deal with its quirks. In this case they got lucky I guess that Apple saw the potential for iCloud and bought them out, but the people who had bet on FDB before then kinda lost. You really don't want your database to be bought and made fully private tech. MongoDB was open source at the start and went closed later but never disappeared, so whilst the license switch pissed people off it didn't fundamentally wreck MongoDB as a viable tech.

    Database tech has a chicken and egg problem. Most people don't want to run their own infrastructure anymore. No clouds offer hosted FoundationDB, so people don't want to use it for that reason, which means there's no demand, so clouds don't offer it, ad infinitum. MongoDB was released around the start of the cloud era, just three years after AWS first launched, so that was less of an issue. Back then "cloud" just meant VMs and storage. And later Mongo built their own cloud offering.

    FoundationDB does full strict serializability checks, which is expensive. One trick it uses to get acceptable performance is imposing a difficult programming model on the user. Keys and values must be small. Think individual fields of a JSON object, not objects themselves. Transactions also have very small limits in lifespan and size. You can't open a transaction and run a computation against your entire dataset in FoundationDB unless it's tiny. Everything has to complete in five seconds or else your transaction dies.
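
    To make the "individual fields, not objects" point concrete, here is a minimal sketch of flattening a JSON-like object into one small key/value pair per leaf field. It does not use the real FoundationDB client: a plain dict stands in for the store, tuple keys stand in for encoded keys, and the names `flatten` and `read_prefix` are made up for illustration.

```python
import json

def flatten(prefix, obj):
    """Yield (key_tuple, encoded_value) pairs, one per leaf field of obj."""
    if isinstance(obj, dict):
        for name, child in obj.items():
            yield from flatten(prefix + (name,), child)
    elif isinstance(obj, list):
        for i, child in enumerate(obj):
            yield from flatten(prefix + (i,), child)
    else:
        # Leaf: store a small encoded scalar instead of the whole object.
        yield prefix, json.dumps(obj).encode("utf-8")

def read_prefix(store, prefix):
    """Return every pair whose key starts with prefix (a stand-in for a range read)."""
    n = len(prefix)
    return {k: v for k, v in store.items() if k[:n] == prefix}

user = {"name": "Ada", "address": {"city": "London", "zip": "N1"}, "tags": ["a", "b"]}

# A plain dict stands in for the key-value store (assumption, not the FDB API).
store = dict(flatten(("user", 42), user))
address = read_prefix(store, ("user", 42, "address"))
```

    Each field can then be read or updated on its own, and a whole sub-object comes back as a prefix scan rather than as one large value.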

    Their website used to claim this timeout isn't even configurable; it's hard to know if it changed because the FoundationDB team at Apple don't care about marketing. Probably Apple don't care if anyone else uses it and only made it open source to make the team happy. Even quite average open source projects have better marketing. Their blog consists only of release announcements and the last one was in 2022. A casual visitor who didn't know better would think it had been abandoned years ago.

    The scalability story is unclear. It doesn't matter for most people but the biggest FDB clusters are about 100T in size. Apple say they use it for iCloud but really they use a large fleet of FDB clusters with lots of in-house tooling for balancing and moving data between those clusters. Effectively they built another scaling layer on top of core FDB.

    Even if you work through all of that, what you get is a key value store. Not really a database, it's more like the bottom layer of a database. That's why it's called FoundationDB. It's not meant to be used directly. There are layers that turn your actual data into key/value pairs in a way that offers features like schema handling, object serialization etc but they are language specific and not so well documented. Most devs on the backend will have ORMs or frameworks they already want to use, and Apple server-side is mostly a Java shop so there's a Java layer, but you can't just point Spring at an FDB cluster and go. For instance, there's no notion of a query, or a query planner or even indexes. You're expected to handle all that stuff using libraries in your app.
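
    As a rough illustration of "you handle indexes yourself", here is a sketch of a secondary index kept as extra keys next to the record. It is not the FoundationDB API: a dict stands in for the store, and `put_user` and `find_by_email` are hypothetical helpers. In a real layer the record write and the index write would presumably go in the same transaction so they cannot drift apart.

```python
def put_user(store, user_id, email):
    """Write a user's email and keep the email -> user_id index in step."""
    old = store.get(("user", user_id, "email"))
    if old is not None:
        # Remove the stale index entry before writing the new one.
        store.pop(("idx", "email", old, user_id), None)
    store[("user", user_id, "email")] = email
    # The index entry carries all its information in the key; the value is empty.
    store[("idx", "email", email, user_id)] = b""

def find_by_email(store, email):
    """Look up user ids by scanning the index prefix (stand-in for a range read)."""
    prefix = ("idx", "email", email)
    return sorted(k[3] for k in store if k[:3] == prefix)

store = {}  # plain dict standing in for the key-value store (assumption)
put_user(store, 1, "a@example.com")
put_user(store, 2, "a@example.com")
put_user(store, 1, "b@example.com")  # user 1 changes email

shared = find_by_email(store, "a@example.com")
moved = find_by_email(store, "b@example.com")
```

    The "query" here is just a prefix scan over keys the application chose to write, which is the work a query planner and index manager would normally do for you.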

    So overall it's a highly solid bit of tech that solved a very small, very specific problem very well but years too late for anyone to care. Except for Apple. Good work, whichever Apple executive sponsored that deal!

philosopher1234 10 hours ago

Does anyone know of cool things built with fdb? I’ve been aware of it for a while and it seems very cool but I haven’t seen a lot of details about how folks are using it.

  • jwr an hour ago

    I am moving my SaaS from RethinkDB to FoundationDB. It's a long-term project that needs to be done very carefully (thousands of people using the app), but the rewards are significant. Thanks to FoundationDB versionstamps, I'll be able to replace changefeeds with polling, simplifying the system, and also make things much faster along the way.
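
    A rough sketch of the polling idea, under loud assumptions: a local counter stands in for versionstamps (which the cluster assigns at commit time), and `ChangeLog` is a made-up in-memory class, not anything from the FoundationDB client. The point is only that if every change is stored under a monotonically increasing version, a consumer can remember the last version it saw and ask for everything after it.

```python
import bisect

class ChangeLog:
    """In-memory stand-in for a change log keyed by an increasing version."""

    def __init__(self):
        self._versions = []   # sorted, strictly increasing
        self._payloads = []   # payloads aligned with _versions
        self._next = 0        # local counter standing in for a versionstamp

    def append(self, payload):
        """Record a change and return the version it was stored under."""
        self._next += 1
        self._versions.append(self._next)
        self._payloads.append(payload)
        return self._next

    def poll(self, after):
        """Return all (version, payload) entries with version > after."""
        i = bisect.bisect_right(self._versions, after)
        return list(zip(self._versions[i:], self._payloads[i:]))

log = ChangeLog()
log.append("created order 1")
log.append("updated order 1")

first = log.poll(0)        # consumer starts from the beginning
cursor = first[-1][0]      # remember the last version seen

log.append("created order 2")
second = log.poll(cursor)  # only changes after the cursor
```

    The consumer holds nothing but a cursor, so there is no subscription to maintain between polls.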

    The consistency guarantees are phenomenal and writing software is much easier when you have strict serializability. Most people do not appreciate this because they do not understand the anomalies that you can get without strict serializable consistency.

  • mannyv 24 minutes ago

    From what I understand one of the big IP ad tracking services (El Toro) is built on FoundationDB.

majestik 10 hours ago

I can't put my finger on it but there's a weird tension between the two Daves in this video. Almost like Rosenthal is trying to impress or earn the praise of Scherer.

Is there a backstory between these guys / FDB?

  • Dave_Rosenthal 7 hours ago

    Ha, well I met Scherer ~30 years ago in a high school math class and we’ve done three companies together, so you could say we’ve known each other for a bit :)