The Beyond LAMP: Scaling Websites Past MySQL session on 3/14 at 9:30 was one of the most popular at SXSW, with panelists from Twitter, Imgur, Facebook, and Reddit. Notes on this session: Tweets, William Hertling, Infochimps, Sara Davies.
That evening Infochimps sponsored a Data Cluster Meetup featuring a NON-RELATIONAL DATABASE SMACKDOWN between Cassandra core team committer Stu Hood (Rackspace), CouchDB core team committer Jan Lehnardt (Apache), and MongoDB evangelist Wynn Netherland (Orrka, TweetCongress). Tweets on the session; Postmortem by Toby Jungen. Wynn and Adam recorded it for thechangelog.com: Audio: NoSQL Smackdown. Below are my notes from it, which are still a work in progress.
Everyone’s claiming they are NoSQL, nobody knows what it means, it means many things, scala
We’re all document stores, right? 5:00 Data model. What do you think of huge documents? Define huge.
As big as the machine can fit.
Jan: Guess it differs in Mongo and Couch
Wynn: I believe MongoDB has 4G row limit. All these other competitors use JSON and are untyped. I don’t like that they are untyped. Because you can do massively interesting things with typed data. Might be sorted, you can slice little pieces out of it. If it grows large you might want just a piece of it out. Mongo uses the BSON spec so it’s pseudotyped at the filesystem level. It’s not just strings in the db, has Ints, other types, files etc.
Jan: CouchDB uses JSON, has bunch of data types, the nice thing about JSON is it’s the lowest common denominator of programming languages, can use it easily in any language with little code. JSON is really good for data that changes.
Heckler: Slow? Jan: There’s a compiled Python module that is actually fast, so shut up. (laughter)
Wynn: So, I believe we disagree on types, and that’s all right.
Jan: The web is not really typed. Most of the people who use the web are not computer scientists. It enables everyone to share data. Having to teach them about datatypes is an arcane artifact of programming, they should be able to just stuff whatever they have in a database. Everyone who is interested in writing apps should not be restricted to that computer science approach.
Stu: I hope people developing web apps are computer scientists! But maybe not, I dunno.
Jan: iPhone App Store has >100k apps on it. It’s a different magnitude of scale, if everyone participated on the open web. The amateurs really have no clue…
WV: That’s a load of crap. How many languages are actually typeless? (besides Perl?) Everyone developing actually has types, suddenly you go to the DB and all your types disappear.
Jan: That’s not true. JSON defines types, you can do that, but you don’t necessarily have to worry about them, you don’t have to go up front and define them.
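Jan's point that JSON carries types without any up-front declaration can be seen in a small Python sketch. The document contents are made up for illustration, and this uses only Python's standard `json` module, not any particular database.

```python
import json

# A made-up document; no schema is declared anywhere.
raw = (
    '{"name": "grandma", "age": 81, "score": 4.5, '
    '"active": true, "tags": ["photos"], "nick": null}'
)
doc = json.loads(raw)

# Each value still arrives with a concrete type after parsing.
types = {key: type(value).__name__ for key, value in doc.items()}
print(types)
```

The types are there, but nothing forced the writer to declare them first, which is roughly the distinction the two sides are arguing over.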
WV: You force everybody to rewrite all their programs with these things in mind. That’s a lot of work. There seem to be quite a lot of programs out there… It’s all about compatibility stuff.
Stu: Also, Hadoop is completely unstructured by default. Could do something similar with CouchDB. Dunno why I’m defending CouchDB! Type doesn’t always win.
WV: Traditional applications use a very different model, you have to rewrite your applications according to your model whether it’s type or consistency model, forces you to rethink, could be a good thing.
Wynn: Let’s talk about consistency really quickly. Cassandra has a peer to peer model from Werner’s brainchild Dynamo where any node can accept a write and then if enough nodes have accepted it, it succeeds, otherwise not, and at read time you resolve all that. Dunno how I feel about the Couch and Mongo models. Mongo hasn’t actually figured out that part, right?
That’s right, there is a 2 second delay, is that what you’re talking about? Wynn: No, Mongo is master-slave replication.
I must admit that I’m not a core committer for these projects – I’m a Mongo fanboy, just an end user.
So if you have a data center in Washington and one in California, you can do a write in one of them, and even if the other is down, depending on your tunables, you can still succeed that write, because no one of those nodes is actually responsible, no one node is dedicated to a particular key.
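The "enough nodes accepted it" rule and the "tunables" mentioned above can be sketched roughly as follows. This is a toy model under stated assumptions, not Cassandra's or Dynamo's actual API: `Replica`, `quorum_write`, `quorum_read`, and the `w`/`r` parameters are hypothetical names, and conflicts are resolved at read time by simply taking the newest timestamp.

```python
class Replica:
    """A toy storage node; `up` simulates whether it is reachable."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, timestamp, value):
        if not self.up:
            return False  # unreachable node gives no acknowledgement
        self.data[key] = (timestamp, value)
        return True


def quorum_write(replicas, key, timestamp, value, w):
    # Any reachable node accepts the write; it succeeds if at least w ack.
    # Note: nodes that accepted keep the value even if the quorum fails.
    acks = sum(rep.write(key, timestamp, value) for rep in replicas)
    return acks >= w


def quorum_read(replicas, key, r):
    # Simplification: ask every reachable node rather than just r of them.
    responders = [rep for rep in replicas if rep.up]
    if len(responders) < r:
        raise RuntimeError("read quorum not reached")
    versions = [rep.data[key] for rep in responders if key in rep.data]
    # Resolve at read time: the newest timestamp wins.
    return max(versions, key=lambda v: v[0])[1] if versions else None


cluster = [
    Replica("washington-1"),
    Replica("washington-2"),
    Replica("california-1", up=False),  # the other data center is down
]
ok = quorum_write(cluster, "cart:42", 1, ["book"], w=2)
strict = quorum_write(cluster, "cart:42", 2, ["book", "pen"], w=3)
value = quorum_read(cluster, "cart:42", r=2)
print(ok, strict, value)
```

With `w=2` the write succeeds despite the outage; with `w=3` it is reported as failed, which is the cost of asking for stricter guarantees.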
That’s an advantage of Cassandra, but in most applications that’s not needed.
WV: Actually the whole consistency model doesn’t come from what you want at the application level. It’s an artifact of implementation abstractions, leaking up. Either you do it for fault tolerance or for more concurrency so you can get better read throughput or better write throughput. For those two reasons you have to replicate, guarantee write to all replicas such that my reads are always consistent. Comes at huge cost. If you cannot get your quorum, you may have to fail your writes. May not be useful for some apps. These are things that are leaking off from the implementations, through the APIs. If everybody could get a choice, everyone would want strong consistency, ja? But strong consistency means you have to take a lot of other tradeoffs. Main one is not being able to get much write throughput, other is that there are a number of failure scenarios in which you’ll be dead in the water.
Are you saying Dynamo wasn’t user friendly? WV: No, absolutely not. No, actually, so, there’s a range of things. Dynamo predates, we weren’t the first. Consistency model is explicit. It’s not that we were the first to provide eventual consistency. In fact most RDBMS give you eventual consistency, you just don’t know it! If you use a conventional database there’s a delay when the logs are being shipped and if you read from the slave you do not get consistency. Ja, there is always a window. So why wasn’t Dynamo user friendly? Not only for the consistency level, but also you have to have the key which normally comes from somewhere else, there’s no way to do a list, to figure out what are my ??, you have to have the key, for example from customer database. So when we developed Dynamo it was to support shopping carts, that was one of the use cases, so it made you wade through the database, .. storage system, you really had a key, so that’s why SV is a user friendly key value storage system, Dynamo not so much. With SV you can do lists, you can do prefix lists on one of my keys and then find things out. That stuff is not in… what’s the name of that one? (laughter)
Heckler: Isn’t S3 built on Dynamo?
No comment! (confusion)
So the answer is no, because if you would be an engineer, you would know that if you have to do a list operator on top of this, that’s a completely different internal architecture. All of these systems … We have to get enormous scale. All these things consist of modules that are reused. It’s more the principles that matter, not the specific implementations.
Stu: I would say Cassandra is more user friendly because in that case we’re not using hashing to determine where a key lives. You can do those list operations, treat it like you would BigTable from Google, and get a list of all your keys. I imagine you can do that with the competitors but Cassandra’s implementation is... better. (laughter)
Jan: You guys focus on the big data problem. The massive scan on all the websites that have that problem, which are like 7? CouchDB is more like the personal DB that you can use for whatever you want to do. It doesn’t force you to think in these, to have these big thoughts, but lets you start small and grow gradually with whatever usage pattern you have. These guys are building Ferraris and dragsters, we (Couch) are building the Honda Accord of databases that everyone can use and get along with for a long long time.
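The difference between hashed placement and ordered keys that Stu and Werner are circling can be illustrated with a minimal sketch. It assumes a single in-memory store that keeps keys sorted; `OrderedKeyStore` and `list_prefix` are hypothetical names for illustration, not the API of Cassandra, BigTable, or any Amazon service.

```python
import bisect


class OrderedKeyStore:
    """Toy key-value store that keeps its keys in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while keeping order
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def list_prefix(self, prefix):
        # Keys sharing a prefix are contiguous in sorted order, so jump to
        # the first candidate and stop at the first key that does not match.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


store = OrderedKeyStore()
store.put("user:2:cart", ["pen"])
store.put("user:1:profile", {"name": "Ada"})
store.put("user:1:cart", ["book"])
listed = store.list_prefix("user:1:")
print(listed)
```

If keys were instead placed by hash, neighbouring keys would be scattered, and a prefix listing would mean scanning everything. That is why a pure hash-partitioned store needs you to already have the key.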
Absolutely, but there’s a reason why Couch rhymes with ouch! (laughter) Anyone who’s used Mongo coming from CouchDB, it’s like night and day in the ease of use getting set up, getting the servers installed, wrappers for your language of choice, and suddenly I don’t have to know what I’m gonna ask for up front. (Seinfeld reference) It reminds me of when Kramer is doing the moviethon, and he says Why don’t you just tell me the movies you want to watch? It’s the same thing where you have to materialize your views up front.
Jan: Do indexes magically appear with no performance hit?
Well indexes are one thing but users are completely different.
Realistically: Can get around if I have a low edge case
Anything between dynamic and Couch is full of water.
17:45 you should try Neo then
18:00 WV: Let me tell you why these guys suck. (laughter) You should not run your own database any more. That time is passed. These guys force you to run your own database, to manage replication, to manage all of that.
Jan: What do you do if your DSL provider craps out? You’re dead in the water with a great cloud no one can reach. WV: You go to a bar, get a few beers… Jan: And your customers leave you right and left while you’re offline.
19:00 WV: If you aggregate all these customers we have … You’re wasting your time. I love building this database stuff. I could build 10 more Dynamos, it’s really cool, but I’m not solving your customer’s problems, because I’m forcing them to have a lot of operational skills.
19:30 Jan: Part of what we’re doing is abstracting the database away. It’s just there, you can just use it. My mom should be able to run a CouchDB server without knowing it.
Don’t you all want to *be* one of the 7 biggest sites? So why not build for it?
WV: Actually I want to argue against it only being the 7 biggest sites. Big Data: That’s why we’re here. How many of you are not from the 7 biggest sites? Most of you. Everybody has petabyte datasets now!
Jan: Big providers like Apple, Facebook, they own all your data, all the URLs. People should be able to put their own data under the URLs they control. Privacy laws in Germany, you can’t…
WV: Yes you can! The Sept 1 new privacy law has a definition of a data processor. With SV you can use it as a data processor.
Jan: I’m thinking of a specific policy that you have to prove a user’s data was deleted on request, if it’s in the cloud you can’t do that.
WV: We comply with safe harbor rules. Data protection directorate of the EU has very explicit rules on what you have to do: have to be allowed to retrieve your data before moving it..
22:10 Werner, the rest of us agree against you in the sense that we’re all open source.
WV: You should be building better value for your customers, not better databases.
Jan: That’s what we do with the local databases. We give Salesforce as an example. You have a local version of Salesforce, if your connection is down..
Heckler: How often does Salesforce go down?
Jan: Oh, it does happen.
Wynn: Are you saying Amazon should let you download the whole database and shop locally?
I hear noSQL often. I see posts. Like Web 2.0 couple of years ago, not defined. How many people think it means big and scaling? How many think nonrelational schemas? We need to agree on terms so we can have these smackdowns.
Jan: Very fast login, looking at Memcache, Redis, Mongo; P2P replication => CouchDB; …..? => S3 and the stuff Amazon and others are doing….. if I have 100,000 servers I need to keep busy => Hadoop or Cassandra
Stu: I would like to point out that this is a Big Data meetup. SimpleDB has a 10 gig limit?
WV: You have to do your own partitioning. When I think about NoSQL: for any data storage, the default for an application or service was a relational database, because that was the only choice. What drove us to build other DBs was that if you look closer at what your processing is, you can decompose it; different steps have different requirements, and for each you can find a solution that is very fast and very reliable. It’s technology developed in the 80s that we’re expecting to meet 2000s requirements. If you dump all requirements in one bucket, it’s impossible to meet.
26:00 WV: Not that I think SimpleDB is .. it’s a whole bucket of solutions.
Stu: You say impossible, I say just not discovered yet.
WV: For example if you want to do internet everything, if you want inner transactions, multilevel views… if we had built infinite…
Stu: CAP theorem, we’ve all heard of it. None of us have transactions, so skip that. .. has transactions?
WV: .. Conditional, which are actually in line with eventual consistency. Under the covers, SQL DB is still an eventually consistent system. There’s just on top so you can use both.
Stu: I’d just like to point out we have Cassandra users with multiple terabytes per node. Twitter, Digg, Reddit, Facebook.
Jan: Couch supports that: BBC,
How many sites started supporting that scale?
If they had seen the future they would have started on Cassandra!
WV: Think about anybody who builds a Facebook game today. You can go from 0 to 25 million users in a month. Imagine all the logging, objects you have to keep around. You run the marketing campaign on the web, it’s not just. it’s social gaming…. terabytes of data quickly.
So let’s talk about scenarios. Can Couch or Mongo update… ? Can Cassandra update documents incrementally?
Stu: Cassandra can update incrementally. We have very large rows. People build indexes within a single row.
Updating a key in a hash?
WV: You guys are open source. So if you put out a release, do your customers have to take the database down?
Jan: No. CouchDB has a very robust storage model. Same file format, hasn’t changed for several versions. On top of that CouchDB is written in Erlang which lets you update version at runtime, live upgrades built in.
Stu: Cassandra is changing file format soon, you will have to restart the cluster. Never say never.
WV: How with 10000 nodes? Stu: Rolling restart. WV: How long does that take?
WV: You should not be worried about this stuff. This is so old fashioned, so 1990s.
Stu: I disagree. With Cassandra you can run a single node, you can get another node running easily. We have 45 node installs, Twitter running on 45, FB on 150. It’s easy enough to grow your cluster. It may be easier than using EC2!
Q: I’d like to bring it up a notch. I’m a developer, I write Erlang, but it’s your data. Replication means any copy of the data. None of these guys, we’re zigging, you’re zagging. I want to share photos of grandma, don’t want to ask Zuckerberg any favors.
WV: … key value just map addressable stuff, that’s the way to go.
Stu: Does Grandma know how to use curl? I assume you have to develop an app for her…
In terms of performance? We don’t even need to talk about it because Cassandra has you guys topped.
Jan: The properties it comes with is … Mongo.. tens of thousands of connections it supports. without falling over.
Wynn: I would argue Mongo is fast enough. It’s in C.
Jan: You don’t have a concurrency story. It doesn’t scale concurrently.
WV: How easy is it to hook up? caching all over the world. SV doing 35000 transactions per second?
Stu: Cassandra can do 25000 requests per second per node!
Q: Talk about transactions. Are transactions fixed?
Jan: He’s asking about transactions. Who needs transactions, raise your hands? So that’s your answer. (laughter)
35:00 WV: Transactions have nothing to do with relational databases. You get some ACID guarantees, unrelated to relational
Stu: Also noSQL is about using the right tool for the job. Build transactions with a tool like Zookeeper.
Ecosystems: Cassandra has a few big installs. Cloudkick, Twitter, Facebook
Couch: BBC, Canonical. Not as big as you can get but probably a few more than you have.
WV: Someone tweeted a remark that I have gone too far. You protect yourself on multiple different levels. .. There’s really techniques you can use to protect yourself from these kind of failures. Just fancy…
Stu: So how about wide area replication. People are geographically distributed. Cassandra supports natively.
Jan: Couch has multi master replication built in. Just have that.
Mongo: Believe master replication coming.
But it never works with your data model!
WV: Metadata will never leave EU
You get geographical.. reuation other things as well
Stu: BigTable recently had an outage, I think App Engine. I love that Google is very open about the cause, out of sync between data centers. Is Mongo planning to break that?
WV: I mean there are advantages on all sides. As always the CAP theorem is .. you get to do the tradeoffs. One of the exercises in Dynamo was we were giving the hands of the developers. Plus I think the .. innovation is. the choice do you want to consistency model…. can always write to …. it always works.
Stu: I don’t actually have a response for that. I guess it’s possible in CouchDB just because no node knows whether it’s responsible for something
Q: Why not just buy a huge machine and scale up not out? Who needs NoSQL?
Eventually that machine goes down, sticky situation, have to use patches to MySQL or have your ops team implement
Back to the big data vs schemaless question. If you compare Mongo to .. highly productive. Let’s face it, a lot of the data you use is not in house, you are consuming data from other places, JSON hashes. With NoSQL you can just stash the hash.
Jan: When I see people writing a Ruby or Java app with huge middle layer, huge waste of time. We just have .. and a jQuery guy. Having an HTTP based db means
WV: Sometimes existing software still needs relational databases. However there are a ton of applications where if you use ActiveRecord or any standard ORM, it requires MySQL! Developers don’t care. But as soon as you scale or reliability becomes an issue, it becomes .. have all these … that will kill you.
Jan: ActiveRecord is fine but about 25000 lines of Ruby code! CouchDB built with simplicity in mind. Our DB is smaller than their wrapper! Bloated middleware is boring, slow, just plain sucks.
You’re going to give a web designer a CouchDB database?
Q: Data models
There are names for what we do. In relational you want to normalize, in nonrelational you want to denormalize. Just really that simple. Duplicate, that’s what we say.
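The normalize/denormalize contrast above can be sketched in a few lines of Python. The users, posts, and field names are made up for illustration, and plain dicts stand in for tables and documents rather than any particular database.

```python
# Normalized: each fact is stored once and joined at read time.
users = {1: {"name": "Ada"}}
posts = [{"id": 10, "user_id": 1, "title": "Hello"}]


def post_with_author(post):
    # The "join": look up the author on every read.
    return {**post, "author": users[post["user_id"]]["name"]}


# Denormalized: the author name is duplicated into each post document,
# so a read needs no lookup at all.
post_docs = [{"id": 10, "user_id": 1, "author": "Ada", "title": "Hello"}]


def rename_user(user_id, new_name):
    # The price of duplication: a rename must touch every copy.
    users[user_id]["name"] = new_name
    for doc in post_docs:
        if doc["user_id"] == user_id:
            doc["author"] = new_name


joined = post_with_author(posts[0])
rename_user(1, "Grace")
print(joined["author"], post_docs[0]["author"])
```

Reads get cheaper and writes get more expensive, which is the trade the panel is describing when they say "duplicate".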
Closing question: If you couldn’t use your own product, what would you use?
Because all the other languages are boooooring! But does it scale?
Stu: Riak’s interesting but closed source. Voldemort doesn’t have ordered keys and I love ordered keys.
Wynn: If not Mongo, maybe Couch, depends on scenario, dynamic. Check out Redis or other systems that should be here. Hope I didn’t deter anyone from Mongo.
WV: One DB that’s left out – Neo4j is different from others. It’s stored as graphs. Take any social application, multiple relationships, multiple connections, Neo4j just rocks. But how do you partition? Wynn: It’s a CS problem.
Thank you, the Nonrelational Database Smackdown! Woo hoo!