• by zaptheimpaler on 11/26/2020, 4:57:48 AM

    > But this is not the case! Some of these parts have never been the case - MongoDB has never been "eventually consistent" - updates are streamed and applied sequentially to secondary nodes, so although they might be behind, they're never inconsistent.

    This is what eventually consistent means! If I wrote (key=X, value=Y) to a primary, then read X and see a value that's not Y (because the secondary node hasn't caught up yet), that is inconsistency. Strong consistency (e.g. in a single-node SQL database) would mean it's impossible to read stale values.
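
    A rough pymongo sketch of that stale read, assuming a hypothetical replica set on localhost and invented database/collection names; the point is only that a read routed to a secondary can lag the write to the primary:

      from pymongo import MongoClient, ReadPreference

      # Hypothetical replica-set connection; host, database, and collection names are made up.
      client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
      primary_kv = client.demo.kv
      secondary_kv = client.demo.get_collection(
          "kv", read_preference=ReadPreference.SECONDARY)

      # Write X=Y, acknowledged by the primary.
      primary_kv.update_one({"_id": "X"}, {"$set": {"value": "Y"}}, upsert=True)

      # Immediately read X from a secondary: this can return the old value (or nothing)
      # if replication hasn't caught up yet -- exactly the stale read described above.
      print(secondary_kv.find_one({"_id": "X"}))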

  • by im_down_w_otp on 11/26/2020, 5:06:56 AM

    That is kind of a strange title for a post hosted on the corporate blog itself.

    1) If that's actually true, then both Mongo's marketing department and its technical writing staff have been doing a really, REALLY poor job for years.

    2) "You don't know nuthin', dummy!" is not a great lead-in to correcting people's misconceptions.

    It also appears to be the case that what I knew about MongoDB was actually true all along, and most of these "myths" are just Mongo trying to redefine the terms or problems to be something they're not. :-/

  • by jiggawatts on 11/26/2020, 4:57:33 AM

    Interesting that they referenced the Jepsen report, even though it was a damning indictment of MongoDB's unreliability... and then Mongo cited that report on their website as a positive, despite its negative findings. That was a bit of a scandal not that long ago, so it's odd to see them still talking about it.

    It's like those supplements that say "scientifically tested!", which is a true statement even if the scientific test found that the supplement is ineffective and does nothing.

  • by cookiengineer on 11/26/2020, 6:04:02 AM

    The title should really be "Everything MongoDB knows about databases is wrong".

    They're trying to establish a narrative around their product by redefining terminology that has been accepted and well defined in the computer science community for decades.

    Eventual consistency doesn't need to be redefined.

    If the general public assumes a database to have relational features, maybe it's time to rebrand MongoDB as MongoStore or something?

    If MongoDB is trying to be the jack-of-all-trades of databases, the market they're trying to serve is far too broad. MongoDB isn't made for ERM-based scenarios; maybe stop trying to push it into them. If people had wanted that in the first place, they would likely have chosen ArangoDB or an SQL-based database anyway.

    To be blunt, I don't understand why they are seemingly so irrational as to keep pushing this narrative.

    Maybe they've seen that most of their customers stop using their product after a while?

    If so, then it might be time to stop promising things you conceptually cannot and should not want to deliver.

    Also, dear MongoDB folks: as a product, MongoDB is definitely not web scale. For decades devs have used memcached in front of SQL-based databases, and that scaled beyond imagination, and beyond what any non-DDoS testing environment can reflect.
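
    For what it's worth, that memcached-plus-SQL pattern (cache-aside) is only a few lines. A sketch in Python, with an invented users table, a local SQLite file standing in for the SQL database, and a memcached assumed on localhost:

      import json
      import sqlite3
      from pymemcache.client.base import Client  # assumes a memcached instance on localhost:11211

      cache = Client(("localhost", 11211))
      db = sqlite3.connect("app.db")  # hypothetical SQL database containing a users table

      def get_user(user_id):
          """Cache-aside: try memcached first, fall back to SQL, then populate the cache."""
          key = f"user:{user_id}"
          cached = cache.get(key)
          if cached is not None:
              return json.loads(cached)

          row = db.execute("SELECT id, name FROM users WHERE id = ?", (user_id,)).fetchone()
          if row is None:
              return None

          user = {"id": row[0], "name": row[1]}
          cache.set(key, json.dumps(user), expire=60)  # short TTL keeps staleness bounded
          return user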

    If your product isn't web scale because you don't want to be held responsible for the adapters that your own docs refer to, maybe it's time to take responsibility and introduce a QA step with a minimum performance standard that every adapter has to meet before it gets recommended?

    Regarding the Debian package version: I'm personally going to stop here, because I hate people complaining about shit, not taking responsibility, and expecting others to do work for free. If Debian's policies annoy you, host your own damn PPA. Or sponsor them to help them work on it. But this... this is not okay.

  • by serbrech on 11/26/2020, 5:33:22 AM

    > “There remain some differences of opinion with Jepsen about MongoDB's defaults, but if you read the manual, you can rely on MongoDB to store and maintain your data consistently and robustly.”

    My personal opinion is that a database should not lose data _by default_.
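
    The "read the manual" part mostly means opting into stronger write acknowledgement yourself. A hedged pymongo sketch, with invented names and a hypothetical replica set, of requesting majority-acknowledged, journaled writes rather than relying on whatever the defaults happen to be:

      from pymongo import MongoClient, WriteConcern

      client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")  # hypothetical replica set

      # Require acknowledgement from a majority of replica-set members plus a journal
      # flush before a write is reported as successful; time out instead of waiting forever.
      orders = client.shop.get_collection(
          "orders", write_concern=WriteConcern(w="majority", j=True, wtimeout=5000))

      orders.insert_one({"order_id": 42, "status": "paid"})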

  • by exabrial on 11/26/2020, 5:16:30 AM

    If you want to tell your boss you're storing the data, but plan to quit before anyone actually tries to use said data (or before the seed funding runs out), MongoDB is the perfect choice.

    The nice thing is, there is an API compatible version of mongodb distributed with most Unix systems, installed by default in /dev/null. I applaud the authors of mongodb for achieving this level of market penetration in such a short time, defying years of experience from the real world.

  • by amir734jj on 11/26/2020, 5:04:54 AM

    I used to work in the managed-investment department of a large investment company in the Midwest. We used MongoDB as the single source of truth for daily market analysis data, with C# and .NET Core. I loved working with Mongo, and although it got slow over time, it was enjoyable. One thing I really hated was the C# driver. I just hated it. It fails to translate complex LINQ expressions into Mongo queries, and its BSON implementation is extremely slow. We literally rewrote the data access layer in Python, wrapped it as an API, and got a 10x speed boost. Eventually we replaced it all with Postgres, utilized its JSON columns, and retired MongoDB. Good memories learning its query syntax and investigating where the slowness was coming from.
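
    For anyone wondering what "Postgres with JSON columns" looks like in practice, here's a minimal, hypothetical sketch (table, fields, and connection string are invented) of storing documents in a jsonb column and querying into them from Python:

      import psycopg2  # assumes a reachable Postgres instance

      conn = psycopg2.connect("dbname=analysis")  # hypothetical database
      cur = conn.cursor()

      # Store the former Mongo documents in a jsonb column.
      cur.execute("""
          CREATE TABLE IF NOT EXISTS market_data (
              id  serial PRIMARY KEY,
              doc jsonb NOT NULL
          )
      """)
      # A GIN index makes containment queries (@>) on the documents fast.
      cur.execute("CREATE INDEX IF NOT EXISTS market_data_doc_idx ON market_data USING gin (doc)")

      cur.execute(
          "SELECT doc->>'ticker', doc->'close' FROM market_data WHERE doc @> %s::jsonb",
          ('{"ticker": "ACME"}',))
      print(cur.fetchall())
      conn.commit()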

  • by fatbird on 11/26/2020, 5:38:33 AM

    > You can do joins with queries that we call aggregation pipelines. They're super-powerful,

    And hot garbage for performance. Unlike a relational join over indexed foreign keys, Mongo simply... does a lookup for each doc in the pipeline step. Indexing the looked-up collection does nothing extra in aggregation; you're just doing a repetitive manual join. A simple aggregation query I wrote, which added a value from a second collection by comparing its timestamp to the original document's timestamp, took at least an order of magnitude more time than the same query without the lookup.

    Aggregations on a single collection are very performant. But never try to look up or join anything.
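
    For context, the kind of pipeline being described uses a $lookup stage. A rough pymongo sketch (collection and field names are hypothetical) of the "pull in documents from a second collection" pattern whose performance is being criticized here:

      from pymongo import MongoClient

      client = MongoClient()  # hypothetical local instance
      db = client.metrics     # invented database/collection names

      # For every document in "events", pull in matching documents from "readings".
      # Conceptually this performs one lookup per input document, which is the
      # behaviour measured above as an order of magnitude slower than the
      # single-collection pipeline.
      pipeline = [
          {"$match": {"type": "sensor"}},
          {"$lookup": {
              "from": "readings",
              "localField": "sensor_id",
              "foreignField": "sensor_id",
              "as": "readings",
          }},
          {"$unwind": "$readings"},
          {"$project": {"sensor_id": 1, "ts": 1, "readings.value": 1}},
      ]

      for doc in db.events.aggregate(pipeline):
          print(doc)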

    > Scaling data is mostly about RAM, so if you can, buy more RAM. If CPU is your bottleneck, upgrade your CPU, or buy a bigger disk, if that's your issue.

    Never listen to this person about anything.

    The Mongo deployment I'm dealing with now scales by database. Each entity has 20-100 GB of data in its own database, and we're adding entities continually. If I try to replicate for performance, I'll be replicating everything--there's no selective replication. If I shard, I'll be sharding within a collection, which is the equivalent of striped RAID--great if that's what you need. I don't. I need to shard at the database layer. I need my queries routed according to the database they're aimed at, not by a shard key. Can I? Not a chance in hell with any of the existing scaling mechanisms from Mongo. My current Mongo VM is already the largest Azure offers. How do I add more RAM to that?
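
    A purely illustrative workaround (cluster URIs and database names invented): do that routing at the application level, keeping one client per cluster and picking it by database name. It works, but it's exactly the bookkeeping the built-in scaling mechanisms won't do for you:

      from pymongo import MongoClient

      # Hypothetical application-level routing: independent clusters, each hosting a
      # subset of the per-entity databases, chosen by database name rather than shard key.
      CLUSTERS = {
          "cluster-a": MongoClient("mongodb://cluster-a.example.net:27017"),
          "cluster-b": MongoClient("mongodb://cluster-b.example.net:27017"),
      }

      # Which entity database lives on which cluster (maintained by the application).
      PLACEMENT = {
          "entity_0001": "cluster-a",
          "entity_0002": "cluster-b",
      }

      def collection_for(entity_db, name):
          """Route a query to whichever cluster hosts this entity's database."""
          client = CLUSTERS[PLACEMENT[entity_db]]
          return client[entity_db][name]

      docs = collection_for("entity_0001", "events").find({"status": "open"}).limit(5)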

  • by gkoberger on 11/26/2020, 5:20:45 AM

    That's... not the kinda blog post you want on your own blog.

    "We're really bad at explaining Mongo and nobody knows how to use it."

  • by sairamkunala on 11/26/2020, 4:59:54 AM

    I will leave this here for anyone who wants to validate the original article: http://jepsen.io/analyses/mongodb-4.2.6

  • by seibelj on 11/26/2020, 5:07:06 AM

    Back in ~2012 I interviewed at a company that had chosen Mongo as its database, and my whiteboarding question was to implement joins in NoSQL, as that was a problem they had recently been solving. After working through it I asked, innocently, “but all your data is relational - why use Mongo at all?” And the CTO went red in the face and exploded, actually yelling at me about Mongo’s many benefits (just not, ya know, anything a normal database provides). Needless to say I didn’t get the job!

  • by hermanradtke on 11/26/2020, 4:52:44 AM

    The only thing I need to know is not to use it.

  • by whoisjuan on 11/26/2020, 5:52:51 AM

    > MongoDB is an ACID database. It supports atomicity, consistency, isolation, and durability.

    Yeah. If you ignore the fact that it took them four major versions and nine years to be able to make that claim.

    ACID by compliance, definitely not ACID by design, and hence not trustworthy for transactions. Even this article says that MongoDB shouldn’t be used for transactions.
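
    For reference, the multi-document transactions in question look roughly like this in pymongo (a sketch with invented collection names; assumes a replica set and MongoDB 4.0 or later):

      from pymongo import MongoClient, WriteConcern
      from pymongo.read_concern import ReadConcern

      client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")  # hypothetical replica set
      db = client.bank  # invented database/collection names

      def transfer(session, amount):
          # Both updates commit together or not at all when run inside with_transaction.
          db.accounts.update_one({"_id": "alice"}, {"$inc": {"balance": -amount}}, session=session)
          db.accounts.update_one({"_id": "bob"}, {"$inc": {"balance": amount}}, session=session)

      with client.start_session() as session:
          session.with_transaction(
              lambda s: transfer(s, 100),
              read_concern=ReadConcern("snapshot"),
              write_concern=WriteConcern(w="majority"),
          )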

    MongoDB seems great for when you need to capture large streams of data in a create-once/read-forever kind of way. That's probably why it's popular among large enterprises.

  • by mulmen on 11/26/2020, 5:48:38 AM

    There’s no question mark, but Betteridge’s law of headlines still applies.

  • by pood on 11/26/2020, 5:31:27 AM

    Having just (TODAY) finished moving a massive MongoDB database to Postgres (w/ jsonb), I think I can say everything I know about MongoDB is correct: it usually is a pain in the ass (particularly for larger databases).

    That said, Mongo (Inc.?) has a pretty good marketing team.