What’s the problem with that? In my previous team, we had a structure with four levels of nesting where we only ever needed to query the first two levels. At first we used Postgres with normalized tables, but it was just slow as hell. Switching to MongoDB actually made our performance issues vanish.
Of course it all depends on what kinds of queries you need to run, but I don’t think that large JSON documents are necessarily a problem.
They’re talking about relations between data. For example, when you delete a user, you may also want to delete their stored data.
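To make the delete example concrete, here is a minimal sketch of how a relational database can handle it for you. It uses Python's built-in sqlite3 module rather than Postgres or MySQL, and the table and column names (users, stored_data, payload) are made up for illustration.

```python
import sqlite3


def delete_user_cascades() -> int:
    """Delete a user and return how many of their stored rows are left."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        """CREATE TABLE stored_data (
               data_id INTEGER PRIMARY KEY,
               user_id INTEGER NOT NULL
                   REFERENCES users(user_id) ON DELETE CASCADE,
               payload TEXT
           )"""
    )
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.executemany(
        "INSERT INTO stored_data (user_id, payload) VALUES (?, ?)",
        [(1, "first"), (1, "second")],
    )
    # Deleting the user also deletes the rows that reference it.
    conn.execute("DELETE FROM users WHERE user_id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM stored_data").fetchone()[0]
    conn.close()
    return remaining
```

The relation lives in the schema here, so any application connecting to the database gets the same behaviour.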
To some degree, this is less of a problem with document databases, because they don’t force you to chop your data into small parts like relational databases do (e.g. you can have lists of that user’s stored data as part of the JSON document). But you will likely still need some relations at some point.
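A rough sketch of what that embedding looks like, with a plain Python dict standing in for a document collection (no real MongoDB driver involved; the field names are assumptions):

```python
def make_collection() -> dict:
    """Return a toy 'collection' keyed by _id, holding one user document."""
    user_doc = {
        "_id": 1,
        "name": "alice",
        # The user's stored data is embedded instead of living in its own table.
        "stored_data": [
            {"kind": "note", "text": "buy milk"},
            {"kind": "note", "text": "call bob"},
        ],
    }
    return {user_doc["_id"]: user_doc}


def delete_user(collection: dict, user_id: int):
    """Remove the user document; the embedded list goes away with it."""
    return collection.pop(user_id, None)
```

Nothing has to cascade, because there is only one document to delete. That stops working as soon as some of the data lives in a separate collection.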
Chances are you have a layer in your application code which enforces these relations instead.
Which is fine in my opinion. With relational databases, there are also often some relations which you cannot model in the database.
But yeah, it requires somewhat more software architecture awareness, to not lump the relation checking logic into general application logic. And you can’t connect a second application to that database, without having to implement the relations another time or at least pulling them out into a shared library.
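As a sketch of what keeping the relation logic in one place could look like: a single data-access class owns both the existence check and the cascade. The UserStore class and its methods are hypothetical, and plain dicts stand in for the document collections.

```python
class UserStore:
    """Hypothetical data-access layer; dicts stand in for document collections."""

    def __init__(self) -> None:
        self.users = {}    # user_id -> user document
        self.uploads = {}  # upload_id -> upload document carrying a user_id

    def add_user(self, user_id: int, name: str) -> None:
        self.users[user_id] = {"_id": user_id, "name": name}

    def add_upload(self, upload_id: int, user_id: int, filename: str) -> None:
        # The "foreign key" check the database will not do for us.
        if user_id not in self.users:
            raise KeyError(f"unknown user {user_id}")
        self.uploads[upload_id] = {
            "_id": upload_id,
            "user_id": user_id,
            "filename": filename,
        }

    def delete_user(self, user_id: int) -> None:
        # The "cascade": drop the user and everything tagged with their ID.
        self.users.pop(user_id, None)
        self.uploads = {
            key: doc
            for key, doc in self.uploads.items()
            if doc["user_id"] != user_id
        }
```

A second application would have to go through this same class (or a shared library containing it) to get the same guarantees.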
I’m currently building something using Mongo as the DB, and have so far been making sure to assign the user ID to everything that relates to that user when it’s created.
Wouldn’t you have to do something like that in MySQL anyway to ensure that the entries are related to each other?
Oh yeah, I’m saying that relational databases push you even more to assign IDs to every minuscule piece of information, especially if you’re following best practices (3NF or similar).
For example, you’re not supposed to say that a user has a list of interests, you’re supposed to say that there’s users with a user_id and then there’s user_interests with a user_id and an interest_description, so two separate tables.
If those interests can be indexed, then you’d want three tables:
users(user_id)
interests(interest_id, description)
user_interests(user_id, interest_id) (N-M mapping)
I mean, this might not be the best example, as it kind of makes sense to not always load the user interests whenever you do anything with the user, but yeah, the point is that you’re supposed to split it up into separate tables and then JOIN it as you need it.
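A minimal sketch of that three-table layout and the JOIN, again using Python's built-in sqlite3 as a stand-in for a full RDBMS. The table and column names follow the ones above; the sample rows and the helper names build_demo_db and interests_for are made up.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create the three tables in memory and add a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY);
        CREATE TABLE interests (
            interest_id INTEGER PRIMARY KEY,
            description TEXT NOT NULL UNIQUE
        );
        -- The N-M mapping: one row per (user, interest) pair.
        CREATE TABLE user_interests (
            user_id INTEGER NOT NULL REFERENCES users(user_id),
            interest_id INTEGER NOT NULL REFERENCES interests(interest_id),
            PRIMARY KEY (user_id, interest_id)
        );
        INSERT INTO users VALUES (1), (2);
        INSERT INTO interests VALUES (10, 'chess'), (11, 'hiking');
        INSERT INTO user_interests VALUES (1, 10), (1, 11), (2, 11);
        """
    )
    return conn


def interests_for(conn: sqlite3.Connection, user_id: int) -> list:
    """JOIN the mapping table to interests only when they are needed."""
    rows = conn.execute(
        """SELECT i.description
           FROM user_interests AS ui
           JOIN interests AS i ON i.interest_id = ui.interest_id
           WHERE ui.user_id = ?
           ORDER BY i.description""",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Loading a user on its own never touches the interests tables; the JOIN only runs when something actually asks for them.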
With these RDBMS, your entire data loading logic is supposed to happen in-database, so you pretty much need to chop the data into the smallest possible parts and assign IDs to all of those parts, to give you the flexibility to access them how you need to.
Ahhh okay I see. I’ve developed sites for a decade but have never had to really consider how databases work in any great detail, which obviously now with mongo I am having to think about. Thank you for clarifying! I’ve got some reading to do 🫡
Those are rookie numbers.