The MongoDB way.

This article is not about how bad MongoDB is; you can search for those arguments yourself. I have heard that MongoDB had issues with failover, but I cannot confirm that from my own experience. I have also heard plenty of bad opinions about this database engine, but like any tool it has good and bad sides. In my opinion, it is a good tool for developers.

Today I want to talk about when and where we can use this database.

MongoDB is a document-oriented database, and we can structure data in it however we want. We can even use a normalized data model, the traditional approach from relational databases, but when a client-side application reads such data, it needs more “round trips” to the server.
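
To illustrate the round-trip cost, here is a minimal Python sketch that uses plain dictionaries in place of real collections. The `FakeCollection` class, the field names, and the sample values are assumptions made for illustration; this is not MongoDB's API.

```python
class FakeCollection:
    """In-memory stand-in for a collection that counts lookups (hypothetical helper)."""

    def __init__(self, documents):
        self._by_id = {doc["_id"]: doc for doc in documents}
        self.round_trips = 0

    def find_one(self, _id):
        # Each call models one request to the server.
        self.round_trips += 1
        return self._by_id.get(_id)


# Normalized model: the activity only references the user.
users = FakeCollection([{"_id": 1, "name": "Jessica", "surname": "Doe"}])
activities = FakeCollection([{"_id": 10, "user_id": 1, "text": "posted a photo"}])

activity = activities.find_one(10)
author = users.find_one(activity["user_id"])
normalized_trips = activities.round_trips + users.round_trips

# Embedded model: the user data travels inside the activity document.
embedded_activities = FakeCollection([
    {"_id": 10, "user": {"name": "Jessica", "surname": "Doe"}, "text": "posted a photo"}
])
embedded = embedded_activities.find_one(10)
embedded_trips = embedded_activities.round_trips
```

In this sketch the normalized read needs two lookups to show the author's name, while the embedded read needs one.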

MongoDB documents look like JSON, but they are actually stored as BSON (a portmanteau of the words “binary” and “JSON”), which supports data types that plain JSON lacks.
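
As a small illustration of the difference in data types, the Python sketch below checks which fields of a document plain JSON cannot encode. Dates and binary data are two types that BSON has and JSON does not. The `json_unsupported_fields` helper and the field names are assumptions made for this example.

```python
import json
from datetime import datetime, timezone

# A document with a date and raw bytes, two types BSON supports natively.
doc = {
    "name": "Jessica",
    "created_at": datetime(2020, 1, 1, tzinfo=timezone.utc),
    "avatar": b"\x89PNG",
}


def json_unsupported_fields(document):
    """Return the keys whose values the standard json module cannot encode."""
    unsupported = []
    for key, value in document.items():
        try:
            json.dumps(value)
        except TypeError:
            unsupported.append(key)
    return unsupported


unsupported = json_unsupported_fields(doc)
```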

In this example, I will describe the decisions we have to consider when we organize a data structure.

At the end, we have changed Jessica’s surname from Doe to Doe-Morgan, and the question is: how should her activities appear on her friends’ timelines? In sample 1 we have only historical data, in sample 2 we have only current data (name and surname), and in sample 3 we have all the data. When we work out what to show on the timelines of Jessica Doe’s friends, we must decide which structure suits us best. Sample 2 is probably the most reasonable structure in this case, but only sample 3 keeps the full data and lets us reconsider our answer later. Facebook does it as in sample 2, but there is another way: each new activity of Jessica’s is stored with her surname at that moment, and using this field on the timeline could be a nice feature. But is this a good solution? Probably not. Do we have to store this data every time we write a new activity? Maybe the solution is a history table of user data with the time of each change? It depends on us.
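
The three samples are not reproduced here, so the Python sketch below is only one possible reading of them. Every field name (`user_name`, `user_surname`, `user_id`) and the `display_name` helper are assumptions, not the original structures.

```python
# Current user record after the surname change (hypothetical data).
users = {1: {"name": "Jessica", "surname": "Doe-Morgan"}}

# Sample 1: only historical data, copied into the activity when it was written.
sample_1 = {"text": "posted a photo", "user_name": "Jessica", "user_surname": "Doe"}

# Sample 2: only a reference, so the timeline always shows current data.
sample_2 = {"text": "posted a photo", "user_id": 1}

# Sample 3: both the reference and the historical copy.
sample_3 = {"text": "posted a photo", "user_id": 1,
            "user_name": "Jessica", "user_surname": "Doe"}


def display_name(activity, prefer_current=True):
    """Return the name to show on a timeline for one activity."""
    user = users.get(activity.get("user_id"))
    has_history = "user_surname" in activity
    # Use the current record when asked to, or when no historical copy exists.
    if user is not None and (prefer_current or not has_history):
        return f'{user["name"]} {user["surname"]}'
    return f'{activity["user_name"]} {activity["user_surname"]}'
```

Under these assumptions, sample 1 can only ever show the old surname, sample 2 can only show the current one, and sample 3 lets us choose with `prefer_current`.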

Let’s check the MongoDB way.

Embedded data objects

Is this correct? Yes, if we want to retrieve all user data together with the activity in one query. Yes, if we want to show historical data. But we do not have the user’s _id, so in this case we have only historical data, as in sample 1, and we cannot do anything more. If we add a user_id, we get sample 3 and have everything once again. With this approach we must know our limits. Documents have a size limit: one document, including its embedded documents, cannot exceed 16 MB.
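
The size limit can be kept in view with a rough guard like the Python sketch below. It estimates size from the JSON encoding, which is only an approximation of the real BSON size. The `can_embed` helper, the safety margin, and the field names are assumptions made for illustration.

```python
import json

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # the 16 MB document limit, in bytes


def approximate_size(document):
    """Rough size estimate from the UTF-8 JSON encoding (not the exact BSON size)."""
    return len(json.dumps(document).encode("utf-8"))


def can_embed(parent, child, margin=0.8):
    """Return True if embedding child keeps parent under a fraction of the limit."""
    projected = approximate_size(parent) + approximate_size(child)
    return projected < MAX_DOCUMENT_BYTES * margin


user = {"user_id": 1, "name": "Jessica", "surname": "Doe-Morgan", "activities": []}
small_activity = {"text": "posted a photo"}
huge_activity = {"text": "x" * MAX_DOCUMENT_BYTES}

# Embed only while the projected size stays safely below the limit.
if can_embed(user, small_activity):
    user["activities"].append(small_activity)
```

When `can_embed` returns False, the activity would have to go into a separate collection that references the user instead.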

The questions.

The most common problem when designing a data structure is knowing what we want to achieve. Closely related is the question of which data will be bigger. In a small project, embedding documents is probably the best way. When projects are bigger, we must consider how to separate the data. As for updates and inserts, MongoDB has smart tools for both, so this is not a problem, even in big structures.

When we describe our project, we can estimate which data will be bigger and which will grow faster. The most common problem is miscalculation. Sometimes, while developing, we cannot describe our goal. When we can ask ourselves the right question, we can consider the right answer.
