At Strategies we started using MongoDb for our development a number of years ago. MongoDb was new and interesting at the time, and the word was that it was very fast.
At first we used it alongside MySQL to achieve specific functionality like document storage, but more recently we have used MongoDb as the sole database on some (but not all) of our websites, including one which serves many millions of records.
Naturally we have learned some lessons during this time, and I am going to share some of those here, and in later articles, in the hope that others can learn from them also.
Lesson 1 : MongoDb is not a Relational Database
MongoDb is a Document-Oriented Database, not a Relational Database Management System (RDBMS) like MySQL, MS SQL Server or Oracle. This comes with many advantages, mainly that MongoDb is schema-less, very flexible and fast (when used correctly). However, it also means that MongoDb lacks some of the functionality of Relational Databases.
MongoDb does not support joins between two collections (the equivalent of tables in an RDBMS). This is because all of the data belonging to a Document is expected to be stored within that Document, as opposed to being split into separate Normalised tables as in an RDBMS. This means that there is likely to be duplicated data, and any changes to the duplicated data need to be handled manually.
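To make the cost of duplicated data concrete, here is a minimal Python sketch that uses plain dictionaries in place of real MongoDb documents. The orders data, the field names and the rename_customer helper are all invented for illustration; against a real database the same change would be issued through a driver.

```python
# Hypothetical data: each order embeds a copy of the customer's name
# instead of joining to a separate customers table.
orders = [
    {"_id": 1, "customer": {"id": 42, "name": "Ann Smith"}, "total": 30},
    {"_id": 2, "customer": {"id": 42, "name": "Ann Smith"}, "total": 12},
    {"_id": 3, "customer": {"id": 7, "name": "Bob Jones"}, "total": 99},
]


def rename_customer(docs, customer_id, new_name):
    """Update every embedded copy of a customer's name; return how many changed."""
    changed = 0
    for doc in docs:
        if doc["customer"]["id"] == customer_id:
            doc["customer"]["name"] = new_name
            changed += 1
    return changed


# The application, not the database, is responsible for finding every copy.
updated = rename_customer(orders, 42, "Ann Taylor")
```

Reading an order needs no join because the customer name is already inside it, but the rename has to touch every document that holds a copy.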
MongoDb does have a mechanism for referencing other Documents, called DBRef, but it does not include functionality to maintain Data Integrity. In MySQL you can use a Foreign Key Constraint to ensure that any references to deleted data are cleaned up, so there are no broken references (or “Orphaned Data”). If a document referenced by a DBRef is deleted, the DBRef is left broken and needs to be dealt with manually.
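Dealing with broken references manually usually means writing your own check. The sketch below shows one way this could look in Python, again using in-memory dictionaries rather than a live database. A DBRef is stored as a sub-document with $ref (the collection name) and $id fields; the authors and posts data and the find_broken_refs helper are hypothetical.

```python
# Hypothetical in-memory stand-ins for two collections.
authors = {
    1: {"_id": 1, "name": "Ann"},
}
posts = [
    {"_id": 10, "title": "Hello", "author": {"$ref": "authors", "$id": 1}},
    {"_id": 11, "title": "Orphan", "author": {"$ref": "authors", "$id": 2}},
]
collections = {"authors": authors}


def find_broken_refs(docs, field, collections):
    """Return the _id of each document whose reference has no target."""
    broken = []
    for doc in docs:
        ref = doc.get(field)
        if ref is None:
            continue  # no reference stored, so nothing to check
        target = collections.get(ref["$ref"], {})
        if ref["$id"] not in target:
            broken.append(doc["_id"])
    return broken


broken_ids = find_broken_refs(posts, "author", collections)
```

Nothing stops the author with _id 1 from being deleted later, at which point post 10 would show up in the same check. A Foreign Key Constraint would have handled this for us.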
Why is this important? Because it means that, while MongoDb can be used for any application, it is not the best choice for every type of application. Where the data in the application is clearly relational and requires tight integrity, an RDBMS is likely to be a better choice. Where data integrity is less important and speed matters more, MongoDb (or another Document-Oriented Database) is likely to be the best choice.
Lesson 2 : MongoDb is very fast (when used correctly)
One of the reasons we chose to start using MongoDb was the buzz around how fast it is, specifically when compared to Relational Databases like our go-to RDBMS, MySQL. However, the performance does not come for free.
Like an RDBMS, MongoDb relies upon correct use of indexes to perform fast queries. Correct indexing in MongoDb is an entire article in itself and lies outside the scope of this one. However, when researching the subject, do be sure to check that what you are reading is relevant to the version of MongoDb you are using, and note the following pitfalls.
Before version 2.6, MongoDb could only use a single index per query, including for sorting. Therefore a compound index containing all fields referenced in your query (and its sort) was needed for maximum performance. From version 2.6 onwards, indexes can be intersected in some cases.
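As a rough illustration of what "a compound index containing all fields" might look like, the Python sketch below builds the list of (field, direction) pairs that drivers such as PyMongo accept when creating an index. The compound_index_keys helper and the example query are invented for this article. Placing the equality fields before the sort fields is a common guideline rather than a guarantee, so the plan actually chosen should still be checked with explain().

```python
ASCENDING = 1    # the same numeric values the MongoDB drivers use
DESCENDING = -1


def compound_index_keys(equality_fields, sort_spec):
    """Build a (field, direction) key list covering a query and its sort."""
    keys = [(field, ASCENDING) for field in equality_fields]
    seen = set(equality_fields)
    for field, direction in sort_spec:
        if field not in seen:  # do not repeat a field already in the index
            keys.append((field, direction))
            seen.add(field)
    return keys


# Hypothetical query:
#   find({"status": "live", "site": "blog"}).sort("created", -1)
keys = compound_index_keys(["status", "site"], [("created", DESCENDING)])
# With PyMongo, this list could be passed to collection.create_index(keys).
```

A single index built this way lets one index serve both the filter and the sort, which is what the single-index-per-query limitation calls for.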
Before version 2.0, MongoDb would not use an index for an $exists query unless forced to by using hint(). An index will be used from version 2.0 onwards (however this was broken in 2.2 and not fixed again until 2.6), but it is still recommended not to use $exists since it will likely need a full scan of the index, whereas replacing it with {$ne: null}, for example, can be much faster. Note that the two are not strictly equivalent: {$ne: null} also excludes documents where the field is present but set to null.
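Before swapping one operator for the other, it is worth being clear about what each one matches. The Python sketch below mimics the two conditions against a handful of in-memory documents. The matches_exists and matches_ne_null helpers and the sample data are made up for illustration and only approximate the matching rules for a simple top-level field.

```python
def matches_exists(doc, field):
    """Mimic {field: {"$exists": True}}: the field is present, even if null."""
    return field in doc


def matches_ne_null(doc, field):
    """Mimic {field: {"$ne": None}}: the field is present and not null."""
    return doc.get(field) is not None


docs = [
    {"_id": 1, "email": "ann@example.com"},
    {"_id": 2, "email": None},   # field present, explicitly null
    {"_id": 3},                  # field missing altogether
]

exists_ids = [d["_id"] for d in docs if matches_exists(d, "email")]
ne_null_ids = [d["_id"] for d in docs if matches_ne_null(d, "email")]
```

Document 2 is matched by the $exists condition but not by the $ne condition, so the replacement is only safe when the field is never stored as an explicit null.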
I hope these first two lessons help in some way. Watch this space for the next parts in this series, where I will discuss more of the lessons we have learned while using MongoDb.