Category Archives: MongoDB Tutorial

MongoDB Terminology

MongoDB Terminology (Common Terms)

Database :- A database is a physical container for a set of collections. A database may contain zero or more collections, and a single MongoDB server instance can host multiple databases. There is no fixed limit on the number of databases per instance; in practice the number is bounded by the virtual memory address space that the underlying operating system can allocate.

Collection :- A collection is a set of MongoDB documents, similar to a table in a relational database system. Collections are schema-less, so documents within the same collection can have different fields. Typically, a collection holds documents that serve a similar or related purpose.

Document :- A document is the basic unit of data storage in a MongoDB database, analogous to a row in a traditional relational database system. It is an ordered set of key-value pairs, meaning that every key has an associated value. Documents are often referred to as “objects”. Every document in a collection is represented in a JSON-like (key-value) format. Data is stored and queried as BSON, a binary representation of JSON-like data.

Example:-
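
As a minimal sketch, a document can be modeled in Python as a dict. The field names and values below are illustrative assumptions, not taken from a real collection.

```python
import json

# A hypothetical document modeled as a Python dict.
employee = {
    "_id": 1,
    "name": "Asha",
    "skills": ["python", "mongodb"],               # array value
    "address": {"city": "Pune", "zip": "411001"},  # nested (embedded) document
}

# Key order is preserved, mirroring the ordered key-value pairs of a document.
field_names = list(employee)
print(json.dumps(employee, indent=2))
```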

Field :- Fields are analogous to columns in relational databases. Apart from the mandatory _id, a document in a collection can have zero or more fields. Each field and its associated value are stored as a key-value pair.

_id :- A mandatory field in every MongoDB document. It uniquely identifies a document within a collection and works as the document’s primary key. If you create a new document without an _id field, MongoDB automatically adds one.
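
The behaviour described above can be sketched in plain Python. The insert_one helper and the users list are hypothetical stand-ins for a collection, and a UUID string stands in for the identifier that MongoDB would generate (MongoDB itself generates an ObjectId, not a UUID).

```python
import uuid

def insert_one(collection, document):
    """Store a copy of `document`, adding an `_id` when none is supplied."""
    stored = dict(document)
    if "_id" not in stored:
        # Stand-in for a generated ObjectId (assumption for illustration).
        stored["_id"] = uuid.uuid4().hex
    # The _id acts as a primary key, so it must be unique in the collection.
    if any(d["_id"] == stored["_id"] for d in collection):
        raise ValueError("duplicate _id")
    collection.append(stored)
    return stored["_id"]

users = []
explicit_id = insert_one(users, {"_id": 7, "name": "Ravi"})
generated_id = insert_one(users, {"name": "Meera"})
```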

MongoDB vs. RDBMS Terminology/Concepts

If you’re not familiar with MongoDB, here is a quick translation table that maps common SQL terms and concepts to the corresponding MongoDB terms and concepts.

SQL Terms/Concepts | MongoDB Terms/Concepts
database | database
table | collection
row | document or BSON document
column | field
index | index
table joins | embedded documents and linking
primary key (any unique column or column combination can be specified as the primary key) | primary key (automatically set to the _id field)
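
The “embedded documents and linking” entry can be illustrated with a small Python sketch. All names and values here (orders, customers, customer_id) are assumptions chosen for illustration.

```python
# Embedded: the related data lives inside the parent document.
order_embedded = {
    "_id": 1,
    "customer": {"name": "Asha", "city": "Pune"},
    "total": 250,
}

# Linking: the parent stores only the _id of a document in another collection.
customers = {10: {"_id": 10, "name": "Asha", "city": "Pune"}}
order_linked = {"_id": 2, "customer_id": 10, "total": 250}

def resolve_customer(order):
    """Follow the link by hand, the role a join plays in SQL."""
    return customers[order["customer_id"]]

embedded_city = order_embedded["customer"]["city"]
linked_city = resolve_customer(order_linked)["city"]
```

Embedding returns everything in one read, while linking avoids duplicating the customer data across orders.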

MongoDB Pros and Cons

MongoDB Advantages (Pros)

  • It is easy to replicate a database across multiple distributed data centers, which increases availability.
  • The auto-sharding feature makes it highly scalable.
  • It uses a dynamic and flexible schema, so you need not specify the schema beforehand.
  • Database objects need no conversion or mapping to application objects.
  • You can index any field in a document, which improves query performance.
  • It supports fast, automatic failover and recovery.
  • It supports automatic load balancing.
  • It facilitates deep querying of documents, including nested fields.
  • There are no complex joins in MongoDB.
  • Built-in support for geospatial (location) data.
  • Powerful document-based query language.
  • Easy to integrate with Hadoop for big-data workloads.

MongoDB Disadvantages (Cons)

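  • Multi-document transactions were not supported before MongoDB 4.0.
  • No SQL-style joins; since MongoDB 3.2 the $lookup aggregation stage offers a limited left outer join.
  • It does not support stored procedures in the SQL sense.
  • High memory consumption.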

MongoDB Features

Schema less :- MongoDB uses a dynamic and flexible schema, so you need not specify the schema beforehand. Instead, you can create fields on the fly.
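
A minimal sketch of what “fields on the fly” means, using a Python list of dicts as a stand-in for a collection. The products data is hypothetical.

```python
products = []
products.append({"_id": 1, "name": "Pen", "price": 10})
# A second document in the same collection carries an extra, nested field.
products.append({"_id": 2, "name": "Laptop", "price": 50000, "specs": {"ram_gb": 16}})

# A new field appears on the fly; the other document is unaffected.
products[0]["color"] = "blue"

all_fields = sorted({key for doc in products for key in doc})
```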

Ad hoc queries :- MongoDB provides a rich query language that is nearly as powerful as SQL. It supports aggregation functions, random sampling, range queries and regular expression searches, and queries can include JavaScript functions as well.
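
Range and regular-expression queries can be sketched as follows. The query documents use the real operator names $gt, $lt and $regex, but the matches function is a hypothetical, heavily simplified evaluator written for illustration; it is not how the server evaluates queries.

```python
import re

def matches(doc, query):
    """Return True if `doc` satisfies every condition in `query`.

    Handles only an assumed subset: equality, $gt, $lt and $regex.
    """
    for field, cond in query.items():
        value = doc.get(field)
        if not isinstance(cond, dict):
            if value != cond:
                return False
            continue
        for op, arg in cond.items():
            if op == "$gt" and not (value is not None and value > arg):
                return False
            if op == "$lt" and not (value is not None and value < arg):
                return False
            if op == "$regex" and not (isinstance(value, str) and re.search(arg, value)):
                return False
    return True

people = [
    {"name": "Asha", "age": 31},
    {"name": "Amit", "age": 24},
    {"name": "Ravi", "age": 45},
]
range_hits = [p["name"] for p in people if matches(p, {"age": {"$gt": 25, "$lt": 50}})]
regex_hits = [p["name"] for p in people if matches(p, {"name": {"$regex": "^A"}})]
```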

Indexing :- An index can be created on any attribute/field, which improves query performance.
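
Why an index helps can be sketched with a lookup table from field value to document position. This is an illustrative stand-in: a hash map is used here for brevity, whereas MongoDB indexes are B-tree based, and build_index is a hypothetical helper.

```python
from collections import defaultdict

docs = [
    {"_id": 1, "city": "Pune"},
    {"_id": 2, "city": "Delhi"},
    {"_id": 3, "city": "Pune"},
]

def build_index(documents, field):
    """Map each value of `field` to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(documents):
        if field in doc:
            index[doc[field]].append(position)
    return index

city_index = build_index(docs, "city")

# Without an index: every document is scanned.
scan_ids = [d["_id"] for d in docs if d.get("city") == "Pune"]
# With the index: jump straight to the matching positions.
indexed_ids = [docs[i]["_id"] for i in city_index["Pune"]]
```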

Replication :- It is easy to replicate a database across multiple distributed data centers, which increases availability.

Load balancing :- MongoDB’s built-in auto-sharding feature allows it to scale horizontally by partitioning data and spreading it across multiple servers. This provides a level of scalability that is hard to reach with a single-server relational database such as MySQL.
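
The idea of partitioning data across servers can be sketched with a hashed key. The shard_for function, the choice of SHA-256 and the three in-memory “shards” are assumptions for illustration only; MongoDB’s own chunk assignment and balancing are considerably more involved.

```python
import hashlib

def shard_for(shard_key_value, shard_count):
    """Pick a shard by hashing the shard key (a simplified stand-in)."""
    digest = hashlib.sha256(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

# Three hypothetical servers, each holding part of the data.
shards = [[] for _ in range(3)]
for user_id in range(1, 101):
    shards[shard_for(user_id, len(shards))].append({"_id": user_id})

shard_sizes = [len(s) for s in shards]
```

The same key always maps to the same shard, so a lookup by key only needs to contact one server.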

File storage :- MongoDB can also be used as a file system, using its load balancing and data replication features to store files across multiple machines. This function, called GridFS (Grid File System), is included with the MongoDB drivers, which provide functions for file handling and manipulation. GridFS divides a file into multiple parts, or chunks, and stores each chunk as a separate document.
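
The chunking step can be sketched in a few lines of Python. The helpers split_into_chunks and reassemble are hypothetical; the chunk field names (files_id, n, data) and the 255 kB default chunk size follow GridFS conventions, and the 600 kB payload is made up.

```python
CHUNK_SIZE = 255 * 1024  # default GridFS chunk size, assumed here

def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return one chunk document per slice of `data`."""
    return [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]

def reassemble(chunks):
    """Rebuild the file by concatenating chunks in sequence order."""
    ordered = sorted(chunks, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)

payload = b"x" * (600 * 1024)  # a hypothetical 600 kB file
chunks = split_into_chunks("file-1", payload)
```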

Aggregation :- MongoDB has built-in support for batch processing of data and aggregation operations. It offers the following three ways to perform aggregation –

  • Aggregation pipeline
  • Map-reduce function
  • Single purpose aggregation methods
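
The pipeline idea, where each stage consumes the output of the previous one, can be sketched in Python. The two functions loosely mirror the $match and $group stages; their names and the orders data are assumptions for illustration.

```python
from collections import defaultdict

orders = [
    {"customer": "A", "status": "paid", "amount": 50},
    {"customer": "B", "status": "paid", "amount": 20},
    {"customer": "A", "status": "open", "amount": 99},
    {"customer": "A", "status": "paid", "amount": 30},
]

def match_stage(docs, field, value):
    """Keep only documents whose `field` equals `value` (like $match)."""
    return [d for d in docs if d.get(field) == value]

def group_sum_stage(docs, key, amount_field):
    """Group by `key` and sum `amount_field` per group (like $group with $sum)."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[amount_field]
    return [{"_id": k, "total": v} for k, v in sorted(totals.items())]

# Stages run in order; each consumes the previous stage's output.
result = group_sum_stage(match_stage(orders, "status", "paid"), "customer", "amount")
```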

JavaScript on Server-side :- In MongoDB, JavaScript code can also be used with queries and aggregation functions such as MapReduce.

Capped collections :- MongoDB has built-in support for fixed-size collections called “capped collections”. A capped collection maintains insertion order and, once the specified size is reached, works as a circular queue, overwriting the oldest documents. A capped collection can be created using the db.createCollection command as follows –

Here, the collection is limited to 2 MB.
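
In the mongo shell, such a call generally takes the form db.createCollection(<name>, { capped: true, size: <bytes> }), where 2 MB corresponds to size: 2097152. The circular-queue behaviour can be sketched in Python; the CappedCollection class below is a hypothetical model that tracks payload bytes only, not MongoDB’s actual storage accounting.

```python
from collections import deque

class CappedCollection:
    """Keeps insertion order; evicts the oldest entries once over `max_bytes`."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.docs = deque()

    def insert(self, payload: bytes):
        self.docs.append(payload)
        self.used_bytes += len(payload)
        # Once the size limit is exceeded, the oldest entries make room.
        while self.used_bytes > self.max_bytes and len(self.docs) > 1:
            self.used_bytes -= len(self.docs.popleft())

logs = CappedCollection(max_bytes=2 * 1024 * 1024)  # 2 MB = 2097152 bytes
for i in range(5):
    logs.insert(bytes([i]) * (512 * 1024))  # five 512 kB entries

# The first byte of each surviving entry identifies which insert it came from.
remaining = [entry[0] for entry in logs.docs]
```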

Geospatial :- MongoDB has built-in support for location-specific data, storing x and y coordinates (longitude, latitude) within documents. It uses geospatial indexes to find documents near a point or within a radius of a set of coordinates, which makes location-based lookups easy, fast and accurate.
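
A “within a radius” lookup can be sketched with the haversine formula. This is an illustrative linear scan over hypothetical documents; a geospatial index exists precisely to avoid scanning every document. Coordinates are stored in (longitude, latitude) order, as in the text above.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))  # 6371 km: mean Earth radius

places = [
    {"name": "cafe", "loc": [73.8567, 18.5204]},    # Pune
    {"name": "museum", "loc": [72.8777, 19.0760]},  # Mumbai
]

def within_radius(documents, lon, lat, radius_km):
    """Names of documents whose location lies within `radius_km` of the point."""
    return [d["name"] for d in documents
            if haversine_km(lon, lat, d["loc"][0], d["loc"][1]) <= radius_km]

nearby = within_radius(places, 73.85, 18.52, 10)
```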

Fast Writing :- MongoDB can favor a high insert rate over transaction safety. Depending on the configured write concern, a client that has sent a write command need not wait for the write to actually complete. Journaling gives you better control over the trade-off between write performance and data durability.

Multiple Storage Engines :- MongoDB has built-in support for multiple storage engines such as WiredTiger and MMAPv1 (MMAPv1 was removed in MongoDB 4.2). In addition, MongoDB has a pluggable storage engine API that enables third parties to develop storage engines compatible with MongoDB.

MongoDB History

2007:

In 2007, the New York-based software company 10gen started developing MongoDB. The company initially set out to build a PaaS (Platform as a Service) product, but ran into scalability issues with existing relational database systems, so it began developing a document-oriented database system named MongoDB. The name is derived from the word “humongous”.

2009:

In February 2009, MongoDB was first released as an open-source project, and the company started to offer commercial support services for it.

2013:

In August 2013, the company officially changed its name to MongoDB Inc.

MongoDB NoSQL Database

What is NoSQL Database?

NoSQL is commonly expanded as “Not Only SQL”. A NoSQL database is a type of database management system that does not follow the traditional relational DBMS (RDBMS) model and does not use traditional SQL to query the database. NoSQL databases are used to store large volumes of unstructured, schema-less, non-relational data.

NoSQL Database Types

There are four basic types of NoSQL databases –

Key-Value Stores :- This is the simplest type of NoSQL database. In a key-value store, data is kept in a large hash table of keys and values, meaning that every key has an associated value. Each key is unique, and the value can be a string, JSON, a BLOB (Binary Large OBject), etc. Examples: Riak, Redis, Amazon DynamoDB.

Column Oriented Database :- In column-oriented databases, the values of a single column are stored in a contiguous block. These databases are mainly used to store large data sets. Each column is treated separately, and data is stored in column-specific files. Column-oriented databases provide better query performance over large data sets. Examples: HBase, Cassandra.

Graph Oriented Database :- A graph database represents and stores data as a collection of nodes and edges. A node represents an entity (such as an employee or a business), and an edge represents a relationship or connection between two nodes. Every node and edge has a unique identifier. Examples: Neo4j, OrientDB, Titan.

Document Oriented Database :- In a document-oriented database, data is stored as documents, and a document can contain many different key-value pairs, key-array pairs, or even nested documents. Every document has a unique key. Examples: MongoDB, CouchDB.
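
The four data models can be contrasted in a short Python sketch. Every name and value below is an illustrative assumption; real products add persistence, indexing and distribution on top of these shapes.

```python
# Key-value: one opaque value per unique key.
key_value = {"session:42": b"\x01\x02", "user:1:name": "Asha"}

# Column oriented: the values of each column are stored together.
columns = {"name": ["Asha", "Ravi"], "age": [31, 45]}
average_age = sum(columns["age"]) / len(columns["age"])  # touches one column only

# Graph: nodes plus edges that describe relationships between them.
nodes = {
    1: {"type": "employee", "name": "Asha"},
    2: {"type": "business", "name": "Acme"},
}
edges = [{"id": "e1", "from": 1, "to": 2, "rel": "WORKS_AT"}]
employers = [nodes[e["to"]]["name"] for e in edges
             if e["from"] == 1 and e["rel"] == "WORKS_AT"]

# Document: key-value pairs, arrays and nested documents under one unique key.
document = {"_id": 1, "name": "Asha", "tags": ["a", "b"], "address": {"city": "Pune"}}
```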

NoSQL Advantages (Pros)

  • High availability
  • High scalability
  • Flexible and dynamic database schema
  • No complex joins or relationships
  • High performance
  • Works well with high volumes of structured, semi-structured, and unstructured data
  • Reduced cost
  • Supports distributed computing infrastructure

NoSQL Disadvantages (Cons)

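  • Lack of standardization (query languages and interfaces differ between products)
  • Security features that are often less mature than in established RDBMS products
  • Less vendor and community support than for established RDBMS products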

Difference Between NoSQL & RDBMS

RDBMS | NoSQL
Structured and organized database schema | Database schema is dynamic, unstructured and flexible
Database schema needs to be defined beforehand | Database schema is dynamic and need not be defined beforehand
Table-based databases | Can be document oriented, key-value, graph or column oriented
Vertically scalable | Horizontally scalable
Best fit for transaction-intensive applications | Generally not suited to complex transactions
Emphasizes ACID properties (Atomicity, Consistency, Isolation and Durability) | Follows Brewer’s CAP theorem (Consistency, Availability and Partition tolerance)
Uses Structured Query Language (SQL), including its Data Manipulation Language (DML) and Data Definition Language (DDL) parts, to define and manipulate data | Query language varies from database to database
Examples: MySQL, Oracle, SQLite, PostgreSQL and MS SQL Server | Examples: MongoDB, Redis, HBase, RavenDB, Cassandra, Neo4j and CouchDB

MongoDB Introduction

What is MongoDB?

MongoDB is an open-source, cross-platform, document-oriented database program written in C++ and developed by MongoDB Inc. It stores records as JSON-like documents made up of key-value pairs. MongoDB is used to store high volumes of data in enterprise applications, where it provides high performance, availability, and scalability. As a NoSQL database, MongoDB does not require you to specify a schema beforehand; instead, you can create fields on the fly.

MongoDB is the database component of the MEAN software stack. Using a document-oriented database such as MongoDB lets you work with JSON-like documents across your entire development stack: on the frontend (Angular), the backend (Node), and the database (MongoDB).

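The MongoDB server was originally available free of charge under the GNU Affero General Public License, with the drivers under the Apache License; server releases since October 2018 use the Server Side Public License (SSPL) instead.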

Why MongoDB?

  • It supports a dynamic and flexible schema, which allows your database schema to evolve with business requirements.
  • With MongoDB, it’s easy to replicate a database across multiple distributed data centers to achieve higher availability.
  • With MongoDB’s built-in auto-sharding feature, it is easy to partition data and spread it across multiple servers, achieving a level of scalability that is hard to reach with a single-server relational database such as MySQL.
  • MongoDB is well suited to storing large volumes of data.
  • MongoDB documents (records) map naturally to objects in object-oriented programming languages, which removes the need for a complex object-relational mapping (ORM) layer.
  • It simplifies development.
  • It supports rapid application development.

Where to use MongoDB?

  • Catalog and Content Management System
  • Big Data
  • Data Analytics
  • Internet of Things(IoT)
  • Location-based data analytics
  • Real-Time Analytics
  • Mobile Apps
  • Data Personalization to tailor user’s experience

Where not to use MongoDB?

  • Complex Transaction Intensive Systems
  • Tightly Coupled Database Schema

Who uses MongoDB?

MongoDB is trusted by large enterprises running high-performance, mission-critical applications. A few of them are listed below –

  • Adobe
  • LinkedIn
  • SAP
  • McAfee
  • eBay