Connecting MongoDB With Python

Adnan Karol
3 min read · Oct 28, 2020

NoSQL databases such as MongoDB, Cassandra, and Redis are increasingly popular. But what are they, and how do they differ from a traditional relational database like MySQL?

What will we learn in this blog?

  1. What is MongoDB?
  2. Using MongoDB with Python via pymongo.

Let us get started: Introduction

What is MongoDB?

  1. Whenever we talk about databases (DBs), one common example comes to mind: MySQL. It is a relational database, which means it uses tables to describe the relationships between data.
  2. NoSQL databases, on the other hand, do not use tables to describe relationships; they use other models such as documents or key-value pairs. MongoDB is a document-based NoSQL database. Moreover, NoSQL databases typically have a dynamic schema, whereas SQL databases have a well-defined, fixed schema.
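To make the schema difference concrete, here is a minimal sketch that uses only the Python standard library. It does not talk to MongoDB at all: an in-memory SQLite table stands in for the relational side, and a plain list of dictionaries stands in for a document collection. The table name, field names, and values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Relational side: the columns are declared up front (fixed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.execute("INSERT INTO people VALUES (?, ?)", ("Adnan", 5))

try:
    # Writing to a column that was never declared is rejected.
    conn.execute("INSERT INTO people (name, city) VALUES (?, ?)", ("Abhijit", "Pune"))
    extra_column_accepted = True
except sqlite3.OperationalError:
    extra_column_accepted = False

# Document side: every record describes itself, so two records in the
# same collection may carry different fields (dynamic schema).
documents = [
    {"name": "Adnan", "age": 5},
    {"name": "Abhijit", "city": "Pune"},
]
field_sets = [sorted(doc.keys()) for doc in documents]

print(extra_column_accepted)  # False
print(field_sets)             # [['age', 'name'], ['city', 'name']]
```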

Using MongoDB with Python using pymongo.

Let's move on to the more interesting part: using one such NoSQL database, MongoDB, with Python.

For data scientists and machine learning engineers, data is usually stored in a database, and our scripts must be able to run queries and fetch data from such systems.

This blog assumes that the database already exists and that we simply want to connect to it and fetch some data. For building your own database, you can refer to here.

Install pymongo and dnspython using pip

pip install pymongo
pip install dnspython

Usually, a database hosted in the cloud or on another machine has a connection string. The connection string is needed to open the connection. (You should have at least read access.)

# Import the Python driver and create a client for MongoDB
from pymongo import MongoClient

client = MongoClient("Enter your connection string")
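A connection string typically starts with mongodb:// or mongodb+srv://. The +srv form relies on a DNS lookup, which is why dnspython was installed alongside pymongo. If the username or password contains special characters such as @ or /, they need to be percent-encoded before they go into the string. The sketch below shows one way to do that with the standard library; the helper name build_mongo_uri, the credentials, and the host are all hypothetical placeholders, not values from a real deployment.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, use_srv=True):
    """Assemble a MongoDB connection string with escaped credentials.

    Hypothetical helper for illustration only; it is not part of pymongo.
    """
    scheme = "mongodb+srv" if use_srv else "mongodb"
    # quote_plus percent-encodes characters such as '@' and '/' that would
    # otherwise be confused with the separators of the URI itself.
    return f"{scheme}://{quote_plus(user)}:{quote_plus(password)}@{host}/"


uri = build_mongo_uri("reader", "p@ss/word", "cluster0.example.mongodb.net")
print(uri)  # mongodb+srv://reader:p%40ss%2Fword@cluster0.example.mongodb.net/
```

The resulting string can then be passed to MongoClient in place of the placeholder text.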

After the connection is made, we can list the names of all databases on the server.

# Print Name of all Databases
print("\n The Name of all Databases are : ")
print(client.list_database_names())

MongoDB is organised in layers: the server (the "big DB") holds several databases, each database holds collections, and each collection holds documents. To view the collections within a database, we first need to select that particular database.

# Select a database
db = client.database_name
# Print the names of all collections in the database
print("\n The names of all collections in the database are : ")
print(db.list_collection_names())
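The two listing calls can be combined to get an overview of the whole server. The sketch below assumes a connected client like the one created earlier; describe_server is a hypothetical helper name. It uses bracket access, client[db_name], which selects the same database as attribute access (client.database_name) but also works when the name is held in a variable.

```python
def describe_server(client):
    """Return a mapping {database name: [collection names]}.

    `client` is expected to be a connected pymongo MongoClient.
    """
    overview = {}
    for db_name in client.list_database_names():
        # Bracket access selects a database by a name stored in a variable.
        overview[db_name] = client[db_name].list_collection_names()
    return overview
```

Calling describe_server(client) returns a plain dictionary, provided the account is allowed to list databases and collections.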

On a successful connection, we can see the names of the collections within the database database_name. Next, we want to fetch some data with a query. The best way to understand this is through examples. As data scientists, we benefit most when we can write such queries and fetch the data ourselves.

Let me create some test data for illustration. Let's assume that the collection collection_name holds the following dummy data:

{"tag1": True, "tag2": False, "tag3": "Adnan", "tag4": 5},

{"tag1": False, "tag2": True, "tag3": "Abhijit", "tag4": 7},

{"tag1": True, "tag2": False, "tag3": "Mayank"},

{"tag1": False, "tag2": True, "tag3": "Niloy"}

# Select the collection and describe the documents we want
collection = db.collection_name
query = {
    "tag1": True,
    "tag2": {"$ne": True},
    "tag3": "Adnan",
    "tag4": {"$exists": True},
}

Let us try to understand what we did above. We wrote a query that matches the documents in the collection collection_name where tag1 is True, tag2 is not equal to True, tag3 equals the string "Adnan", and the field tag4 is present. Note that $exists only checks that the field is present; it does not check whether the value is empty.
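To see how these conditions combine, here is a small pure-Python sketch that mimics the matching rules on the dummy data, without a database. It is only an illustration covering plain equality, $ne, and $exists; it is not how MongoDB evaluates queries internally, and the function name matches is hypothetical.

```python
_MISSING = object()  # sentinel for "field not present in the document"


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field, _MISSING)
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator == "$ne":
                    # Fails only when the field is present and equal.
                    if value is not _MISSING and value == operand:
                        return False
                elif operator == "$exists":
                    # Checks presence only; a field holding None still exists.
                    if (value is not _MISSING) != operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {operator}")
        elif value is _MISSING or value != condition:
            return False
    return True


documents = [
    {"tag1": True, "tag2": False, "tag3": "Adnan", "tag4": 5},
    {"tag1": False, "tag2": True, "tag3": "Abhijit", "tag4": 7},
    {"tag1": True, "tag2": False, "tag3": "Mayank"},
    {"tag1": False, "tag2": True, "tag3": "Niloy"},
]
query = {
    "tag1": True,
    "tag2": {"$ne": True},
    "tag3": "Adnan",
    "tag4": {"$exists": True},
}

matching = [doc for doc in documents if matches(doc, query)]
print(matching)  # [{'tag1': True, 'tag2': False, 'tag3': 'Adnan', 'tag4': 5}]
```

Only the first document passes all four conditions: the second and fourth fail on tag1, and the third fails on tag3 and lacks tag4.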

Let us execute the query and check the output:

# find() returns a cursor; iterating over it yields one document at a time
for document in collection.find(query):
    print(document)

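The output would be (MongoDB also adds an automatically generated _id field to every stored document):

{'_id': ObjectId('...'), 'tag1': True, 'tag2': False, 'tag3': 'Adnan', 'tag4': 5}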

You can practice writing more queries to understand how they work. Each document is returned as a Python dictionary and can be used directly for further processing.
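As a small example of such processing, the sketch below works on the dummy documents as plain dictionaries, as if they had been collected with something like list(collection.find({})). Because the schema is dynamic, some documents lack tag4, so the code checks for the field before using it. The variable names are hypothetical.

```python
# Stand-in for documents fetched from the collection.
documents = [
    {"tag1": True, "tag2": False, "tag3": "Adnan", "tag4": 5},
    {"tag1": False, "tag2": True, "tag3": "Abhijit", "tag4": 7},
    {"tag1": True, "tag2": False, "tag3": "Mayank"},
    {"tag1": False, "tag2": True, "tag3": "Niloy"},
]

# Use only the documents that actually carry a tag4 field.
tag4_values = [doc["tag4"] for doc in documents if "tag4" in doc]
average_tag4 = sum(tag4_values) / len(tag4_values)

# Collect the names of the records where tag4 is absent.
names_without_tag4 = [doc["tag3"] for doc in documents if "tag4" not in doc]

print(average_tag4)        # 6.0
print(names_without_tag4)  # ['Mayank', 'Niloy']
```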
