pymongo and MongoDB for Data Access: Data Analysis/Machine Learning (Part One)

Accessing data from MongoDB using the Python library pymongo

Umakant_Shinde
5 min read, Oct 23, 2020

In this blog series, I explain how to access data from a MongoDB database and save it in CSV format. Before that, I explain how to scrape data from the web, store it in MongoDB, and then read the stored data back using pymongo.

Contents of this blog

  1. About MongoDB: JSON with MongoDB & Core Processes
  2. About pymongo

I will show a worked example in the next blog because it takes more time, so I have split the topic into two parts: this part is all theory, and the second part is one big example.

Photo by Ales Nesetril on Unsplash

MongoDB:-

First, the basics: how to install MongoDB. Start by downloading the MongoDB application from the MongoDB download page.

On the download page you will see a Download button. Click it to download the installer.

Run the installer and click Next to continue.

Tick the “I accept the terms in the license agreement” check box.

Click Next, then Next two more times.

MongoDB will now install.

When you open the MongoDB application, you will see a connection screen. If you click the Connect button right away it gives an error, because there is no database to connect to yet. Click “Create free cluster” and sign up for a free database account. I cannot show this step because I already have a database.

For more detail about getting a database, click here.

Once you have an account, the “mongodb-compass-1.22.1-win32-x64” application will be installed.

I have created three databases. When I click the Connect button, this interface is shown and I can choose where to store collections.

That completes the first step, installing MongoDB. I will give an example in the next blog, because I have decided to work through a fairly difficult problem and share it there.

About MongoDB:-

Before MongoDB, most databases were built on the Relational Database Management System (RDBMS) model, where data is stored in tables. For example, if you store student data, each student's information is stored as one row, one student after another.

Changing the structure of data stored this way can be time-consuming. In MongoDB, data is stored in collections, and each item in a collection is a JSON-like document (JSON stands for JavaScript Object Notation).

MongoDB is mainly focused on web applications: roughly 90% of its use is in web applications and about 10% in other software that needs to store data in a database.

MongoDB (derived from the word humongous) is a relatively new breed of database that has no concept of tables, schemas, SQL, or rows.

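Early versions had no transactions, ACID compliance, joins, foreign keys, or any of the other features that tend to cause headaches in the early hours of the morning. Newer releases have added some of these: the $lookup aggregation stage (since MongoDB 3.2) performs join-like queries, and multi-document ACID transactions are supported since MongoDB 4.0.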

JSON with MongoDB

JSON stands for JavaScript Object Notation. The format does not require a schema; it is schema-less, which provides flexibility in database design. Unlike in RDBMSs, changes can be made to the schema seamlessly.

    {
      "student_Name": "Abc",
      "Phone": ["9999999999", "33333333333", ...],
      "percentage": 98
    }

Data in MongoDB looks like this. Because the format is schema-less, the documents in a collection can change shape dynamically.
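To make the schema-less idea concrete, here is a minimal sketch in plain Python, with no MongoDB server involved. It parses a student document like the one above into a dict, adds a field that the other document does not have, and keeps both in the same list, which stands in for a collection. The "city" field, its value, and the second student are made up for illustration.

```python
import json

# Hypothetical student document, modeled on the JSON example above.
raw = '{"student_Name": "Abc", "Phone": ["9999999999", "33333333333"], "percentage": 98}'

student = json.loads(raw)      # JSON text -> Python dict
student["city"] = "Pune"       # add a new field; no schema change is needed

# A second document with a different shape (no Phone, no city).
other = {"student_Name": "Xyz", "percentage": 75}

# A plain list stands in for a MongoDB collection here.
collection = [student, other]
field_sets = [sorted(doc.keys()) for doc in collection]
print(field_sets)
```

The two documents sit side by side even though their fields differ, which is the flexibility described above.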

Core Processes

The core components in the MongoDB package are

• mongod :- the core database process. It handles data requests, manages data access, and performs background management operations.

• mongos :- the controller and query router for sharded clusters.

• mongo :- the interactive MongoDB shell, a JavaScript interface used to run operations against the database.

These components are available as applications under the bin folder.
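As a quick check after installation, the sketch below looks for these three programs on the system PATH. It assumes the MongoDB bin folder has been added to PATH, which the installer may or may not do on your machine. The helper name find_core_programs is my own, and newer MongoDB releases may ship a shell called mongosh instead of mongo.

```python
import shutil

# Names of the core MongoDB programs described above.
CORE_PROGRAMS = ["mongod", "mongos", "mongo"]


def find_core_programs(names=CORE_PROGRAMS):
    """Return {program name: full path, or None if it is not found on PATH}."""
    return {name: shutil.which(name) for name in names}


for name, path in find_core_programs().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If a program shows as not found, it may still be installed; it just means its folder is not on PATH.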

MongoDB Limitations

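  1. It can require a lot of storage space, because data is stored as documents and every document carries its own field names.
  2. It can use a lot of memory, since MongoDB performs best when indexes and frequently accessed data fit in RAM.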
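The first limitation can be illustrated with a rough sketch that compares the size of the same records written document-style (field names repeated in every record) and row-style (field names written once in a header). It uses JSON text as a stand-in for documents and CSV as a stand-in for rows. MongoDB actually stores documents in a binary format (BSON) and may compress them on disk, so real numbers will differ; the student records are invented.

```python
import csv
import io
import json

# Hypothetical records: the same three fields for every student.
students = [
    {"student_Name": f"Student{i}", "Phone": "9999999999", "percentage": 50 + i}
    for i in range(100)
]

# Document style: every record carries its own field names.
doc_bytes = sum(len(json.dumps(s).encode("utf-8")) for s in students)

# Row style: field names appear only once, in the header line.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["student_Name", "Phone", "percentage"])
writer.writeheader()
writer.writerows(students)
row_bytes = len(buffer.getvalue().encode("utf-8"))

print(f"document-style: {doc_bytes} bytes, row-style: {row_bytes} bytes")
```

Shorter field names reduce this overhead, at the cost of less readable documents.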

About the pymongo library:-

pymongo is a Python library for working with MongoDB databases. The name combines Python and MongoDB. Using this library we can connect to and disconnect from a MongoDB database and perform read and write operations from Python, while the MongoDB server carries out the operations.

When working with a MongoDB database from Python, first install the pymongo library.

Run:

pip install pymongo
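Once pymongo is installed, connecting and doing basic reads and writes might look like the sketch below. This is not the example promised for the next blog, only a minimal outline. The connection string, the database name "school", the collection name "students", and the three helper functions are all placeholders I chose for illustration. MongoClient, insert_many, and find are pymongo calls.

```python
def get_collection(uri="mongodb://localhost:27017/", db_name="school", coll_name="students"):
    """Open a client and return one collection (URI and names are placeholders)."""
    # Imported here so the helpers below can be used without pymongo installed.
    from pymongo import MongoClient
    client = MongoClient(uri)
    return client[db_name][coll_name]


def save_students(collection, students):
    """Insert a list of dicts and return how many were sent."""
    if students:  # insert_many rejects an empty list, so skip the call
        collection.insert_many(students)
    return len(students)


def load_students(collection, min_percentage=0):
    """Read documents with percentage >= min_percentage, leaving out the _id field."""
    query = {"percentage": {"$gte": min_percentage}}
    return list(collection.find(query, {"_id": 0}))
```

With a MongoDB server running locally, calling students = get_collection(), then save_students(students, [{"student_Name": "Abc", "percentage": 98}]), then load_students(students, 60) would write one document and read it back as a Python list of dicts.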

In the next blog I will explain pymongo in depth with an example. This blog is just an introduction to MongoDB and pymongo; the next one is about one big example with pymongo.

Once we load data from MongoDB using pymongo, we analyze it to find out which machine learning method to apply. My aim is to load data from the internet, store it in MongoDB, load the stored data back into a list or data frame, save it to a CSV file, and then analyze the data to decide which method to apply. So we will learn three tasks:

First, fetching data from the internet.

Second, storing it in MongoDB using pymongo.

Third, analyzing the data.

I will share this task in the next blog.

Read my previous blog.
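The step of saving loaded documents to a CSV file can be sketched as follows, using only the standard library. The documents list is made-up data standing in for the result of a pymongo find() call, the helper documents_to_csv is my own, and an in-memory buffer is used in place of a real file. A pandas data frame would be another way to do the same thing.

```python
import csv
import io


def documents_to_csv(documents, out_file):
    """Write a list of dict documents to out_file as CSV and return the column names."""
    # Documents are schema-less, so collect every field name that appears.
    columns = []
    for doc in documents:
        for key in doc:
            if key != "_id" and key not in columns:
                columns.append(key)

    # restval fills in missing fields; extrasaction="ignore" drops the _id field.
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(documents)
    return columns


# Hypothetical documents standing in for the result of a pymongo find() call.
docs = [
    {"_id": 1, "student_Name": "Abc", "percentage": 98},
    {"_id": 2, "student_Name": "Xyz", "percentage": 75, "city": "Pune"},
]
buffer = io.StringIO()
columns = documents_to_csv(docs, buffer)
print(buffer.getvalue())
```

To write a real file, pass open("students.csv", "w", newline="") instead of the buffer; the file name is only an example.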

Conclusion:-

This blog covered installing MongoDB and gave a short introduction to MongoDB and pymongo.

References

  1. The Definitive Guide to MongoDB: A Complete Guide to Dealing with Big Data Using MongoDB, Third Edition, by David Hows, Peter Membrey, Eelco Plugge, and Tim Hawkins
  2. Practical MongoDB: Architecting, Developing, and Administering MongoDB, by Shakuntala Gupta Edward and Navin Sabharwal
  3. MongoDB in Action, Second Edition, by Kyle Banker, Peter Bakkum, Shaun Verch, Douglas Garrett, and Tim Hawkins


Umakant_Shinde

Computer Science Engineer working in machine learning and data science. I’m trying to cover basic to advanced topics in the data science domain.