Table of Contents
1. Introduction
1.1 Purpose
1.2 Background
1.3 Resources
2. Body of Discussion
2.1 What are NoSQL Databases
2.2 CAP
2.3 BASE
2.4 Polyglot Persistence
2.5 MongoDB
Conclusion
References
1. Introduction
1.1 Purpose
The purpose of this report is to explain what NoSQL (Not Only SQL) databases and document databases are, using MongoDB as the example of a document database.
1.2 Background
Describing why NoSQL databases were created, Edlich (2015) states: “The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly.”
1.3 Resources
This report draws on websites, books, slides and lecture documents, which are referenced where they are used.
2. Body of Discussion
2.1 What are NoSQL Databases
NoSQL databases are designed to run on clusters of computers or servers and were built for the ever-increasing data storage needs of websites. They were devised as a way of scaling databases horizontally, which is a challenge with traditional relational databases. Scaling horizontally means adding more computers or servers as nodes to a database. These “clusters” work well with write-heavy systems and allow increased storage and processing power, limited only by the number of connections the network can support. Because NoSQL data structures are schema-less, they are not limited to the original data structure. Objects, fields, etc. can be implemented at
Another much-discussed database is Cassandra. It was originally developed at Facebook and released as open source in 2008 [6]. Facebook was among the first to use the system, for its inbox search feature, and because its performance met the requirements of Facebook's service level agreements, other companies such as Netflix and Twitter adopted Cassandra as a storage engine and as a backend for their streaming services [9]. What is Cassandra? It is an open source, highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers while providing high availability with no single point of failure. Its main goals are high performance, robust clustering across several data centers, and low-latency operation for its clients, which is why businesses favor it. It is written in Java. Research on NoSQL systems has concluded that Cassandra's scalability exceeds that of other database management systems and that it supports the largest number of nodes. It is designed as a distributed system that supports replication, including replication across multiple data centers, as well as the replacement of failed nodes without downtime [2]. Cassandra integrates with other open source tools such as Hadoop and Apache Pig. It is similar to a relational database since
Graph Databases – A few NoSQL databases store information in a graph model that scales across numerous machines. This model is appropriate for data whose relationships are best represented as a graph, for example public transport links, social relations, network topologies or road maps (Zaki, 2014).
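The public transport example can be made concrete with a minimal Python sketch. It is only an illustration of the graph model, not how any particular graph database stores data: the station names and the helper fewest_stops are hypothetical, and the graph is held in an ordinary in-memory dictionary rather than distributed across machines.

```python
from collections import deque

# Hypothetical transit network: each station maps to the stations it links to directly.
LINKS = {
    "Central": ["Museum", "Harbour"],
    "Museum": ["Central", "University"],
    "Harbour": ["Central", "Airport"],
    "University": ["Museum", "Airport"],
    "Airport": ["Harbour", "University"],
}


def fewest_stops(graph, start, goal):
    """Breadth-first search: return a path with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        for neighbour in graph.get(station, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal is not reachable from start


if __name__ == "__main__":
    print(fewest_stops(LINKS, "Central", "Airport"))
```

The point of the sketch is that the question "how do I get from A to B" follows the links directly, whereas a tabular model would need repeated self-joins on a table of links.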
SQL has dominated databases for a considerable length of time. The relational database model began to rise in the 1970s and quickly gained a foothold. It has been in use for some forty years, and SQL databases are still the most widely used kind of database. According to db-engines.com, four of the five most popular databases are relational; the only NoSQL database to break into the top five is MongoDB, which has overtaken PostgreSQL for fourth place. Some of the largest sites, including Facebook and Airbnb, use SQL to query their data. NoSQL will remain in use in the future because it can provide significant functionality and performance benefits for a
For the purposes of this paper, we focus on three NoSQL databases: BigTable, Cassandra, and DynamoDB.
With guidance from Watson, the concept of a database was created during the 1920s. IBM and the DBMS were born out of a spark fanned by the Bell System (Newton, 2004).
MongoDB was first developed in October 2007 by the software company 10gen, now called MongoDB Inc., as a component of a planned platform-as-a-service product. The company shifted to an open source development model in 2009, with 10gen offering commercial support and other services. Since then, MongoDB has been adopted as backend software by a number of major websites and services, including Craigslist, eBay, Foursquare, SourceForge, and The New York Times,
With the expansion of the internet, there has been exponential growth in the data collected from social media, search patterns and online transactions. This rapid growth has become difficult for relational databases to handle, so for such workloads they have increasingly been replaced with NoSQL databases. NoSQL databases are distributed databases that can store and process large volumes of data.
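One common way a distributed database spreads a large volume of data over several machines is to hash each record's key and use the hash to choose a node. The Python sketch below illustrates that idea under simplifying assumptions: the node names, the key format and the functions node_for_key and distribute are hypothetical, and it is not the placement scheme of any specific product.

```python
import hashlib


def node_for_key(key, nodes):
    """Choose a node by hashing the key; the same key always maps to the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys, nodes):
    """Return a mapping of node name to the list of keys placed on that node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members
    keys = [f"user:{i}" for i in range(12)]  # hypothetical record keys
    for node, held in distribute(keys, nodes).items():
        print(node, held)
```

A plain modulo scheme like this moves most keys when a node is added or removed, which is why production systems tend to use more elaborate schemes such as consistent hashing.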
In this paper we examine the key features of the database management system MongoDB. The amount of information generated every day is growing enormously. This information includes valuable data that must be analyzed to extract useful knowledge. Traditionally, relational databases have been used to process the data, and this approach works well for small amounts of data. But what if the data is very large? MongoDB was introduced to address this problem. MongoDB is a cross-platform, document-oriented database classified as a NoSQL database. NoSQL meets the requirements of large-scale distributed computing environments by providing scalability, high availability, high performance and reliability. NoSQL databases are increasingly used in big data and real-time web applications. Using NoSQL provides the benefit of storing data in a schema-less structure. NoSQL is not a brand new database technology; yet it provides the flexibility to handle complex semi-structured data and offers solutions optimized for different types of data in this data-intensive era of large-scale computing.
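The idea of schema-less, document-oriented storage can be sketched in a few lines of Python. This is an in-memory illustration only and does not use a MongoDB driver: the USERS collection, its field names and the find helper are hypothetical, chosen to show that documents in one collection need not share the same fields.

```python
# Hypothetical collection: each document carries its own set of fields.
USERS = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Lin", "tags": ["admin", "beta"]},
    {"_id": 3, "name": "Sam", "email": "sam@example.com", "address": {"city": "Leeds"}},
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion.

    A document that lacks a requested field simply does not match;
    no schema has to be declared for the query to work.
    """
    return [
        doc
        for doc in collection
        if all(field in doc and doc[field] == value for field, value in criteria.items())
    ]


if __name__ == "__main__":
    print(find(USERS, name="Lin"))
    print(find(USERS, email="sam@example.com"))
```

Adding a new field to one document requires no change to the others, which is the flexibility the paragraph above attributes to schema-less storage.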
For the challenges we face, whether technical or functional, we find a NoSQL database to be the best fit. NoSQL encompasses a wide variety of database technologies that were developed in response to rising data needs. Compared with the RDBMSs on the market, NoSQL also offers better performance and scalability. In search of the best solution, we surveyed the types of NoSQL database, including document databases, graph databases and key-value stores. Let us explore the main products of each type and find the best one.
There is a lot of buzz around Big Data and the NoSQL movement these days, and rightly so. The issues with data have essentially been two-fold: finding cost-effective ways to store ever-increasing amounts of data and information, and finding ways to mine this information to extract meaningful business intelligence.
NoSQL is an emerging class of non-relational database used to handle Big Data. The name stands for Not Only SQL. These databases address the problem of processing unstructured data: they do not use a fixed schema and do not rely on the table/key model used in an RDBMS (Relational Database Management System).
Today there are two major kinds of database management system. The first is the Relational Database Management System (RDBMS), the traditional relational database, which deals with structured data and has been popular since the 1970s. The second is the Not only Structured Query Language (NoSQL) database, which deals with semi-structured and unstructured data; NoSQL databases have been gaining popularity with the development of the internet and social media since April 2009. NoSQL databases aim to overcome the drawbacks of RDBMSs, such as fixed schemas, JOIN operations and scalability problems. In this paper we review a graph database (Neo4j), which belongs to the emerging NoSQL technology, and compare it with a traditional relational database (MySQL). MySQL has become almost synonymous with relational databases and has been in use for a long time. However, with the emergence of Big Data there was clearly a need for more flexible databases. Applications such as Facebook's Graph Search show how relationships need to be modeled in a more efficient and sophisticated manner than conventional relational models allow, which is the approach a graph database such as Neo4j takes. In this paper, we compare MySQL and Neo4j on features such as ACID, replication, availability and the language used in both of
A NoSQL (often interpreted as Not Only SQL) database provides a mechanism for storing and retrieving data that is modelled by means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability.
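To illustrate what modelling data by means other than tabular relations can look like, the Python sketch below takes two relational-style tables and reshapes them into nested documents. All names here (CUSTOMERS, ORDERS, embed_orders and the field names) are hypothetical, and embedding is only one of several possible non-tabular designs.

```python
# Relational-style data: two flat "tables" linked by a foreign key (customer_id).
CUSTOMERS = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Lin"},
]
ORDERS = [
    {"id": 10, "customer_id": 1, "item": "keyboard"},
    {"id": 11, "customer_id": 1, "item": "monitor"},
]


def embed_orders(customers, orders):
    """Build one nested document per customer with that customer's orders embedded."""
    documents = []
    for customer in customers:
        documents.append(
            {
                "_id": customer["id"],
                "name": customer["name"],
                # The join is resolved once, when the document is written.
                "orders": [
                    {"item": order["item"]}
                    for order in orders
                    if order["customer_id"] == customer["id"]
                ],
            }
        )
    return documents


if __name__ == "__main__":
    for document in embed_orders(CUSTOMERS, ORDERS):
        print(document)
```

Reading a customer together with their orders then takes a single lookup instead of a join, at the cost of duplicating or restructuring data when orders have to be queried on their own.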
NoSQL databases, including MongoDB, Redis, Cassandra, and the graph database Neo4j, have also emerged. Some of these tools run the entire database
Information technology continues to revolutionize human interaction in various ways, through social media, business, education and other channels. The internet has made it possible to transmit large amounts of data across many networks. These networks have made it possible to store, access and query billions of records in large databases. Innovation has given rise to a special language, known as SQL, used to manage and access all sorts of information within databases. More recently, a new generation of databases known as NoSQL has been developed. Document-oriented NoSQL databases store related data in JSON-like name-value documents and can store data without specifying a schema. One such type of NoSQL database that has been developed is the IBM Informix