A Study of NoSQL Implementation Techniques Utilized in Facilitating Business Intelligence
Group #2
Meghana Balihallimath
Yadnesh Bandekar
Sri Kartheek Dalapathi
Sarah Fulkerson
Manikandan Narayanan
Claudia Rodriguez
INSY 5337
May 5, 2015
Abstract: Introduction to NoSQL
NoSQL databases are a significant departure from the relational model that has dominated the business world for the past few decades. Standing for “Not Only SQL,” these products are all some variation of a non-relational, key-value pair database, and they are becoming very popular with companies that use Big Data and prioritize speed or availability over consistency of data.
There are four main types of NoSQL databases. The simplest NoSQL databases
These databases rely on the internal metadata of the documents in order to index the data, which may vary from document to document. Finally, graph databases excel at dealing with data that is highly interconnected, such as the relationships on a social networking site. The database consists of nodes which store both data and metadata about the relationship of the node to other nodes. Graph databases are frequently limited by the need to embed the entire database on one device.
In this paper, we explore a popular example of each type of database and examine what kinds of problems these products are best suited to solve.
Key-Value Databases: Redis
A key-value database is a type of unstructured database designed for very fast retrieval of information. The data is referred to as the “value” and is linked to by a unique “key” which can be composed of several different data types. It is quite different from a relational database in that it does not store values in tables, rows or columns. A relational database “User” table might look something like this:
ID   First Name   Last Name
1    Abc          Efg
2    Hij          Klm
3    Nop          qrs
Let’s say we want to create a database table for User records, but the information differs from user to user. Using a key-value database, the records might look something like the following:
Key: 1   ID: Kevin88   First Name: Sam
Key: 2   Google mail: jb@gmail.com   Location: London   Age: 37
Key: 3   Mavs ID: Mannone@mavs.uta.edu   Last Name: Mannone   Status:
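The records above can be sketched in a few lines of Python. This is only an illustration that uses an in-memory dictionary as a stand-in for a key-value store; it is not Redis, and the helper name `fields_for` is hypothetical. It shows that each key can point to a value with its own set of fields, with no shared table schema.

```python
# In-memory stand-in for a key-value store: a plain dict mapping keys to values.
# The field names mirror the example records above; this is not Redis itself.
store = {}

# Each value has its own shape; no record is forced to share columns with another.
store[1] = {"ID": "Kevin88", "First Name": "Sam"}
store[2] = {"Google mail": "jb@gmail.com", "Location": "London", "Age": 37}
store[3] = {"Mavs ID": "Mannone@mavs.uta.edu", "Last Name": "Mannone"}


def fields_for(key):
    """Return the sorted field names stored under a key, or [] if the key is absent."""
    return sorted(store.get(key, {}))


# Retrieval is a single lookup by key rather than a query over rows and columns.
user = store[2]
print(user["Location"])
print(fields_for(1))
```

Because the lookup goes straight from key to value, retrieval cost does not depend on how the other records are structured.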
Data objects can model relational data or advanced data types such as graphics, movies, and audio. Smalltalk, C++, Java, and others are languages used with object-oriented data. The object-relational model is a combination of relational and object-oriented databases. Traditional and advanced data types can be used to construct database management systems. These systems can connect to a company’s website and update records as needed.
Database Approach
The main purpose of a database is to store data so that it can be retrieved when needed. A popular common language called Structured Query Language (SQL) is used to store and retrieve data in a relational database. This language enables systems to run a report, modify data, or remove data from the database. A database management system (DBMS) controls all aspects of a database, including but not limited to its creation, maintenance, and use. The DBMS ensures that the proper applications are able to access the database. An important purpose of a DBMS is to maintain the data definitions (data dictionary) for all the data elements in the database. It also enforces data integrity and security measures.
Data Models
Data models provide a contextual framework and graphical representation that aid in the definition of data elements. In a relational database, the data model lays the foundation for the database and identifies important entities,
Since the 1960s, efficient management and retrieval of data has been a persistent problem because of growing needs in business and academia. To address it, a number of database models have been created. Relational databases allow data storage, retrieval and manipulation using a standard Structured Query Language (SQL). Until recently, relational databases were the optimal choice for enterprise storage. However, as the volume of stored and analyzed data has grown, relational databases have shown a variety of limitations, including limits on scalability, storage, and query efficiency at large data volumes [1] [2].
Graph database: Strength: designed for data whose relations are well represented as a graph and whose elements are interconnected. Graph databases are well-suited to irregular and complex structures. Weakness: Relationships are stored at the individual record level and use more
A key-value store database has a set of keys and values, and each value is associated with a key. A key-value store is typically implemented as a distributed hash table (Stonebraker, 04/2010). Key-value (KV) stores, a well-known class of NoSQL database, are widely deployed for data operation and management in Internet services because they offer better scalability, higher efficiency and greater availability than existing relational database systems (Wang, et al., 2014). Because KV stores sacrifice the relational model in exchange for fast writes, they typically expose only simple methods such as “put()”, “delete()” and “get()”.
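A minimal sketch of that interface is shown below. It assumes a toy, single-process model in which each Python dictionary stands in for one storage node, and a stable hash of the key decides which node owns it. The class name `TinyKVStore` and its parameters are hypothetical; a real distributed hash table also handles replication, node failure, and rebalancing, none of which is modeled here.

```python
import zlib


class TinyKVStore:
    """Toy key-value store that spreads keys over several in-memory 'nodes'."""

    def __init__(self, num_nodes=3):
        # Each dict stands in for one storage node of a distributed hash table.
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # A stable hash of the key decides which node owns it.
        index = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        """Store or overwrite the value for a key."""
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or the default if the key is absent."""
        return self._node_for(key).get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed, False otherwise."""
        node = self._node_for(key)
        if key in node:
            del node[key]
            return True
        return False


kv = TinyKVStore()
kv.put("user:1", "Sam")
print(kv.get("user:1"))
```

Every operation touches exactly one node, which is the property that lets this kind of store scale by adding nodes.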
Modern RDBMS technology cannot support unstructured data with optimal storage requirements. The schema design becomes complex and is therefore difficult for developers. Managing unstructured data is cumbersome with conventional RDBMS solutions (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). Moreover, an RDBMS turns out to be a costly solution for developing agile web applications with straightforward data analysis requirements. NoSQL is emerging as an efficient alternative in this scenario, one that addresses the issues associated with RDBMS technology. The market growth can be attributed to innovative launches of NoSQL solutions and to collaborative efforts by NoSQL vendors and users. The efforts of companies to improve their market offerings are creating demand for NoSQL as back-end support (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). The emergence of agile software development is also creating demand for NoSQL (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). NoSQL systems offer users many more ways to accept data in many different forms. NoSQL is as adaptable as SQL but offers more uses that can apply to many organizations.
Column-based or wide-column NoSQL systems: These systems partition a table by column into column families, where every column family is stored in its own files. They also allow versioning of data values.
Graph-based NoSQL systems: Data is represented as graphs, and related nodes can be found by traversing the edges using path expressions.
Data with the following characteristics is well suited to a NoSQL system: first, rapidly growing data volume; second, columnar growth of data; third, document and tuple data; and last, hierarchical and graph data. Data with the following characteristics may be better suited to a conventional relational database management system: On-Line Transaction Processing with atomicity, consistency, isolation, and durability (ACID) requirements, complex data relationships, and complex query requirements [2].
Apache Cassandra is an example of a BigTable-style database. Oracle Coherence and Kyoto Cabinet are examples of key-value stores. MongoDB and CouchDB are examples of document databases, and Neo4j and FlockDB are examples of graph databases [4]. I have selected document-based data modeling to compare and contrast with relational data modeling.
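The idea of partitioning a table into column families can be sketched as follows. This is an illustration only: the row data, the family names `profile` and `activity`, and the function `split_into_families` are all hypothetical, and nested dictionaries stand in for the separate files a wide-column system would use.

```python
# Hypothetical rows keyed by row key; columns may differ from row to row.
rows = {
    "user1": {"name": "Sam", "city": "London", "last_login": "2015-05-01"},
    "user2": {"name": "Ana", "city": "Dallas"},
}

# Hypothetical assignment of columns to column families.
families = {
    "profile": ["name", "city"],
    "activity": ["last_login"],
}


def split_into_families(rows, families):
    """Group each row's columns by family, so each family can be stored separately."""
    layout = {family: {} for family in families}
    for row_key, columns in rows.items():
        for family, names in families.items():
            subset = {name: columns[name] for name in names if name in columns}
            if subset:
                # Rows with no columns in this family take no space in it.
                layout[family][row_key] = subset
    return layout


layout = split_into_families(rows, families)
print(layout["activity"])
```

A query that needs only the `activity` columns can then read that family without touching the profile data.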
The need to store and evaluate data is perpetually growing in the world of information systems. From the days of flat files to very large database management systems that store petabytes of data in real time, the practice of building information from data continues to evolve. Today, the relational data model is ubiquitous and is used in a wide range of information systems, from accounting systems and banks to retail businesses and scientific applications. It is important to understand the concepts involved in data modeling for a relational database management system in order to build an effective and efficient system.
MongoDB is a NoSQL document database that is scalable and flexible yet still supports querying and indexing. MongoDB is free and open-source, so it can be modified to suit particular needs. (MongoDB, 2017b)
MongoDB is one of numerous cross-platform document-oriented databases. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster. Released under a combination of the GNU Affero General Public License and the Apache License, MongoDB is free and open-source software.
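To make "JSON-like documents with dynamic schemas" concrete, the sketch below models a small collection as a list of Python dictionaries and builds a simple index over one field. It does not use MongoDB or its drivers; the sample documents and the function `build_index` are hypothetical and only illustrate how an index can cope with documents that lack the indexed field.

```python
from collections import defaultdict

# Hypothetical collection: each document carries its own set of fields.
documents = [
    {"_id": 1, "name": "Sam", "city": "London"},
    {"_id": 2, "name": "Ana", "tags": ["admin"]},
    {"_id": 3, "name": "Raj", "city": "London", "age": 37},
]


def build_index(docs, field):
    """Map each value of `field` to the ids of the documents that contain it.

    Documents without the field are skipped. Indexed values must be hashable,
    so this sketch is only suitable for scalar fields such as strings or numbers.
    """
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc["_id"])
    return index


city_index = build_index(documents, "city")
print(city_index["London"])
```

The document with `_id` 2 has no `city` field, so it does not appear in the index, and no schema change was needed to add the `age` field to a single document.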
In this paper, we review a graph database (Neo4j), part of the emerging family of technologies called NoSQL, and compare it with a traditional relational database (MySQL). MySQL has become almost synonymous with relational databases and has been in use for a long time. However, with the emergence of Big Data there was clearly a need for more flexible databases. Facebook's Graph Search is an application that clearly shows how relationships need to be modeled in a more efficient and sophisticated manner than conventional relational models allow, which is the need that graph databases such as Neo4j address. We compare MySQL and Neo4j on features such as ACID support, replication, availability, and the query language each one uses.
There has been much speculation about whether modern NoSQL databases are vulnerable to NoSQL injection attacks. The aim of the paper was to research this issue, and after a thorough review the paper found that modern NoSQL databases are vulnerable to such attacks. The research problem was to identify how they are vulnerable. The causes identified include the use of JSON to inject attacks, missing admin authorization, the use of clear text, and the use of PHP applications to inject attacks on the database. The research also identified solutions to these problems, including encrypting text, setting admin passwords, validating input, and binding the NoSQL process to a single interface/IP, among others.
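The input-validation remedy can be sketched as a check that runs on a decoded JSON payload before it reaches the database. The example assumes MongoDB-style query operators, which are keys beginning with `$` (for instance `$ne`); the passage does not name a specific operator, so that choice, the sample payloads, and the function `find_operator_keys` are illustrative assumptions rather than the paper's method.

```python
import json


def find_operator_keys(payload):
    """Recursively collect keys starting with '$' from a decoded JSON payload."""
    found = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if isinstance(key, str) and key.startswith("$"):
                found.append(key)
            found.extend(find_operator_keys(value))
    elif isinstance(payload, list):
        for item in payload:
            found.extend(find_operator_keys(item))
    return found


# A login payload where the password is an operator object instead of a string.
malicious = json.loads('{"user": "admin", "password": {"$ne": null}}')
benign = json.loads('{"user": "admin", "password": "s3cret"}')

print(find_operator_keys(malicious))
print(find_operator_keys(benign))
```

An application would reject any request for which the returned list is non-empty, and would ideally also require that fields such as the password are plain strings.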
This report describes the process of designing NoSQL systems for data persistence, implementing those designs, and solving the tasks we were assigned. The dataset we worked with is a music dataset from Last.fm, and the designs for MongoDB, HBase and Neo4j are based on the dataset's features and the given queries. The implementation includes creating databases, setting up the schema and running queries, followed by performance testing. Each system's design was also iterated on in order to achieve higher performance.
For example, Facebook, the most popular social networking website, recently announced its adoption of a NoSQL-based graph data store for efficient storage of user data. In other words, NoSQL has already made its way into the enterprise. However, like every other widely accepted technology, NoSQL has its own set of advantages and disadvantages. It is important for an enterprise to weigh the pros and cons of a new database technology against existing solutions based on its own requirements. For example, legacy enterprise applications may require extensive support from their database vendors, and traditional relational database vendors such as Oracle have already established a reputation for excellent support. On the other hand, NoSQL has grown rapidly over the past few years and continues to evolve in big data handling, data warehousing and reduced complexity. Hence, there is a need to study the most popular NoSQL data stores on the current market and how well they fare against the widely accepted traditional database systems.
The purpose of this report is to explain what a NoSQL (Not Only SQL) database and a document database are, using MongoDB as the specific document database examined.
A graph database represents data and the relationships within it using concepts from graph data structures: nodes, edges and properties. Nodes represent the data entities, properties hold information about the nodes, and edges, which connect two nodes or a node and a property, represent the relationship between the connected elements. [1]
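The node, edge, and property concepts can be sketched with ordinary Python data structures. The people, the `FRIEND_OF` relationship name, and the helper functions are hypothetical, and no graph database product is involved; the aim is only to show how a relationship query becomes a walk along edges.

```python
# Nodes: each entity has an id and a dictionary of properties.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "bob": {"label": "Person", "age": 25},
    "carol": {"label": "Person", "age": 41},
}

# Edges: (source node, relationship type, target node).
edges = [
    ("alice", "FRIEND_OF", "bob"),
    ("alice", "FRIEND_OF", "carol"),
    ("bob", "FRIEND_OF", "carol"),
]


def neighbors(node_id, relation):
    """Return the nodes reachable from node_id over one edge of the given type."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


def friends_of_friends(node_id):
    """Return nodes two FRIEND_OF hops away, excluding the start node, sorted by id."""
    result = set()
    for friend in neighbors(node_id, "FRIEND_OF"):
        for candidate in neighbors(friend, "FRIEND_OF"):
            if candidate != node_id:
                result.add(candidate)
    return sorted(result)


print(neighbors("alice", "FRIEND_OF"))
print(friends_of_friends("alice"))
```

In a relational schema the two-hop query would need a self-join on a friendship table, whereas here it is a direct traversal of the stored edges.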