Introduction

Data warehousing and data mining have long been associated with manufacturing companies, where sales and profit are the main driving forces. Meanwhile, higher education has grown over the years, and this growth is predominantly associated with the increase in online institutions. As a result, higher education has adapted to operate more like a business (Lazerson, 2000). Because the line between higher education and traditional business has blurred, institutions need tools that supply valuable data and information for decision making. Using data well, and having the right data mining tools, can help ensure an institution’s success in many areas, such as identifying market trends, precision marketing, new products, performance management, grants and funding management, student life cycle management, and procurement, to mention a few. To get a better grasp of these benefits, it is important to understand data warehouses, data mining, and their associated benefits.
Data Warehouse

A data warehouse consists of multiple databases that work together. In other words, a data warehouse integrates data from other databases, which provides a better understanding of the data. Its primary goal is not just to store data but to give the business, in this case a higher education institution, a means to make decisions that can influence its success. The data warehouse accomplishes this by providing architecture and tools which organize and understand the
A data warehouse is defined as a collection of data designed to support management decision making. Data warehouses contain a wide variety of data that present a coherent picture of business conditions at a single point in time. Developing a data warehouse includes developing the systems that extract data from operational systems and installing the warehouse database system that gives managers flexible access to the data. The term data warehousing generally refers to the combination of many different databases across an entire enterprise (Webopedia).
Data mining is a way for companies to develop business intelligence from their data to gain a better understanding of their customers and operations and to solve complex organizational problems.
An example of how data mining is conducted and used to benefit business can be explained in the following scenario:
Growing up in a family that has been in the international trade business for the last hundred years, I was always amazed to see how data science gradually became part of our family business. I have also gained insight into data science tools and into how data science improved our business decision making and performance. During the past three years, my postgraduate studies in Marketing and Finance have supported my professional career, and through various school projects I explored the real scope of marketing and developed my analytical skills. Moreover, studying Marketing, Finance, Statistics and Mathematics has developed my analytical thinking, logical reasoning and decision-making skills and motivated me to delve further into the data science field.
Data mining allows companies to focus on the more important information in their data warehouses. Data mining can be broken down into two major categories: automated prediction of trends and behaviors, and automated discovery of previously unknown patterns. In the first category, data mining automates the process of finding predictive information in large databases. Questions that traditionally required exhaustive hands-on analysis can now be answered quickly and directly from the data. In the second category, data mining tools sweep through databases and identify previously hidden patterns in one step. This second category is where the major focus of research has been.
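To make the two categories concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `frequent_pairs` and `linear_forecast`, the shopping baskets and the sales figures are all hypothetical and were invented for this example. Counting item pairs stands in for pattern discovery, and a least-squares trend line stands in for prediction; real data mining tools use far more sophisticated methods.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Discovery: return item pairs that occur together in at least
    min_support transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort the de-duplicated items so each pair has one canonical order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def linear_forecast(values):
    """Prediction: fit a least-squares line to equally spaced values
    and return the forecast for the next period.
    Needs at least two values."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


# Hypothetical sample data, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk", "eggs"],
]
monthly_sales = [100, 110, 120, 130]

print(frequent_pairs(baskets, min_support=3))
print(linear_forecast(monthly_sales))
```

In this toy data, bread and milk are the only pair bought together in at least three baskets, and the sales series rises by a constant amount each month, so the trend line simply continues that rise.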
Data, data everywhere. Data is a precious asset that will outlast the systems that hold it. In this challenging world, there is high demand to work efficiently without the risk of losing any small piece of information that might be very important in the future. Hence large volumes of data need to be stored and explored for future analysis. I have always been fascinated by how this large amount of data is handled, stored in databases and manipulated to extract useful information. Raw data is like an unpolished diamond: its value is known only after it is polished. Similarly, the value of data is understood only after meaning is drawn out of it, a process known as data mining.
Data warehouses, in contrast, are targeted for decision support. Historical, summarized and consolidated data is more important than detailed, individual records. Since data warehouses contain consolidated data, perhaps from several operational databases, over potentially long periods of time, they tend to be orders of magnitude larger than operational databases; enterprise data warehouses are projected to be hundreds of gigabytes to terabytes in size. The workloads are query intensive with mostly ad hoc, complex queries that can access millions of records and perform a lot of scans, joins, and aggregates. Query throughput and response times are more important than transaction throughput.
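The kind of query described above can be sketched with Python's built-in `sqlite3` module. This is a toy illustration, not a real warehouse: the `product` and `sales` tables, their columns, the data and the function name `sales_by_category_and_year` are all assumptions made for the example. The point is only the shape of the workload, which joins tables and aggregates many detailed rows into a summary.

```python
import sqlite3

# An in-memory database standing in for a warehouse (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE sales (product_id INTEGER, year INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'books'), (2, 'music');
    INSERT INTO sales VALUES
        (1, 2015, 100.0), (1, 2016, 150.0),
        (2, 2015, 80.0),  (2, 2016, 60.0), (1, 2016, 50.0);
""")


def sales_by_category_and_year(connection):
    """Ad hoc decision-support query: join the detail rows to their
    category and summarize the amounts per category and year."""
    query = """
        SELECT p.category, s.year, SUM(s.amount) AS total
        FROM sales AS s
        JOIN product AS p ON p.product_id = s.product_id
        GROUP BY p.category, s.year
        ORDER BY p.category, s.year
    """
    return connection.execute(query).fetchall()


for row in sales_by_category_and_year(conn):
    print(row)
```

An operational system would typically read or update one sales record at a time. The query here scans every sales row to produce a consolidated, historical view, which is why response time on such scans matters more to a warehouse than transaction throughput.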
Data mining is “[t]he process of finding significant, previously unknown, and potentially valuable knowledge hidden in data” (Gordon, 2007). Organizations use data mining to sift through massive quantities of raw data in order to find patterns and relationships that will ultimately be used for business purposes (Definition of: Data mining, 2016). Organizations mainly use data mining to get a better idea of their customers’ purchasing habits, product preferences, etc. in order to create sales tactics targeted at a certain customer demographic (Definition of: Data management, 2016).
Data mining will have a different effect on different industries in the business world. In the telecommunications industry, for example, in order to retain
Data Mining is defined as extracting information from huge sets of data. In other words, we can say that data mining is the procedure of mining knowledge from data. There is a huge amount of data available in the Information Industry. This data is of no use until it is converted into useful information. It is necessary to analyse this huge amount of data and extract useful information from it. Extraction of information is not the only process we need to perform; data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern Evaluation and Data Presentation [12].
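The sequence of processes listed above can be sketched as a small Python pipeline. This is a simplified illustration under stated assumptions: the records, the two "store" sources and every function name are hypothetical, and each step is reduced to a single line of logic, whereas real cleaning, integration and mining steps are far more involved.

```python
from collections import Counter

# Hypothetical records from two source systems, invented for illustration.
store_a = [{"customer": "ann", "item": "Milk "},
           {"customer": "bob", "item": None}]
store_b = [{"customer": "cat", "item": "milk"},
           {"customer": "dan", "item": "Bread"},
           {"customer": "eve", "item": "MILK"}]


def clean(records):
    """Data cleaning: drop records with a missing item."""
    return [r for r in records if r["item"]]


def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [r for source in sources for r in source]


def transform(records):
    """Data transformation: normalize item names to one format."""
    return [{**r, "item": r["item"].strip().lower()} for r in records]


def mine(records):
    """Data mining: count how often each item occurs."""
    return Counter(r["item"] for r in records)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only items seen at least min_count times."""
    return {item: n for item, n in patterns.items() if n >= min_count}


def present(patterns):
    """Data presentation: format the results as readable lines."""
    return [f"{item}: {n} purchases" for item, n in sorted(patterns.items())]


def run_pipeline(sources, min_count=2):
    """Run the six steps in the order listed in the text."""
    cleaned = [clean(source) for source in sources]
    combined = integrate(*cleaned)
    return present(evaluate(mine(transform(combined)), min_count))


print(run_pipeline([store_a, store_b]))
```

Without the cleaning and transformation steps, the record with a missing item would break the count, and "Milk ", "milk" and "MILK" would be counted as three different products, which illustrates why the data is of little use until it has been prepared.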
A data warehouse is simply a consolidation of data from a variety of sources that is designed to support strategic and tactical decision making. In other words, a data warehouse consists of different data sources and provides access to data in a way that expands the frequency and depth of data analysis. For these reasons, the data warehouse is the foundation for business intelligence. Its main purpose is to provide a coherent picture of the business at a point in time. Using various data warehousing toolsets, users are able to run online queries and “mine” their data. Companies that build data warehouses and use business intelligence for decision making ultimately save money and increase profit. Moreover, many successful companies have invested large sums of money in business intelligence and data warehousing tools and technologies. They believe that up-to-date, accurate and integrated information about their supply chain, products and customers is critical for their very survival.
The concept of data warehousing dates back to the late 1980s, when IBM researchers Barry Devlin and Paul Murphy developed the “business data warehouse”. A data warehouse (DW) is an application that allows you to execute ad hoc queries, perform multi-dimensional analysis and query information by
There is currently a very strong need for a defined and standardized process for requesting work related to the data warehouse. Not only do the work requests need to be standardized, but the development methodology also needs to be defined, with an agile approach fitting best with the direction and culture of the company. Clearly defining the steps for both will greatly improve the efficiency of the team. The team should be able to complete more projects as well as improve the quality of the products it produces. With fewer mistakes and errors and more work being completed, the return on investment for the data warehouse technology will be greatly increased.
The first three chapters of the book are written so that beginners can get a clear view of the basic concepts. The first chapter describes the need for strategic information, the information crisis, and why data warehousing is a better solution to that crisis. The features and components of a data warehouse, along with the concept of and need for metadata, are described. The author mentions various trends in data warehousing based on his own industrial experience. Areas like Continued growth in data warehousing
Abstract: Data mining is a logical process used to search through, or “mine”, large amounts of data in order to find useful data [2]. Knowledge Discovery from Data (KDD) is a synonym for data mining [13]. Many different types of techniques can be used to retrieve information from large amounts of data, and each type of technique will generate different results. The type of data mining technique that should be selected depends on the type of business problem that we are trying to solve.