Assignment -3
Literature Review
Conversion of the XML Schema to Data Warehouse Schema
Introduction: eXtensible Markup Language (XML) is widely used by organizations for e-commerce and online applications. Indeed, XML has become the standard for representing and exchanging data among applications on the internet. An XML schema describes the structure of an XML document, and XML data is associated with that schema. A data warehouse, in turn, provides tools with which businesses use data to make important decisions. The data is stored in fact tables and dimension tables, and the associations between these tables are generally represented with one of three data warehouse schemas: a) star schema, b) fact constellation schema, c) snowflake schema. As use of the internet increases day by day, the approach is first to integrate the data and then to convert the XML schema, by way of a schema graph, into the various data warehouse schemas. The schema graph is taken as the model for the conversion: the data extracted from the XML schema is transformed into the various warehouse schemas. Consequently, the data warehouse schema is constructed from the fact tables, the dimension tables, and the relations existing between the graph and the tables.
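The conversion described above can be illustrated with a minimal sketch. It is not the method of any particular paper: the sample schema, the function names, and the heuristic (complex child elements become dimension tables, numeric leaf elements become measures of the fact table) are all assumptions made for illustration. It also assumes the schema uses inline complex types and the `xs:` prefix.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical XML schema, invented for this illustration only.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="sale">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="quantity" type="xs:integer"/>
        <xs:element name="amount" type="xs:decimal"/>
        <xs:element name="product">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="category" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="store">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="city" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Assumed set of type names treated as numeric (measure candidates).
NUMERIC_TYPES = {"xs:integer", "xs:decimal", "xs:int", "xs:double", "xs:float"}


def child_elements(element):
    """Return xs:element nodes declared directly in an inline complex type."""
    return element.findall(f"{XS}complexType/{XS}sequence/{XS}element")


def build_schema_graph(root_element):
    """Map each complex element name to its (child name, child type) pairs."""
    graph = {}
    stack = [root_element]
    while stack:
        node = stack.pop()
        children = child_elements(node)
        if not children:
            continue  # leaf element: no outgoing edges in the graph
        graph[node.get("name")] = [(c.get("name"), c.get("type")) for c in children]
        stack.extend(children)
    return graph


def to_star_schema(graph, fact_name):
    """Derive one fact table and its dimension tables from the schema graph."""
    fact = {"table": fact_name, "measures": [], "foreign_keys": []}
    dimensions = {}
    for child, type_name in graph[fact_name]:
        if child in graph:
            # Assumed heuristic: a complex child element becomes a dimension.
            dimensions[child] = [name for name, _ in graph[child]]
            fact["foreign_keys"].append(f"{child}_id")
        elif type_name in NUMERIC_TYPES:
            # Assumed heuristic: a numeric leaf element becomes a measure.
            fact["measures"].append(child)
    return fact, dimensions


schema_root = ET.fromstring(XSD)
graph = build_schema_graph(schema_root.find(f"{XS}element"))
fact, dimensions = to_star_schema(graph, "sale")
print(fact)
print(dimensions)
```

A snowflake schema would follow from applying the same step recursively to each dimension, and a fact constellation from running it for several fact elements that share dimensions.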
In a data warehouse, analyzing large volumes of data supports the decision-making process. Indeed, in the data warehouse, the integration of the data from the
The data warehouse contains all the information that both chain managers and personnel can access. This information helps them see which products are selling and in what quantity, where the most important points of sale are, which items are needed in inventory, and which items need to be checked for quality. Similarly, these databases contain solid information about consumers, such as the ratio of repeat customers, which age group should be targeted for advertising, which new group is emerging, and how to stay in touch with consumers about new products and sales.
Data management is vital to any business because it is a key tool for an organisation's business improvement: data can be referred back to and compared against benchmarks. Analysing data can provide evidence to guide future structure, for example by identifying trends, and can indicate where improvements can be made. However, there are strict procedures to be followed when collecting and storing data.
A data warehouse is a large database organized for reporting. It preserves history, integrates data from multiple sources, and is typically not updated in real time. The key components of data warehousing are access to the data of the operational systems, the data staging area, the data presentation area, and the data access tools (HIMSS, 2009). The goal of the data warehouse platform is to improve decision-making for clinical, financial, and operational purposes.
Data warehouse analysts bridge the present state of the business with its future. Their role is critical to a company’s ability to make sound business decisions. A data warehouse analyst is responsible for data design, database architecture, and metadata and
Data warehouses, in contrast, are targeted for decision support. Historical, summarized and consolidated data is more important than detailed, individual records. Since data warehouses contain consolidated data, perhaps from several operational databases, over potentially long periods of time, they tend to be orders of magnitude larger than operational databases; enterprise data warehouses are projected to be hundreds of gigabytes to terabytes in size. The workloads are query intensive with mostly ad hoc, complex queries that can access millions of records and perform a lot of scans, joins, and aggregates. Query throughput and response times are more important than transaction throughput.
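A query of the kind described above, one that joins a fact table to its dimensions and aggregates the result, can be sketched with Python's built-in `sqlite3` module. The table names, columns, and rows are hypothetical and chosen only to keep the example small; a real warehouse would hold far more data and would normally run on a dedicated platform.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical star schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'books'), (2, 'toys');
    INSERT INTO dim_date VALUES (1, 2022), (2, 2023);
    INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5), (1, 2, 5.0);
    """
)

# Decision-support style query: scan the fact table, join both
# dimensions, and aggregate sales by category and year.
rows = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.amount) AS total_amount
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()

for category, year, total in rows:
    print(category, year, total)
```

Queries of this shape read many rows and write none, which is why response time and query throughput matter more here than transaction throughput.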
Given the importance of data, business employees would benefit from preserving information through an improved design of the database that stores it. Even when information is stored adequately as is, the database design can be taken through normalization to improve its organization.
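As a small illustration of normalization, the sketch below splits a flat order list, in which customer details are repeated on every row, into a customer table and an order table linked by `customer_id`. The records and field names are hypothetical, and the example covers only this one decomposition step, not the full set of normal forms.

```python
# Hypothetical denormalized rows: customer details repeat on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ada",
     "customer_city": "Leeds", "total": 25.0},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ada",
     "customer_city": "Leeds", "total": 40.0},
    {"order_id": 3, "customer_id": 11, "customer_name": "Ben",
     "customer_city": "York", "total": 12.5},
]


def normalize(rows):
    """Split flat rows into a customer table and an order table."""
    customers = {}  # keyed by customer_id, so each customer is stored once
    orders = []
    for row in rows:
        customers[row["customer_id"]] = {
            "name": row["customer_name"],
            "city": row["customer_city"],
        }
        # The order keeps only a reference to the customer.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "total": row["total"],
        })
    return customers, orders


customers, orders = normalize(flat_orders)
print(customers)
print(orders)
```

After the split, a change to a customer's city is made in one place instead of on every order row, which is the organizational gain that normalization aims for.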
Businesses today continue to strive and grow, and to keep up with the never-ending changes in their industry they need tools for obtaining information that can be used to make decisions. Such decisions can include which geographic region to focus on, which product lines to expand, and which markets to strengthen. To obtain information with the proper content and format to support strategic decisions, businesses turned to data warehousing. It became the new paradigm intended specifically for vital strategic information.
"A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process."
Data warehousing is defined as the design and implementation of processes and tools to manage and deliver complete, timely, accurate, and understandable data for decision making. It includes all the activities that make it possible for an organization to create, manage, and maintain a data warehouse or data mart (Williams & Williams, 2007). Data warehousing mainly deals with managing the development, implementation, and operation of a data warehouse or data store. It includes metadata management, data acquisition, data archiving, data cleansing, storage management, data integration, data distribution, security management operational
In his webinar, Philip Russom provides an overview of what data warehouse (DW) modernization is, why many users’ DWs need modernization, the five most common reasons for DW modernization (advanced analytics, scale, speed, productivity, and cost control), what results from modernization, and his recommendations
A data warehouse, known in many industries as an enterprise data warehouse, is a system that contains a central repository of integrated data, often collected from multiple sources. It is used to perform data analysis, enabling the creation of detailed reports that contribute significantly to a corporation’s business intelligence. Data warehousing emerged as a result of advances in the field of information systems over the last several decades. Two major factors drive the need for data warehousing in most organizations. First and foremost, businesses require an integrated, company-wide view of high-quality information to maintain and improve their strategic position. Secondly, information systems departments must separate informational systems from operational systems to improve performance dramatically in managing company data. Data mining, which is critical to the success of a data warehousing system, allows companies to create customer profiles, manipulate information easily, and gain informed access to the current state of their company. However, many companies find out the hard way that data mining and data warehousing do not work for them. As with many new tools and technologies, companies may jump on the bandwagon without fully considering the potential weaknesses. In order to remain competitive in today’s business world, companies should consider implementing data warehouses, but only with
The data warehouse comes ready for use, but an organization has to prepare itself to use it. The main factor is how the data warehouse will be used: for example, management staff can use it for decision making.
Summary: The textbook I have chosen is “The Data Warehouse Toolkit”, third edition, written by Ralph Kimball and Margy Ross. The book mainly covers techniques for developing the business in real time. Because the authors have worked in the field since the 1980s, they have seen both the growth and the failures of companies in the market. The chapters cover the goals of data warehousing, along with the data staging area, data presentation, and data access tools. The Kimball modeling techniques involve gathering business requirements and data realities, identifying business processes, and applying different table techniques. The retail sales case study explains the four-step dimensional design process, which works through the design with the help of different dimensions and facts. The order management chapter deals with the business processes to be implemented in data warehouses, since they supply core business performance metrics, and finally addresses real-time warehousing requirements. The customer relationship management chapter is about improving the customer's relationship with the company or product; its goal is understanding the needs of the customer and providing high-level service. The accounting chapter models general ledger information for the data warehouse; it describes the years and dates at which events occur and shows different dimensional models which help to combine the data from
Data warehousing is one of the hottest industry trends, and for good reason. A well-defined and properly implemented data warehouse can be a valuable competitive tool (Perkins).
A data warehouse consists of multiple databases that work together. In other words, a data warehouse integrates data from other databases, which provides a better understanding of the data. Its primary goal is not just to store data but to give the business, in this case a higher education institute, a means to make decisions that can influence its success. This is accomplished by the data warehouse providing an architecture and tools which organize and understand the