The success of a database and data warehouse (DW) project depends heavily on the quality of its data. If data quality is poor, the information business users retrieve from the database/DW environment will be unreliable. Good-quality data helps decision makers make the right decisions, gain trust, and make the organization more efficient; poor-quality data, in contrast, drives decision makers toward wrong decisions. Debbarma, Nath and Das (2013) stated that “good quality of data will enable DW environment to provide right information in the right place at the right time with the right cost in order to support the right decision”. Thus, data quality needs to be maintained.
The most prominent issue that recurs is duplicated records, “the records that represent the same real-world object in numerous ways” (Christie, Timothy, 2005). Such duplicates can cause many significant problems, so a data cleaning strategy is essential to eliminate these redundancies. Duplicate elimination is a challenging task, however, because duplicates arise from many different types of errors, such as typographical errors, null values, abbreviations, word transformations, and different representations of the same word. Researchers and scholars have proposed many algorithms to detect and eliminate duplicated data, including the standard duplication elimination algorithm (SDE), the adaptive duplication detection algorithm (ADD), the sorted neighborhood algorithm (SNA), and the duplicate elimination sorted neighborhood algorithm (DE-SNA). Most of these algorithms apply, in different ways, character-based, phonetic, numeric, and semantic similarity measure techniques to achieve the duplicate detection and elimination goal. To measure the similarity between characters, numbers, and semantic words, these techniques use standard string similarity functions such as edit distance, generalized edit distance, Hamming distance, the cosine metric, and the Jaccard coefficient. There are many approaches.
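As a concrete illustration, here is a minimal Python sketch of two of the similarity functions named above, edit distance and the Jaccard coefficient, applied to duplicate detection; the sample records are invented for the example.

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance: the minimum
    # number of insertions, deletions, and substitutions turning a into b.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def jaccard(a, b):
    # Jaccard coefficient over word tokens: |A ∩ B| / |A ∪ B|.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

# Two records representing the same real-world entity, differing by a
# typographical error and an abbreviation:
print(edit_distance("Jon Smith", "John Smith"))                 # → 1
print(jaccard("IBM Corp New York", "IBM Corporation New York"))  # → 0.6
```

A small edit distance or a high Jaccard score flags a candidate duplicate; real algorithms such as SNA then restrict these pairwise comparisons to a sorted sliding window to stay efficient.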
Technology: Technology can single-handedly improve a data reporting system if applied properly. It can solve problems such as a lack of standardization, unreliable electrical power, and missing system backups. Relying on standalone databases, rather than deploying standard enterprise databases that align data as it arrives and check for accuracy and quality issues in real time, forfeits these benefits.
Data management is vital to any business, as it is a key tool for organisational improvement: data can be referred back to and compared against benchmarks. Analysing data can provide evidence for possible future strategy, such as identifying trends, and can indicate where improvements can be made. However, there are strict procedures to be followed when collecting and storing data.
Abstract: Data quality concerns the quality of the data itself; data is generally expected to be of high quality. This research paper explains how data quality can be maintained over large observational data, which has brought many challenges to researchers. It also explains how a data quality monitor checks data through user-defined algorithms and analyses how the data is processed. Finally, it describes the six features that ensure strategic planning for data quality.
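A data quality monitor driven by user-defined algorithms, as described above, might look like the following minimal sketch; the rule names, field names, and sample records are illustrative assumptions, not taken from the paper's system.

```python
# Each user-defined rule is a predicate over a record (a dict here).
def not_null(field):
    return lambda rec: rec.get(field) is not None

def in_range(field, lo, hi):
    return lambda rec: rec.get(field) is not None and lo <= rec[field] <= hi

RULES = {
    "patient_id present": not_null("patient_id"),   # hypothetical rule
    "age in 0..120": in_range("age", 0, 120),       # hypothetical rule
}

def monitor(records):
    # Report, per rule, the fraction of records that pass it.
    report = {}
    for name, rule in RULES.items():
        passed = sum(1 for r in records if rule(r))
        report[name] = passed / len(records)
    return report

records = [
    {"patient_id": 1, "age": 34},
    {"patient_id": None, "age": 34},   # missing identifier
    {"patient_id": 3, "age": 400},     # out-of-range observation
]
print(monitor(records))
```

Low pass rates on a rule point the researcher at the subset of observational data that needs cleaning before analysis.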
Data is a vital asset in every business, especially in today's dynamic world, where optimal use of data leads to success in a shorter span of time and many companies struggle to obtain truthful, accurate data. This data must be analyzed at the right time and in the proper way so that decisions are more effective, but the data we receive is often highly redundant and occupies a lot of space in our systems. This creates a challenge for analytics teams: removing the redundancy and surfacing only the relevant data that aids the decision-making process. Master Data Management is a solution for analysts who want to eliminate an organization's redundant and inconsistent data (Vinculum, 2016).
In order to reach this goal, many issues need to be addressed. The first is that, to ensure the data in the data warehouse is correct, there needs to be strong data governance by all users. The second concern is that users of the current systems will not
What information is accessible? The data warehouse defines what is offered through metadata, published information, and parameterized analytic applications. Is the data of high value? Data warehouse users expect reliability and value, so the presentation area's data must be correctly organized and safe to consume. In terms of design, the presentation area should be planned for the convenience of its consumers; it must be designed around the preferences expressed by the data warehouse users, not the staging administrators. Service is also critical in the data warehouse: data must be delivered, as requested, promptly, and in a form that suits the business user or the reporting/delivery application designer. Lastly, cost is a factor for the data warehouse.
A data warehouse covers different subject areas of data, each handled by a specific data mart; a data mart deals with one subject area and is considered a subset of the data warehouse. At Indiana University, the traditional data warehouse was unable to provide large-scale data storage, and it surfaced errors from the rules imposed on the data. The early-binding method is a disadvantage: it takes longer to get an enterprise data warehouse (EDW) up and running, because the entire EDW, down to every business rule, must be designed from the outset. A late-binding architecture is more flexible, binding data to business rules during data modeling and processing. Health Catalyst's late binding is flexible, and the raw data remains available in the data warehouse. It produced results within 90 days and stores IU's data without errors.
Next to the type of information provided by a properly designed and built data system, the integrity of the data supporting that information is the most critical result. Data integrity speaks to the comprehensive accuracy and consistency of the data, with its reliability as the foundational aspect. In business this is crucial, as key decisions are made daily, at all levels of management, based on database system outputs.
This combined methodology presents a measure of semantic similarity that draws on both predefined human labels obtained from previous resolutions and their implementations. A probabilistic model computes similarity over a data set formulated with LDA and BM25, and is implemented in Python and Java to push the data through. Our results show that when labeling is ambiguous, using K-means with several clusters leads to better category cleaning; otherwise, a logistic function leads the user to a better selection. These methods make our system transparent, consistent through its labels, and simple to use.
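The K-means step mentioned above can be sketched as follows; this is a plain, pure-Python k-means over one-dimensional similarity scores, an illustrative stand-in for the paper's actual LDA/BM25 pipeline, and the score values are invented.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # Plain k-means on 1-D feature values (e.g. document similarity
    # scores): assign each point to its nearest center, then move each
    # center to the mean of its cluster, and repeat.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Ambiguously labeled items whose similarity scores fall into two groups:
scores = [0.1, 0.15, 0.2, 0.8, 0.85, 0.9]
centers, clusters = kmeans(scores, k=2)
print(sorted(round(c, 2) for c in centers))  # → [0.15, 0.85]
```

Items grouped around the same center would then be reviewed together, which is what makes cluster-based category cleaning effective when individual labels are ambiguous.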
Be it validating the data of a medical device, a database, or an instrument, assuring data completeness and accuracy does not pertain only to individual components. It is more about managing the entire lifecycle of an organization's enterprise data and ensuring data integrity throughout its IT systems. The same holds for SharePoint®.
A data warehouse is a large database organized for reporting. It preserves history, integrates data from multiple sources, and is typically not updated in real time. The key components of data warehousing are access to the data of the operational systems, a data staging area, a data presentation area, and data access tools (HIMSS, 2009). The goal of the data warehouse platform is to improve decision-making for clinical, financial, and operational purposes.
Before a data set can be mined, it first has to be “cleaned”. This cleaning process removes errors, ensures consistency, and takes missing values into account. Next, computer algorithms are used to “mine” the clean data, looking for unusual patterns. Finally, the patterns are interpreted to produce new knowledge.3
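The cleaning steps described above can be sketched in a few lines of Python; the record layout, field names, and fill strategy here are assumptions made for illustration, not from the cited source.

```python
raw = [
    {"name": "Alice", "age": "34"},
    {"name": "alice ", "age": "34"},      # inconsistent casing/whitespace
    {"name": "Bob", "age": None},         # missing value
    {"name": "Carol", "age": "notanum"},  # erroneous value
]

def clean(records, default_age=0):
    out, seen = [], set()
    for r in records:
        name = r["name"].strip().title()   # ensure consistency
        try:
            age = int(r["age"])            # remove errors in numeric fields
        except (TypeError, ValueError):
            age = default_age              # account for missing/bad values
        if name not in seen:               # drop duplicates exposed by normalization
            seen.add(name)
            out.append({"name": name, "age": age})
    return out

print(clean(raw))
```

Only after this pass does a mining algorithm see a consistent table, so the "unusual patterns" it finds reflect the data rather than its defects.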
The make-or-buy analysis depends heavily on the accuracy of the company's database. Therefore, I need to make sure that I maintain the database and update its information correctly. Moreover, I also need to perform several analyses across numerous files in the database, which requires the ability to analyze a huge amount of data effectively and in a timely manner.
The data warehouse comes ready for use, but an organization has to prepare itself to use it. The main factor is data warehouse usage: a data warehouse can be used for decision making by management staff.
A data warehouse is made up of multiple databases that work together; in other words, it integrates data from other databases, which provides a better understanding of the data. Its primary goal is not just to store data but to give the business, in this case a higher-education institute, a means to make decisions that can influence its success. This is accomplished by the data warehouse providing architecture and tools that organize and make sense of the data.