Abstract
Multimedia data mining is a popular research domain that helps extract interesting knowledge from multimedia data sets such as audio, video, images, graphics, speech, text, and combinations of several of these types. Multimedia data are normally categorized as unstructured or semi-structured. These data are stored in multimedia databases, and multimedia mining is used to find useful information in large multimedia database systems by means of various multimedia techniques and powerful tools. This paper presents the basic concepts of multimedia mining and its essential characteristics. Multimedia mining architectures for structured and unstructured data, research issues in multimedia mining, data mining models used for
Text data can be used in web browsers and in messages such as MMS and SMS. Image data can be used in artwork and in pictures with text, such as still images taken by a digital camera. Audio data include sound, MP3 songs, speech, and music. Video data include time-aligned sequences of frames, such as MPEG videos from desktops, cell phones, and video cameras [17]. Electronic and digital ink, that is, the time-aligned sequence of 2D or 3D coordinates of a stylus, a light pen, data glove sensors, or similar graphical devices, is stored in a multimedia database and used to develop a multimedia system. Figure 1 gives the important components of multimedia data mining.
Figure 1. Multimedia Data Mining
Text mining
Text mining, also referred to as text data mining, is used to find meaningful information in unstructured text from various sources. Text is the most common medium for the formal exchange of information [3]. Text mining evaluates large amounts of natural language text and detects specific patterns in it to find useful information.
Image mining
Image mining systems can discover meaningful information or image patterns in a huge collection of images. Image mining determines how the low-level pixel representation contained in a raw image or image sequence can be processed to recognize high-level spatial objects and relationships [14]. It includes digital image processing, image understanding,
Thus our proposed optimal feature subset selection, based on multi-level feature subset selection, produced better results in terms of the number of subset features produced and classifier performance. The future scope of the work is to use these features to annotate image regions, so that the image retrieval system can retrieve relevant images based on image semantics.
Data Mining. It is the process of discovering interesting knowledge and significant structures from large amounts of data stored in a data warehouse or other information storage.
Text mining, sometimes known as text data mining, often refers to the process of extracting interesting and non-trivial patterns of knowledge from semi-structured or unstructured text documents. Text mining can also serve as an extension of data mining, or of knowledge discovery from a structured database. Text mining resembles data mining but is somewhat more complex: the two carry out much the same processes and have the same purpose, but in text mining the data are unstructured rather than structured, and come from files such as PDF, Word, and XML documents. Because most people store information in the form of text, it is believed that text mining may have greater potential than data mining; a number of studies in recent years indicate that 80% of business information is stored in text format.
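The step from unstructured text to something a data mining method can process can be illustrated with a bag-of-words representation. The following is a minimal sketch, not a method taken from the text above: the function names `tokenize` and `bag_of_words` and the two sample sentences are assumptions made for illustration, and real systems usually add stop-word removal, stemming, and term weighting.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Turn raw documents into count vectors over a shared vocabulary."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is every distinct token, sorted for a stable column order.
    vocabulary = sorted({tok for tokens in tokenized for tok in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Hypothetical sample documents, used only to show the output shape.
docs = [
    "Text mining finds patterns.",
    "Data mining finds patterns in data.",
]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['data', 'finds', 'in', 'mining', 'patterns', 'text']
print(vectors)     # [[0, 1, 0, 1, 1, 1], [2, 1, 1, 1, 1, 0]]
```

Once each document is a row of numbers, it has the same tabular shape as structured data, so conventional data mining techniques can be applied to it.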
Data mining is the process through which previously unknown patterns in data are discovered. Another definition would be “a process that uses statistical, mathematical, artificial intelligence, and machine learning techniques to extract and identify useful information and subsequent knowledge from large databases.” This includes most types of automated data analysis. A third definition: data mining is the process of finding mathematical patterns in (usually) large sets of data; these can be rules, affinities, correlations, trends, or prediction models.
When manipulating massive image databases, good indexing is necessary. Processing every single item in a database when performing queries is extremely inefficient and slow. When working with images, the feature vectors are used as the basis of the index. Popular multi-dimensional indexing methods include the R-tree and the R*-tree algorithms (Long et al., 2003). The Self-Organizing Map (SOM) is also used as an indexing structure (Laaksonen et al., 2000). Using indexing techniques during search reduces processing time, so images are retrieved more quickly.
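The benefit of indexing feature vectors can be sketched with a much simpler structure than the ones named above. The example below is not an R-tree, R*-tree, or SOM; it is a toy uniform-grid index, assumed here only to show how an index lets a query inspect a few nearby buckets instead of every item. The class name `GridIndex`, its parameters, and the sample two-dimensional feature vectors are all hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import ceil, dist, floor


class GridIndex:
    """Toy spatial index that buckets feature vectors by grid cell."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, vector):
        # Map each coordinate to the integer index of its grid cell.
        return tuple(floor(x / self.cell_size) for x in vector)

    def add(self, image_id, vector):
        self.cells[self._cell(vector)].append((image_id, vector))

    def query(self, vector, radius):
        """Return ids within `radius` of `vector`, nearest first."""
        center = self._cell(vector)
        reach = ceil(radius / self.cell_size)
        hits = []
        # Visit only the cells that can contain a point within the radius.
        for offset in product(range(-reach, reach + 1), repeat=len(center)):
            cell = tuple(c + o for c, o in zip(center, offset))
            for image_id, candidate in self.cells.get(cell, []):
                d = dist(vector, candidate)
                if d <= radius:
                    hits.append((d, image_id))
        return [image_id for _, image_id in sorted(hits)]


# Hypothetical 2-D feature vectors, for illustration only.
index = GridIndex(cell_size=0.25)
index.add("sunset", (0.90, 0.10))
index.add("beach", (0.85, 0.15))
index.add("forest", (0.10, 0.80))
print(index.query((0.88, 0.12), radius=0.1))  # ['sunset', 'beach']
```

A grid of this kind degrades quickly as dimensionality grows, because the number of neighbouring cells rises exponentially; that is one reason tree-based structures are preferred for real feature vectors.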
In this dissertation a multimedia big data analysis framework for semantic information management and retrieval is presented. It contains three coherent components, namely multimedia semantic representation, multimedia concept classification and summarization, and multimedia temporal semantics analysis and ensemble learning. These three components are seamlessly integrated and act as a coherent entity to provide essential functionalities in the proposed information management and retrieval framework. More specifically:
Globally, people are accessing more and more content as access to information becomes easier and continues to grow rapidly. People not only access content (be it text, audio, still images, animation, video, or interactive content) but are themselves producers of more and more digital data. With this comes a host of problems, such as content management, content reuse based on consumer and device capabilities, protection of rights and protection from unauthorised access or modification, and privacy protection of both providers and consumers [1] [2].
Text mining is roughly equivalent to text analytics; it is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning. Text mining usually involves structuring the input text (usually by parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interestingness. Text analysis software can help by converting words and phrases in unstructured data into numerical values, which can then be linked with structured data in a database and analyzed with traditional data mining techniques. Text mining is a variation on a field called data mining that tries to discover
Data Mining is defined as extracting information from huge sets of data. In other words, we can say that data mining is the procedure of mining knowledge from data. There is a huge amount of data available in the Information Industry. This data is of no use until it is converted into useful information. It is necessary to analyse this huge amount of data and extract useful information from it. Extraction of information is not the only process we need to perform; data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern Evaluation and Data Presentation [12].
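The sequence of processes listed above can be sketched end to end on a toy data set. This is an illustrative assumption rather than a prescribed implementation: the function names, the two hypothetical "store" sources, and the choice of item-pair co-occurrence counting as the mining step are all made up for the example.

```python
from collections import Counter
from itertools import combinations


def clean(records):
    """Data cleaning: normalize item names, drop blanks and empty records."""
    cleaned = []
    for record in records:
        items = [item.strip().lower() for item in record if item.strip()]
        if items:
            cleaned.append(items)
    return cleaned


def integrate(*sources):
    """Data integration: merge records from several sources into one list."""
    return [record for source in sources for record in source]


def transform(records):
    """Data transformation: represent each record as a set of items."""
    return [frozenset(record) for record in records]


def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support reaches the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }


def present(patterns):
    """Data presentation: format patterns, strongest first."""
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in ranked]


# Hypothetical raw data from two sources, with deliberate noise.
store_a = [["Bread", "milk "], ["bread", "butter"], []]
store_b = [["milk", "bread", "butter"], ["", "milk"]]

baskets = transform(integrate(clean(store_a), clean(store_b)))
patterns = evaluate(mine(baskets), len(baskets), min_support=0.5)
report = present(patterns)
print(report)
# ['bread + butter: support 0.50', 'bread + milk: support 0.50']
```

Keeping each stage as a separate function mirrors the list of processes and makes it easy to swap in a different mining step without touching cleaning or presentation.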
Data mining is the procedure of deriving new patterns from large amounts of data, that is, of finding beneficial information and patterns in huge data sets. It is also called knowledge discovery, knowledge mining from data, knowledge extraction, or data/pattern analysis. The main goal of data mining is to find patterns that were previously unknown. Once useful patterns are found, businesses can use them to make decisions that support their development. Data mining aims to discover implicit, previously unknown, and potentially useful information that is embedded in data.
Text mining is a process that collects information and knowledge from large amounts of unstructured data sources. By unstructured data sources, I mean PDF files, Word documents, XML files, text excerpts, and so on. Text mining collects information from text. Text mining differs from data mining because data mining is a process that collects information and knowledge from large amounts of structured data sources. Structured data sources are those in which data are classified by categorical, ordinal, or continuous variables, and the goal of data mining is to transform the data into a model or an understandable structure after collecting information from them. However they are
To retrieve images from a database with the CBIR method, the user first provides the retrieval system with a query image or sample facial image. The retrieval system then processes the query image and converts it into an internal feature-vector representation. Next, the similarity between the feature vector of the query image and the feature vectors of the images in the database is calculated, and the results are returned in indexed form. The indexed results help to find the image in the dataset. This is how CBIR works [14], [19]. For further study we also used paper [18], which surveys 200 papers on the CBIR method. In parallel, there is another method, semantic image annotation, which overcomes problems associated with the CBIR method.
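The CBIR steps described above (query image, feature vector, similarity against the database, ranked results) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited papers: images are stand-in lists of grayscale pixel values, the feature is a normalized intensity histogram, the similarity measure is histogram intersection, and every name in the code is hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Feature vector: normalized intensity histogram of grayscale pixels."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query_pixels, feature_db, top_k=2):
    """Rank database images by similarity to the query image."""
    query_features = histogram(query_pixels)
    scored = [
        (intersection(query_features, features), name)
        for name, features in feature_db.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical four-pixel "images"; real images would have many more pixels.
images = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 100, 150, 250],
}
# Features are extracted once, when images are added to the database.
feature_db = {name: histogram(pixels) for name, pixels in images.items()}

print(retrieve([5, 15, 35, 60], feature_db))  # ['dark', 'mixed']
```

Extracting database features ahead of time means that only the query image is processed at search time, which matches the division of work described in the paragraph.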
computer helped multimedia to achieve higher performance. Multimedia concepts are used in different applications. The
These necessities have prompted the conception of data mining, which has been moving us from the data age toward the coming information age. A considerable amount of literature has been published on data mining, and this survey is concerned with the ideas behind the processes, purpose, and techniques of data mining. [1][2]
The overall goal of the data mining process is to extract information from data sets and transform it into an understandable structure such as patterns and knowledge for further use [3].