Syntactic Matching Essay

3.5.1 Syntactic Matching

Standard syntactic matching methods are used to compare the string, numeric and character attributes of two identities. Given a Twitter identity and a candidate identity returned by Profile Search, Content Search, Network Search or Self-Mention Search, we used a distance metric to compare their username and name attributes. The closer the match, the smaller the value of the distance metric.

3.5.2 Image Matching

Users sometimes put the same profile image on several of their online identities. It is therefore reasonable to infer that the identities with the closest profile image match belong to the same user. We used a standard RGB-histogram image matching algorithm to generate a score between the profile image of the …
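The two matchers in 3.5.1 and 3.5.2 can be illustrated with a minimal sketch. The text does not name the distance metric or the exact histogram comparison, so Levenshtein edit distance and histogram intersection are assumed here as common choices; the function names, the bin count and the pixel-list input format are all hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed metric): smaller means a closer match."""
    a, b = a.lower(), b.lower()
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] so names of different lengths compare fairly."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def rgb_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Per-channel histogram, normalized so each channel sums to 1."""
    width = 256 // bins
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1.0
    total = float(len(pixels)) or 1.0
    return [count / total for count in hist]


def histogram_similarity(p: Sequence[Pixel], q: Sequence[Pixel], bins: int = 4) -> float:
    """Histogram intersection (assumed comparison) in [0, 1]; 1 means identical histograms."""
    hp, hq = rgb_histogram(p, bins), rgb_histogram(q, bins)
    return sum(min(x, y) for x, y in zip(hp, hq)) / 3.0
```

Under these assumptions, a username pair such as `"JohnDoe"` and `"johndoe"` has distance 0, and two profile images with the same colour distribution have similarity 1. Real profile images would first have to be decoded into pixel tuples, which is outside this sketch.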

Finding Nemo takes a Twitter identity as input and runs the profile, content, self-mention and network-based identity search methods. The candidate identities returned by each method are collected. If an identity is returned by more than one search method, or if an identity is exposed via the URL attribute of the Twitter identity (self-identification), that identity is returned as the correct Facebook identity. The reasoning is that if a user herself declares her identity on another social network via the URL attribute, no matching methods are necessary. Further, a candidate identity returned by more than one method is similar to the queried Twitter identity in more than one aspect, which strengthens the case that it is the correct Facebook identity for the queried Twitter identity. In all other cases, the candidate identities from the search methods are collated and ranked using the identity matching methods: syntactic (username, name) and image (profile image). The ranked candidate identities are then presented to a human verifier, who locates the correct Facebook identity in the ranked candidate set, if it exists. Since we observed that the manual verifiers have to bear less cognitive load in order to identify a match when the ranked candidate identities are presented …
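The decision procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `resolve_identity`, the string identifiers, the `score` callback (a distance, lower is better) and the handling of several agreed candidates are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Set, Tuple


def resolve_identity(
    candidates_by_method: Dict[str, Set[str]],
    self_declared: Optional[str],
    score: Callable[[str], float],
) -> Tuple[List[str], bool]:
    """Return (identities, needs_human_verification)."""
    # Self-identification via the URL attribute: no matching is needed.
    if self_declared is not None:
        return [self_declared], False

    # Count how many search methods returned each candidate.
    counts = Counter(c for cands in candidates_by_method.values() for c in cands)

    # A candidate returned by more than one method is accepted directly.
    # Assumption: if several candidates qualify, all are returned (sorted by id).
    agreed = sorted(c for c, n in counts.items() if n > 1)
    if agreed:
        return agreed, False

    # Otherwise collate all candidates, rank them by matching distance
    # (ties broken by id, an assumption), and hand them to a human verifier.
    ranked = sorted(counts, key=lambda c: (score(c), c))
    return ranked, True
```

For example, with `{"profile": {"fb1", "fb2"}, "content": {"fb2"}}` and no self-declared URL, `fb2` is returned without human verification because two methods agree on it.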

For this research, we used two separate datasets: a training dataset and a testing dataset. Both were obtained by crawling real user profiles from the popular social networking websites Facebook and Twitter, and both were manually checked for overlapping users before the evaluation took place. Because our research is concerned with user identification based on profile comparisons, we considered only overlapping users with an available profile to be potentially identifiable by the algorithm. Overlapping users who did not share any profile information were not considered as
