Research & Publications

Abstract: Duplicate detection is the task of identifying multiple representations of the same entity. XML is widely used in almost all applications, especially for data on the web, so it is essential to identify duplicates in it. Methods such as normalization are used for duplicate detection in relational databases, but they cannot be applied directly to XML because of its complex structure. Detecting and eliminating duplicates correctly has become one of the challenging issues wherever data integration is performed. Many techniques have emerged for detecting duplicates in both relational databases and XML data. This work uses XMLDup, a strategy based on a Bayesian network, to detect duplicates in XML data, applies machine learning algorithms such as SVM and the Bee and Bat algorithms to improve its efficiency, and compares them to determine which is the most efficient at finding duplicates in XML.
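
To make the idea of scoring two XML elements as possible duplicates concrete, the following is a minimal sketch. It is not the XMLDup algorithm itself: it simply multiplies per-field string similarities of matching child elements, which loosely mirrors how a probabilistic model combines evidence from child nodes. The function names (`field_similarity`, `duplicate_score`, `is_duplicate`), the sample `person` records, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def field_similarity(a, b):
    """Return a string similarity in [0, 1] for two text values."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def duplicate_score(elem_a, elem_b):
    """Multiply the similarities of child fields whose tags occur in both elements.

    Assumption: each field contributes independently, so the scores are
    multiplied. Returns 0.0 when the two elements share no child tags.
    """
    score = 1.0
    compared = 0
    for child_a in elem_a:
        child_b = elem_b.find(child_a.tag)  # first child of elem_b with the same tag
        if child_b is None:
            continue
        score *= field_similarity(child_a.text or "", child_b.text or "")
        compared += 1
    return score if compared else 0.0


def is_duplicate(elem_a, elem_b, threshold=0.5):
    """Classify a pair as duplicates when the score reaches the (hypothetical) threshold."""
    return duplicate_score(elem_a, elem_b) >= threshold


if __name__ == "__main__":
    a = ET.fromstring("<person><name>John Smith</name><city>Boston</city></person>")
    b = ET.fromstring("<person><name>Jon Smith</name><city>Boston</city></person>")
    print(duplicate_score(a, b), is_duplicate(a, b))
```

A learned classifier such as an SVM could replace the fixed threshold by taking the per-field similarities as its feature vector; that substitution is likewise only an assumption about how the pieces might fit together.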