Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 31, Pages: 1-7
Hoang Do Thanh Tung1* and Dinh Duc Luong2
1 Institute of Information and Technology, Vietnam Academy of Science and Technology (VAST), [email protected]
2 Food Industrial College, Phu Tho, Vietnam
*Author for correspondence
Today, XML is used to store data for complex data models such as bioinformatics information. A bioinformatics system deals with large data sets and complex queries, so efficient access methods for XML data are necessary. XPath is a query language that quickly locates information in XML (tree-structured) data by navigating from a context node, such as the root, into its subtrees. In this paper, we propose a system model that stores XML data more efficiently, together with an improved indexing method that supports XPath queries. The system model integrates a big data model with a relational data model in order to benefit from both. The new indexing method is an improvement of the R-tree that helps XPath queries run more efficiently along some axes. Our experiments show that the proposed method achieves better results for node queries than the R-tree on transformed XML data. The method is intended to be applied to phylogenetic queries on TreeFam databases.
Keywords: Bioinformatics, Hadoop, Indexing, XML Data, XPath Queries