Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 33, Pages: 1-7
Arun Yadav1* and Divakar Yadav2
1 Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad – 201009, Uttar Pradesh, India; [email protected]
2 Department of Computer Science, Jaypee Institute of Information Technology, Noida – 201309, Uttar Pradesh, India; [email protected]
Background/Objectives: There is significant commercial and research interest in location-based search for search engines. Searching for keywords associated with one or more locations (geographic references) requires geographical web search and ranking based on both spatial and textual relevance. This type of search requires both spatial and textual indexing. Methods/Statistical Analysis: This paper uses a new spatial-textual hybrid indexing technique, based on the Wavelet Tree (WT), to handle point and region queries for Geographical Information Retrieval. Here, the WT data structure is used for both textual and spatial indexing. Minimum Bounding Rectangles (MBRs) of different geographical points (latitude, longitude) are created to build the hybrid index. Searching for textual keywords requires an inverted index, which is also built using a wavelet tree. In addition, a spatial-textual relevance scheme is used to retrieve relevant documents for end users. Findings: The algorithm has been implemented in order to measure its performance in terms of search time. Approximately 40,000 Wikipedia pages have been crawled and stored in a database, along with the geographical coordinates (latitude, longitude) of locations in India, to construct the MBRs of these locations. The results show that the performance of the wavelet tree based hybrid index algorithm improves as query length increases. For small query lengths, the B/R* tree performs better, but for larger query lengths, the wavelet tree based hybrid index outperforms the other techniques. Precision and recall of web documents have also been calculated using the hybrid index. Precision and recall vary with query length, which indicates that precision and recall are preserved while search time is reduced. Applications/Improvement: Our algorithm outperforms the existing algorithms in terms of both simplicity of implementation and search time.
In future work, we will propose a compression technique for the hybrid index to reduce the space it occupies, which should further improve search time for documents with single as well as multiple geographical references.
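The abstract does not give the layout of the hybrid index, so the following is only a minimal sketch of the underlying data structure, not the authors' implementation. It builds a pointer-based wavelet tree over a sequence of integer symbols and answers rank queries (how many times a symbol occurs in a prefix of the sequence). The class name `WaveletTree`, the method `rank`, and the sample sequence of region identifiers are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


class WaveletTree:
    """Pointer-based wavelet tree over a sequence of integers (illustrative)."""

    def __init__(self, seq: Sequence[int],
                 lo: Optional[int] = None, hi: Optional[int] = None):
        if lo is None or hi is None:
            lo, hi = (min(seq), max(seq)) if seq else (0, 0)
        self.lo, self.hi = lo, hi
        self.size = len(seq)
        self.left: Optional["WaveletTree"] = None
        self.right: Optional["WaveletTree"] = None
        # prefix[i] = how many of the first i symbols go to the left child
        self.prefix: List[int] = [0]
        if lo == hi or not seq:
            return  # leaf node or empty node: nothing to split
        mid = (lo + hi) // 2
        for s in seq:
            self.prefix.append(self.prefix[-1] + (1 if s <= mid else 0))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def rank(self, symbol: int, i: int) -> int:
        """Return the number of occurrences of symbol in seq[0:i]."""
        i = min(i, self.size)
        if i <= 0 or symbol < self.lo or symbol > self.hi:
            return 0
        if self.lo == self.hi:
            return i  # every symbol stored at this leaf equals `symbol`
        mid = (self.lo + self.hi) // 2
        left_count = self.prefix[i]
        if symbol <= mid:
            return self.left.rank(symbol, left_count)
        return self.right.rank(symbol, i - left_count)


# Hypothetical data: the region (MBR) identifier of each document, in order.
doc_regions = [2, 0, 1, 2, 2, 0, 1]
wt = WaveletTree(doc_regions)
print(wt.rank(2, 7))  # documents in region 2 among all seven
print(wt.rank(0, 5))  # documents in region 0 among the first five
```

Each rank query visits one node per tree level, so its cost grows with the logarithm of the alphabet size and not with the sequence length. This sketch stores plain prefix-count lists and does not use compressed bit vectors.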
Keywords: Hybrid-indexing, Indexing, Information Retrieval, Wavelet Tree