• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2015, Volume: 8, Issue: 32, Pages: 1-11

Original Article

Mining Issues in Traditional Indian Web Documents

Abstract

Recent developments in information technology are concentrated in areas where information, content creation and knowledge integration are the driving forces. Having begun by adapting to the complexities of internet and mobile communications, these developments are becoming significant sources of knowledge and creators of expertise, and this is where countries like India and China play a major role. Indian tradition is considered to be more than 5000 years old, and evidence of some of it survives even now in written, oral and physical forms, such as the Mahabharata as text or Mohenjo-Daro and Harappa as structures. This study presents the issues involved in extracting information from traditional Indian documents and a method of evaluating content, since the language, script and form of the web documents vary significantly. The method is developed at the pixel level to keep the approach generic; the study presents results for some basic issues at the text level and discusses how the approach can be extended to the word and document levels.
Keywords: Attribute Generation, Data Mining, Data Preparation, Information Extraction, Tradition, Voxel
