Indian Journal of Science and Technology
DOI: 10.17485/ijst/2015/v8i32/77056
Year: 2015, Volume: 8, Issue: 32, Pages: 1-11
Original Article
Kolla Bhanu Prakash1,2*
1 Faculty of Computer Science Engineering, Sathyabama University, Chennai - 600119, Tamil Nadu, India
2 Faculty of Computing, Chirala Engineering College, Chirala - 523157, Andhra Pradesh, India
Recent developments in information technology are concentrated in areas where information, content creation and knowledge integration are the driving forces. Having begun by adapting to the complexities of internet and mobile communications, these developments are becoming significant sources of knowledge and expertise, and this is where countries like India and China play a major role. Indian tradition is considered to be more than 5000 years old, and evidence of it survives today in written, oral and physical forms, such as the Mahabharata as text or Mohenjo-Daro and Harappa as structures. This study presents the issues involved in extracting information from traditional Indian documents and a method of evaluating content, given that the language, script and form of web documents vary significantly. The method works at the pixel level to keep the approach generic; results are presented for some basic issues at the text level, along with how the approach can be extended to the word and document levels.
Keywords: Attribute Generation, Data Mining, Data Preparation, Information Extraction, Tradition, Voxel