ANN Models and their Implications in Content Extraction

B. S. Charulatha, Paul Rodrigues and T. Chitralekha

doi:10.17485/ijst/2016/v9i30/95283

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 30, Pages: 1-5

Original Article

B. S. Charulatha¹*, Paul Rodrigues² and T. Chitralekha³

¹JNTUK, [email protected]
²King Khalid University, Saudi Arabia
³Central University, Kalapet – 605014, Puducherry, India
*Author for correspondence: B. S. Charulatha, JNTUK. Email: [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: The Internet is a vast repository of information about the past and present that can also be used to predict the future. Because of its ease of access, users looking for unfamiliar information tend to search the Internet rather than consult a library. This creates a need to find the content of a web page in the shortest possible time, irrespective of the form the page takes. Information and content extraction therefore needs to work at a basic, generic level and be easy to implement without depending on any major software.

Methods: The study aims to extract information from the available data once the data has been digitized. The digitized data is converted to pixel maps, which are universal and are unaffected by the form and format of the web page content. A statistical method is used to extract attributes from the images, so that differences of language (and hence of script) and of format do not pose problems. The extracted features are then presented to the Back Propagation algorithm.

Findings: The accuracy obtained is presented, along with how content extraction could be possible within certain bounds. The method was tested using unstructured word sets chosen from web pages. It is demonstrated for monolingual, multilingual and transliterated documents, so that its applicability is universal.

Applications/Improvement: The method is generic and uses pixel maps of the data, making it independent of software and language.
Keywords: Back Propagation, Content Extraction, Information, Statistical, Deterministic
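
The pipeline described under Methods (pixel map, statistical features, Back Propagation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which statistics are extracted or how the network is configured, so the three features (ink density and the standard deviations of the row and column ink profiles), the single hidden layer of three sigmoid units, the learning rate, and the toy 5x5 "bar" pixel maps are all assumptions chosen only to keep the example small. The names `pixel_stats`, `TinyMLP` and `bar` are hypothetical.

```python
import math
import random


def pixel_stats(pixmap):
    """Assumed feature set: [ink density, row-profile std, column-profile std]
    for a pixel map given as a list of rows of 0/1 values."""
    h, w = len(pixmap), len(pixmap[0])
    rows = [sum(r) / w for r in pixmap]                        # ink fraction per row
    cols = [sum(r[x] for r in pixmap) / h for x in range(w)]   # ink fraction per column

    def std(v):
        m = sum(v) / len(v)
        return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))

    return [sum(rows) / h, std(rows), std(cols)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid units, squared error, trained by backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = x + [1.0]
        hid = [sigmoid(sum(w * a for w, a in zip(row, xb))) for row in self.w1]
        out = sigmoid(sum(w * a for w, a in zip(self.w2, hid + [1.0])))
        return hid, out

    def train_step(self, x, target, lr):
        hid, out = self.forward(x)
        xb, hb = x + [1.0], hid + [1.0]
        # Error terms: output layer first, then propagated back to the hidden layer
        # (using the output weights from before this step's update).
        delta_out = (out - target) * out * (1.0 - out)
        delta_hid = [delta_out * self.w2[j] * hid[j] * (1.0 - hid[j])
                     for j in range(len(hid))]
        self.w2 = [w - lr * delta_out * a for w, a in zip(self.w2, hb)]
        for j, row in enumerate(self.w1):
            self.w1[j] = [w - lr * delta_hid[j] * a for w, a in zip(row, xb)]
        return 0.5 * (out - target) ** 2


def bar(size, index, horizontal):
    """Toy stand-in for a digitized word: one filled row or column."""
    return [[1 if (y if horizontal else x) == index else 0 for x in range(size)]
            for y in range(size)]


# Label horizontal bars 1.0 and vertical bars 0.0 (an arbitrary toy task).
samples = ([(pixel_stats(bar(5, i, True)), 1.0) for i in range(5)]
           + [(pixel_stats(bar(5, i, False)), 0.0) for i in range(5)])

net = TinyMLP(n_in=3, n_hidden=3)
first_loss = last_loss = None
for epoch in range(2000):
    last_loss = sum(net.train_step(x, t, lr=1.0) for x, t in samples)
    if first_loss is None:
        first_loss = last_loss

print("loss, first epoch vs last epoch:", first_loss, last_loss)
print("horizontal bar score:", net.forward(pixel_stats(bar(5, 2, True)))[1])
print("vertical bar score:", net.forward(pixel_stats(bar(5, 2, False)))[1])
```

Because the features are computed from pixels alone, nothing in the sketch depends on the script or encoding of the text, which is the property the abstract emphasizes. In a realistic setting the toy bars would be replaced by pixel maps rendered from actual words, and the feature set and network size would need to be chosen and evaluated on real data.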