P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 38, Pages: 1-7

Original Article

Improved Parallel PageRank Algorithm for Spam Filtering

Abstract

Background/Objectives: PageRank is a well-known link-based technique introduced by Google for ranking the web pages it indexes. The algorithm works on the link structure of the web, that is, the inbound and outbound links of each page. The existing PageRank algorithm follows an equal-distribution rule: it divides the PageRank of a web page evenly among all of its outgoing links. The problem with this uniform distribution is that uninteresting pages sometimes receive high PageRank values. Methods/Statistical Analysis: This paper proposes an improved parallel PageRank algorithm that distributes PageRank values non-uniformly among the outgoing links. The proposed work has been implemented on an NVIDIA Quadro 2000 GPU using CUDA. Findings: The proposed algorithm mitigates spam and achieves lower computation time than Parallel PageRank, because it assigns higher priority to important pages and lower priority to less important ones. With values assigned in this way, important pages show an increase in PageRank, while unrelated pages, that is, spam pages, show a decrease. Application: The proposed work performs spam filtering by distinguishing important web pages from irrelevant ones.
Keywords: CUDA, GPU, Non-Uniform Distribution, Parallel Page Rank, Spam Pages  
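
The difference between uniform and non-uniform distribution described in the abstract can be illustrated with a small serial sketch. This is not the paper's CUDA implementation, and the abstract does not state how the non-uniform shares are computed. The sketch assumes, purely for illustration, that each outgoing link receives a share proportional to the in-link count of its target page. The function names `pagerank` and `inlink_weights`, the damping factor of 0.85, and the four-page example graph are all hypothetical.

```python
def pagerank(out_links, weights=None, damping=0.85, iterations=100):
    """Iterative PageRank over a dict {page: [pages it links to]}.

    weights is an optional dict {(src, dst): share}; the shares of each
    src must sum to 1. When weights is None, every outgoing link of a
    page receives the same share (the uniform rule).
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for src in nodes:
            targets = out_links[src]
            if not targets:
                # Dangling page: spread its rank evenly over all pages.
                for v in nodes:
                    new_rank[v] += damping * rank[src] / n
                continue
            for dst in targets:
                if weights is None:
                    share = 1.0 / len(targets)
                else:
                    share = weights[(src, dst)]
                new_rank[dst] += damping * rank[src] * share
        rank = new_rank
    return rank


def inlink_weights(out_links):
    """Assumed weighting: share proportional to the target's in-link count."""
    in_count = {v: 0 for v in out_links}
    for targets in out_links.values():
        for dst in targets:
            in_count[dst] += 1
    weights = {}
    for src, targets in out_links.items():
        total = sum(in_count[dst] for dst in targets)
        for dst in targets:
            weights[(src, dst)] = in_count[dst] / total
    return weights


# Hypothetical graph: C is linked from two pages, S only from A.
graph = {"A": ["C", "S"], "B": ["C"], "C": ["A", "B"], "S": ["A"]}

uniform = pagerank(graph)
weighted = pagerank(graph, weights=inlink_weights(graph))

for page in sorted(graph):
    print(page, round(uniform[page], 4), round(weighted[page], 4))
```

In this toy graph, page S, which has a single inbound link, ends up with a lower score under the assumed weighting than under the uniform rule, while the better-linked page C ends up with a higher one. That is the qualitative behaviour the abstract reports for spam pages and important pages.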
