
A Content Filtering Scheme in Social Sites


  • Quaid-E-Millath Govt College for Women, Chennai - 600002, Tamil Nadu, India
  • Department of Computer Science, Quaid-E-Millath Govt College for Women, Chennai - 600002, Tamil Nadu, India


Objectives: Online social networks such as Facebook, Twitter and Google+ have become some of the fastest-growing e-services, and several issues affect them. Because these services are still emerging and users rely on them to communicate, privacy is often a key concern for users. Because millions of people are willing to interact with others, social networks are also a new attack ground for malware creators. Some users and pages spread malicious content and send spam messages by taking advantage of users' inherent trust in their relationship networks.

Methods: The proposed work addresses the most prevalent issues and threats that have recently targeted different social networks, and identifies an authentication scheme against those attacks. It proposes a detecting and blocking scheme for social sites that uses data mining techniques.

Findings: The system helps to detect suspicious URLs on social networks by considering the following parameters: (i) the text and keywords that appear in the URL; (ii) URL descriptions; (iii) detection of scam messages arising from manual script attacks on social sites.

Application/Improvement: The scheme uses two techniques: message filtering and Maximum Likelihood Estimation (MLE).
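
The abstract does not give implementation details, so the following is only a minimal sketch of how maximum likelihood estimates over URL keywords could be used to flag a suspicious URL. The function names (`tokenize`, `fit_mle`, `log_likelihood`, `is_suspicious`), the toy URLs, and the small probability floor for unseen tokens are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (illustrative choice)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def fit_mle(urls):
    """MLE of a token distribution: relative frequency, count / total."""
    counts = Counter(t for u in urls for t in tokenize(u))
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def log_likelihood(url, probs, floor=1e-6):
    """Log-likelihood of a URL's tokens under a fitted distribution.

    Unseen tokens receive a small floor probability; this is an
    assumption added here to avoid log(0), not part of pure MLE.
    """
    return sum(math.log(probs.get(t, floor)) for t in tokenize(url))


def is_suspicious(url, spam_probs, ham_probs):
    """Flag the URL if the spam model explains its tokens better."""
    return log_likelihood(url, spam_probs) > log_likelihood(url, ham_probs)


# Hypothetical toy training data, for illustration only.
spam_urls = [
    "http://free-prize.example/win-now",
    "http://example.test/free-gift-login",
]
ham_urls = [
    "http://news.example/sports/today",
    "http://example.test/blog/python-tips",
]

spam_probs = fit_mle(spam_urls)
ham_probs = fit_mle(ham_urls)
```

With these toy models, `is_suspicious("http://win.example/free-prize", spam_probs, ham_probs)` compares the two log-likelihoods and flags the URL when the spam model scores higher. A real filter would need far more training data and would also use the URL descriptions and message text mentioned in the findings.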


Keywords: Feature Selection, Machine Learning, Security, Social Network






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.