P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2017, Volume: 10, Issue: 8, Pages: 1-15

Original Article

Improving Triangle-Graph Based Text Summarization using Hybrid Similarity Function

Abstract

Objective: Extractive summarization selects the most relevant sentences from a source document while preserving its most important information. Graph-based techniques have become very popular for text summarization. This paper introduces a hybrid graph-based technique for single-document extractive summarization. Methods/Statistical Analysis: Prior research using the graph-based approach for extractive summarization relied on a single function to compute the summary. In this work, we propose a hybrid similarity function (H) that combines four distinct similarity measures: cosine similarity (sim1), Jaccard similarity (sim2), word alignment-based similarity (sim3), and window-based similarity (sim4). The method uses a trainable summarizer that takes several features into account, and the effect of these features on the summarization task is investigated. Findings: Combining the traditional similarity measures (cosine and Jaccard) with dynamic programming approaches (word alignment-based and window-based) to calculate the similarity between two sentences extracted more of the information the sentences have in common, which helped identify the best sentences to include in the final summary. The proposed method was evaluated using ROUGE measures on the DUC 2002 dataset. The experimental results showed that specific combinations of features could give higher efficiency and that some features have more effect than others on summary creation. Applications/Improvements: The performance of the new method was tested on the DUC 2002 dataset. Its effectiveness was measured using the ROUGE score, and the results are promising when compared with some existing techniques.

Keywords: Extractive Summarization, Feature Extraction, Graph-Based Summarization, Hybrid Similarity, Sentence Similarity, Triangle Counting
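
As a rough illustration of the hybrid idea described in the abstract, the following Python sketch combines two of the four measures, cosine similarity (sim1) and Jaccard similarity (sim2), into a single score. The abstract does not state how H weights its components, so the weighted average used here is an assumption, as are the function names and the whitespace tokenizer. The word alignment-based (sim3) and window-based (sim4) measures are omitted because the abstract does not define them.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return sentence.lower().split()


def cosine_similarity(tokens_a, tokens_b):
    # Cosine of the angle between term-frequency vectors (sim1).
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(tokens_a, tokens_b):
    # Size of the shared vocabulary over the combined vocabulary (sim2).
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def hybrid_similarity(sentence_a, sentence_b, weights=(0.5, 0.5)):
    # Hypothetical combination rule: a normalized weighted average.
    # The actual form of H in the paper may differ.
    tokens_a, tokens_b = tokenize(sentence_a), tokenize(sentence_b)
    scores = (
        cosine_similarity(tokens_a, tokens_b),
        jaccard_similarity(tokens_a, tokens_b),
    )
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

In a graph-based summarizer, a score of this kind would typically serve as the edge weight between two sentence nodes; additional measures could be added by extending the `scores` tuple and the `weights` argument.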
