Indian Journal of Science and Technology
Year: 2017, Volume: 10, Issue: 8, Pages: 1-15
Yazan Alaya AL-Khassawneh1*, Naomie Salim1 and Mutasem Jarrah1,2
1Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia; [email protected], [email protected]
2Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia; [email protected]
*Author for correspondence:
Yazan Alaya AL-Khassawneh
Faculty of Computing, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia; [email protected]
Objective: Extractive summarization selects the most relevant sentences from a source document while preserving the document's most important information. Graph-based techniques have become very popular for text summarization. This paper introduces a hybrid graph-based technique for single-document extractive summarization.
Methods/Statistical Analysis: Prior research that used the graph-based approach for extractive summarization relied on a single function to compute the summary. In this work, we propose a hybrid similarity function (H) for estimating sentence similarity. This function combines four distinct similarity measures: cosine similarity (sim1), Jaccard similarity (sim2), word alignment-based similarity (sim3) and window-based similarity (sim4). The method uses a trainable summarizer that takes several features into account, and the effect of these features on the summarization task is investigated.
Findings: Combining the traditional similarity measures (cosine and Jaccard) with dynamic programming approaches (word alignment-based and window-based) to calculate the similarity between two sentences extracted more common information and helped to identify the best sentences for the final summary. The proposed method was evaluated using ROUGE measures on the DUC 2002 dataset. The experimental results showed that specific combinations of features could give higher efficiency, and that some features have more effect than others on summary creation.
Applications/Improvements: The performance of this new method has been tested using the DUC 2002 dataset. Its effectiveness is measured using the ROUGE score, and the results are promising when compared with some existing techniques.
Keywords: Extractive Summarization, Feature Extraction, Graph-Based Summarization, Hybrid Similarity, Sentence Similarity, Triangle Counting
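The idea of a hybrid similarity function can be illustrated with a minimal Python sketch. The abstract does not state how H combines its four measures, so the weighted average used here is an assumption, as are all function names. Only cosine similarity (sim1) and Jaccard similarity (sim2) are implemented, in their standard textbook forms over tokenized sentences; the word alignment-based (sim3) and window-based (sim4) measures are not defined in the abstract, so the sketch accepts any additional measure as a callable instead of guessing at them.

```python
import math
from collections import Counter
from typing import Callable, Optional, Sequence

Tokens = Sequence[str]
SimFn = Callable[[Tokens, Tokens], float]


def cosine_similarity(a: Tokens, b: Tokens) -> float:
    """sim1: cosine between raw term-frequency vectors of two sentences."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a: Tokens, b: Tokens) -> float:
    """sim2: size of the shared vocabulary over size of the joint vocabulary."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def hybrid_similarity(
    a: Tokens,
    b: Tokens,
    measures: Sequence[SimFn],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Hypothetical H: weighted average of the supplied measures.

    ASSUMPTION: the averaging scheme is illustrative only; the abstract
    does not specify how the four measures are combined.
    """
    if not measures:
        raise ValueError("at least one similarity measure is required")
    if weights is None:
        weights = [1.0] * len(measures)
    if len(weights) != len(measures):
        raise ValueError("weights and measures must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m(a, b) for w, m in zip(weights, measures)) / total


s1 = "the cat sat on the mat".split()
s2 = "the cat lay on the rug".split()
score = hybrid_similarity(s1, s2, [cosine_similarity, jaccard_similarity])
print(f"{score:.4f}")
```

In a graph-based summarizer, a score of this kind would typically serve as the edge weight between two sentence nodes. An alignment-based or window-based measure with the same `(tokens, tokens) -> float` signature could be appended to the `measures` list without changing the rest of the sketch.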