Language independent statistical approach for extracting keywords

Md. Mahfuzur Rahaman, Ruhul Amin

Sep 30, 2017

Abstract

This article focuses on keyword extraction from a document collection. Much work has been done on this topic, but the performance of the existing procedures is not satisfactory. Some of these works follow a supervised approach using a specific corpus, whereas others use an unsupervised approach. Moreover, many of these methods work only on a single document. In this paper, we describe simplified steps for extracting keywords from a specific document collection. The method is language independent and can be used to extract keywords from text in any language. It uses term frequency, document frequency, the chi-square distribution, and Tseng’s algorithm as parts of the overall extraction process.
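The statistical quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function names (`tokenize`, `term_stats`, `chi_square`), and the 2x2 contingency formulation of the chi-square statistic are all assumptions made for illustration, and Tseng's algorithm is not covered.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting, which needs no
    # language-specific resources. The paper may tokenize differently.
    return text.lower().split()


def term_stats(docs):
    """Return (tf, df): collection-wide term counts and the number of
    documents each term appears in."""
    tf, df = Counter(), Counter()
    for doc in docs:
        tokens = tokenize(doc)
        tf.update(tokens)       # every occurrence counts toward TF
        df.update(set(tokens))  # each document counts once toward DF
    return tf, df


def chi_square(term, docs, target_index):
    """Chi-square statistic of a 2x2 table: term vs. other tokens,
    target document vs. the rest of the collection (hypothetical setup)."""
    target = tokenize(docs[target_index])
    rest = [t for i, doc in enumerate(docs) if i != target_index
            for t in tokenize(doc)]
    a = target.count(term)  # term occurrences in the target document
    b = len(target) - a     # other tokens in the target document
    c = rest.count(term)    # term occurrences in the other documents
    d = len(rest) - c       # other tokens in the other documents
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0          # degenerate table, e.g. the term never occurs
    return n * (a * d - b * c) ** 2 / denom


docs = ["apple banana apple", "banana cherry", "cherry date"]
tf, df = term_stats(docs)
score = chi_square("apple", docs, 0)
```

A higher score suggests that a term is distributed unevenly between the target document and the rest of the collection, which is one common reason to treat it as a keyword candidate.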

Type

Conference paper

Publication

2017 4th International Conference on Advances in Electrical Engineering (ICAEE), IEEE

Date

September, 2017