Language independent statistical approach for extracting keywords

Abstract

This article addresses keyword extraction from a document collection. Much work has been done on this topic, but the performance of existing procedures is not satisfactory. Some of these works follow a supervised approach that relies on a specific corpus, whereas others use an unsupervised approach. Moreover, many of these methods work only on a single document. In this paper, we describe simplified steps for extracting keywords from a specific document collection. The method is language independent and can be used to extract keywords from text in any language. It uses term frequency, document frequency, the chi-square distribution, and Tseng’s algorithm as parts of the extraction process.
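
To make the statistical ingredients named in the abstract concrete, the following is a minimal Python sketch of term frequency, document frequency, and a chi-square score over a small collection. It is an illustration under stated assumptions, not the paper's procedure: the whitespace tokenizer, the function names (`tokenize`, `term_stats`, `chi_square`, `score_terms`), and the choice of a 2x2 contingency table of token counts are all hypothetical, and Tseng’s algorithm is not covered.

```python
from collections import Counter

def tokenize(text):
    # Assumption: whitespace splitting only, with no stemming or stop-word
    # list, so nothing here depends on a particular language.
    return text.lower().split()

def term_stats(docs):
    """Return (tf, df): collection-wide term counts and per-term document counts."""
    tf, df = Counter(), Counter()
    for doc in docs:
        tokens = tokenize(doc)
        tf.update(tokens)       # every occurrence counts toward term frequency
        df.update(set(tokens))  # each document counts at most once per term
    return tf, df

def chi_square(a, b, c, d):
    """Chi-square statistic for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def score_terms(docs, target):
    """Score each term of docs[target] by its association with that document.

    Hypothetical table layout per term:
        a = occurrences of the term inside the target document
        b = other tokens inside the target document
        c = occurrences of the term in the rest of the collection
        d = other tokens in the rest of the collection
    """
    token_lists = [tokenize(doc) for doc in docs]
    inside = Counter(token_lists[target])
    outside = Counter(
        tok for i, toks in enumerate(token_lists) if i != target for tok in toks
    )
    n_in, n_out = sum(inside.values()), sum(outside.values())
    return {
        term: chi_square(a, n_in - a, outside[term], n_out - outside[term])
        for term, a in inside.items()
    }

docs = ["apple banana apple", "banana cherry", "cherry cherry banana"]
tf, df = term_stats(docs)
scores = score_terms(docs, target=0)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy collection, "banana" appears in every document and so receives a much lower score for the first document than "apple", which occurs nowhere else. Note that the chi-square statistic is symmetric, so a term that is unusually rare in the target document can also score highly; a practical pipeline would likely combine the score with the frequency counts.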

Publication
2017 4th International Conference on Advances in Electrical Engineering (ICAEE), IEEE
Date