Unsupervised Keyword Extraction From Polish Legal Texts

Abstract
In this work, we present an application of the recently proposed unsupervised keyword extraction algorithm RAKE to a corpus of Polish legal texts from the field of public procurement. RAKE is essentially a language- and domain-independent method. Its only language-specific input is a stoplist containing a set of non-content words. The performance of the method depends heavily on the choice of such a stoplist, which should be adapted to the domain. Therefore, we complement the RAKE algorithm with an automatic approach to selecting non-content words, which is based on the statistical properties of term distribution.
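To illustrate why the stoplist matters so much, the following is a minimal sketch of RAKE-style scoring as it is commonly described: stopwords and punctuation split the text into candidate phrases, each word is scored as degree divided by frequency, and a phrase score is the sum of its word scores. The function name `rake_scores`, the tokenization regex, and the English sample sentence are assumptions made for illustration. This is not the implementation used in this work, and it does not cover the automatic stoplist selection.

```python
import re
from collections import defaultdict


def rake_scores(text, stoplist):
    """Return (phrase, score) pairs, highest score first (illustrative sketch)."""
    stop = {w.lower() for w in stoplist}
    phrases = []
    # Punctuation always ends a candidate phrase; \w is Unicode-aware,
    # so Polish diacritics are kept inside words.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stop:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))

    freq = defaultdict(int)    # number of occurrences of each word
    degree = defaultdict(int)  # summed length of the phrases containing it
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)

    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


sample = ("Public procurement law regulates the award of public contracts. "
          "The contracting authority publishes a contract notice.")
ranked = rake_scores(sample, ["the", "of", "a"])
```

Because the sample stoplist does not contain "regulates" or "publishes", both verbs end up inside the top-ranked phrases. Adding them to the stoplist changes the phrase boundaries and therefore the ranking, which is the sensitivity the abstract refers to.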