A Graph Degeneracy-based Approach to Keyword Extraction

Antoine Tixier1, Fragkiskos Malliaros2, Michalis Vazirgiannis2
1Ecole Polytechnique, Palaiseau, France, 2Ecole Polytechnique


Abstract

We propose a change of paradigm and hypothesize that keywords are more likely to be found among the influential nodes of a graph-of-words than among the nodes that score high on eigenvector-related centrality measures. To test this hypothesis, we introduce unsupervised techniques that capitalize on graph degeneracy. Our methods strongly and significantly outperform all baselines on two datasets (short and medium-sized documents), and achieve the best performance on the third one (long documents).
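
To make the idea of graph degeneracy concrete, the following is a minimal sketch of a k-core decomposition over a graph-of-words. It is an illustration under stated assumptions, not the procedure evaluated in this paper: the graph is taken to be undirected and unweighted, edges link tokens that co-occur within a sliding window, and the candidate keywords are simply the nodes of the main (highest) core. The function names and the `window` parameter are hypothetical.

```python
from collections import defaultdict


def build_graph_of_words(tokens, window=3):
    """Link every token to the tokens that follow it inside a sliding window.

    Assumption: undirected, unweighted edges; `window` counts tokens,
    so window=2 links adjacent tokens only.
    """
    adjacency = defaultdict(set)
    for i, word in enumerate(tokens):
        adjacency[word]  # register the node even if it ends up isolated
        for other in tokens[i + 1 : i + window]:
            if other != word:
                adjacency[word].add(other)
                adjacency[other].add(word)
    return dict(adjacency)


def core_numbers(adjacency):
    """Return the core number of each node by peeling minimum-degree nodes."""
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Remove a node of minimum remaining degree; k never decreases.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


def main_core_keywords(tokens, window=3):
    """Candidate keywords: the nodes whose core number is maximal."""
    core = core_numbers(build_graph_of_words(tokens, window))
    if not core:
        return []
    top = max(core.values())
    return sorted(word for word, c in core.items() if c == top)
```

In this sketch a word is retained because it belongs to a densely connected subgraph, not because of a centrality score computed over the whole graph. The quadratic peeling loop is chosen for readability; linear-time core decomposition algorithms exist for larger graphs.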