Single-Document Keyphrase Extraction for Chinese Academic Texts *
Xia Tian

Extracting Key-phrases from Chinese Scholarly Papers
Xia Tian
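Table 1 Key Phrase Statistics in the Dataset

| Number of constituent words | Average length (characters) | Occurrences | Share | Cumulative share |
|---|---|---|---|---|
| 1 | 3.34 | 20,303 | 28.07% | 28.07% |
| 2 | 4.33 | 39,028 | 53.95% | 82.02% |
| 3 | 5.95 | 10,005 | 13.83% | 95.85% |
| 4 | 7.46 | 2,142 | 2.96% | 98.81% |
| 5 | 9.48 | 476 | 0.66% | 99.47% |
| 6 | 10.55 | 218 | 0.30% | 99.77% |
| 7 | 12.65 | 79 | 0.11% | 99.88% |
| 8 | 15.59 | 37 | 0.05% | 99.93% |
| 9 | 17.07 | 14 | 0.02% | 99.95% |
| 10 | 16.18 | 22 | 0.03% | 99.98% |
| Other | - | 13 | 0.02% | 100.00% |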
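
The share and cumulative-share columns of Table 1 follow directly from the occurrence counts, so they can be recomputed as a consistency check. The following is a minimal sketch rather than the paper's own code: the names `counts` and `share_table` are illustrative assumptions, and only the counts themselves are taken from Table 1.

```python
from itertools import accumulate

# Occurrence counts copied from Table 1, keyed by the number of
# constituent words in the key phrase. Variable names are illustrative.
counts = {
    "1": 20303,
    "2": 39028,
    "3": 10005,
    "4": 2142,
    "5": 476,
    "6": 218,
    "7": 79,
    "8": 37,
    "9": 14,
    "10": 22,
    "Other": 13,
}


def share_table(counts):
    """Return (label, count, share %, cumulative share %) rows."""
    total = sum(counts.values())
    running = accumulate(counts.values())  # running sum of the counts
    return [
        (label, n, 100 * n / total, 100 * cum / total)
        for (label, n), cum in zip(counts.items(), running)
    ]


rows = share_table(counts)
for label, n, share, cum in rows:
    print(f"{label:>5} {n:>6} {share:6.2f}% {cum:7.2f}%")
```

Rounded to two decimals, the recomputed values agree with the table. In particular, key phrases of at most three words account for roughly 95.85% of all occurrences.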