1 School of Economics & Management, Nanjing University of Science and Technology, Nanjing 210094, China
2 School of Information Management, Nanjing University, Nanjing 210023, China
[Objective] This paper proposes a system that links the fragmented knowledge elements of an online community, aiming to help users explore knowledge more effectively. [Methods] First, we built a domain knowledge base for the online community. Then, we matched units of the domain knowledge base with semantically similar elements of the user-generated content (UGC). Finally, we identified the knowledge units in the UGC and linked them with relevant Web pages. [Results] We evaluated the proposed method on a Chinese cardiovascular BBS site. A total of 2,211 cardiovascular concepts and 5,741 fine-grained relations were extracted to create the domain knowledge base. We automatically identified the knowledge elements in 5,020 posts and linked them with relevant Web pages. [Limitations] We investigated the linking of knowledge elements only at the micro level. [Conclusions] The proposed system can effectively establish connections between knowledge units and UGC documents based on existing resource organization schemes. The method could also be applied in other fields.
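The identify-and-link step described in the Methods can be illustrated with a minimal sketch. It is not the system evaluated in this paper: the toy knowledge base, its aliases, and the function names `identify_concepts` and `link_posts` are hypothetical, and plain alias matching stands in for the semantic similarity matching used in the actual method.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: concept -> surface forms (aliases).
# The real knowledge base holds 2,211 concepts and 5,741 relations.
KNOWLEDGE_BASE = {
    "hypertension": {"hypertension", "high blood pressure"},
    "arrhythmia": {"arrhythmia", "irregular heartbeat"},
}


def identify_concepts(post, kb):
    """Return the set of concepts whose aliases occur in the post text.

    Assumption: case-insensitive substring matching replaces the
    semantic similarity step of the paper.
    """
    text = post.lower()
    return {
        concept
        for concept, aliases in kb.items()
        if any(alias in text for alias in aliases)
    }


def link_posts(posts, kb):
    """Build an index from each concept to the IDs of posts mentioning it."""
    index = defaultdict(list)
    for post_id, text in posts.items():
        for concept in sorted(identify_concepts(text, kb)):
            index[concept].append(post_id)
    return dict(index)


# Hypothetical posts, keyed by post ID.
posts = {
    1: "My father has high blood pressure.",
    2: "Irregular heartbeat and hypertension after exercise.",
    3: "Which hospital is good?",
}
links = link_posts(posts, KNOWLEDGE_BASE)
```

In this sketch, posts that mention the same concept under different surface forms end up under one index entry, which is the sense in which fragmented knowledge elements become connected. Posts with no recognized concept are simply left unlinked.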