New Technology of Library and Information Service  2016, Vol. 32 Issue (5): 30-37    DOI: 10.11925/infotech.1003-3513.2016.05.04
Original Article
Clustering and Discovering Web Services with Topic Model
Li Hui, Hu Yunfeng
School of Economics and Management, Xidian University, Xi’an 710071, China

[Objective] This paper proposes an effective method for clustering Web services and discovering the services a user needs. [Methods] First, we used the Biterm Topic Model to learn the latent topics of a corpus of Web service descriptions. Second, we extracted the topic distribution of each document and clustered the documents by these distributions. Finally, we built a mechanism that discovers Web services quickly. [Results] The proposed method achieved higher precision and normalized discounted cumulative gain than methods using Latent Dirichlet Allocation and an external corpus. [Limitations] The method considers only the functions of Web services and does not incorporate quality factors into the algorithm. [Conclusions] The proposed method identifies the needed services more accurately.
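
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the similarity measure, so the sketch assumes that each service already has a topic distribution inferred by a topic model such as the Biterm Topic Model, groups services by their dominant topic, and ranks candidates by Jensen-Shannon divergence. The function names, service names, and toy distributions are all hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def dominant_topic(dist):
    """Index of the most probable topic in a distribution."""
    return max(range(len(dist)), key=dist.__getitem__)


def assign_clusters(doc_topics):
    """Assumed clustering rule: group each service under its dominant topic."""
    clusters = {}
    for name, dist in doc_topics.items():
        clusters.setdefault(dominant_topic(dist), []).append(name)
    return clusters


def discover(query_dist, doc_topics, clusters, top_n=3):
    """Search only the cluster matching the query, ranked by JS divergence."""
    candidates = clusters.get(dominant_topic(query_dist), [])
    ranked = sorted(
        candidates,
        key=lambda name: js_divergence(query_dist, doc_topics[name]),
    )
    return ranked[:top_n]


# Toy document-topic distributions (hypothetical values, three topics).
doc_topics = {
    "weather_forecast": [0.80, 0.15, 0.05],
    "city_weather": [0.70, 0.20, 0.10],
    "currency_convert": [0.10, 0.85, 0.05],
    "stock_quote": [0.05, 0.75, 0.20],
    "sms_gateway": [0.10, 0.10, 0.80],
}

clusters = assign_clusters(doc_topics)
results = discover([0.75, 0.15, 0.10], doc_topics, clusters)
print(results)
```

Restricting the search to one cluster is what makes discovery fast in this sketch: the query is compared only with services that share its dominant topic, not with the whole corpus.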

Key words: Web service; Topic model; Clustering; Discovery
Received: 22 December 2015      Published: 24 June 2016

Cite this article:

Li Hui, Hu Yunfeng. Clustering and Discovering Web Services with Topic Model. New Technology of Library and Information Service, 2016, 32(5): 30-37.

