New Technology of Library and Information Service  2014, Vol. 30 Issue (3): 88-95     https://doi.org/10.11925/infotech.1003-3513.2014.03.13
Information Analysis and Research
The Penalized Matrix Decomposition Method of Extracting Core Characteristic Words: Taking Co-word Analysis as an Example
Yu Xianzi1, Gao Yinglian2, Ma Chunxia1, Liu Jinxing1
1 Department of Information Technology and Communication, Qufu Normal University, Rizhao 276826, China;
2 Library of Qufu Normal University, Rizhao 276826, China
Abstract

[Objective] To sparsify and reduce the dimensionality of the high-dimensional co-word matrix used in co-word analysis, so that the core characteristic words in the matrix stand out quickly and intuitively. [Methods] The article proposes a method based on Penalized Matrix Decomposition (PMD) for extracting core characteristic words from text. The authors experiment on literature about university libraries' use of social networking services (SNS) and use Matlab R2012a to decompose and reduce the constructed co-word matrix by PMD. [Results] PMD extracts 65 core characteristic words from 1,648 characteristic words, more than the 34 extracted by principal component analysis (PCA), and reveals the research hotspots on university libraries' use of social networks. [Limitations] The characteristic words extracted in the experiment do not cover the topic comprehensively, and their selection involves some subjectivity. [Conclusions] After the high-dimensional co-word matrix is made sparse by PMD, the resulting core characteristic words are easier to understand and interpret, and they can also reveal some marginal topics.

Keywords: PMD    Extracting core characteristic words    PCA
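The pipeline summarized in the abstract (build a co-word matrix, decompose it with PMD, read off the words with nonzero loadings) can be illustrated with a minimal sketch. It assumes the rank-1 PMD with L1 bounds on both factors as formulated by Witten, Tibshirani and Hastie (2009): maximize u'Xv subject to unit L2 norms and L1 norms of at most c, solved by alternating soft-thresholding. The authors worked in Matlab R2012a; this Python version, the toy keyword lists, the vocabulary, the bound c = 1.5, and every function name are hypothetical and are not taken from the paper.

```python
import math

def cooccurrence(docs, vocab):
    """Symmetric co-word matrix: entry [i][j] counts the documents that
    contain both word i and word j (the diagonal holds word frequencies)."""
    index = {w: i for i, w in enumerate(vocab)}
    m = [[0.0] * len(vocab) for _ in vocab]
    for doc in docs:
        present = {index[w] for w in doc if w in index}
        for i in present:
            for j in present:
                m[i][j] += 1.0
    return m

def l1(a):
    return sum(abs(x) for x in a)

def l2_normalize(a):
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a] if norm > 0 else list(a)

def soft(a, delta):
    """Soft-thresholding: shrink every entry toward zero by delta."""
    return [math.copysign(max(abs(x) - delta, 0.0), x) for x in a]

def sparse_unit(a, c, steps=60):
    """Unit-L2 vector built from a with L1 norm at most c (assumes c >= 1).
    The threshold is located by bisection."""
    dense = l2_normalize(a)
    if l1(dense) <= c:
        return dense                      # bound inactive, no shrinkage needed
    lo, hi, best = 0.0, max(abs(x) for x in a), None
    for _ in range(steps):
        mid = (lo + hi) / 2
        cand = l2_normalize(soft(a, mid))
        if l1(cand) > c:
            lo = mid                      # still too dense: raise the threshold
        else:
            hi, best = mid, cand          # feasible: try a smaller threshold
    return best if best is not None else cand

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pmd_rank1(x, c_u, c_v, n_iter=100):
    """Rank-1 PMD by alternating updates of u and v."""
    xt = [list(col) for col in zip(*x)]
    n = len(x[0])
    v = [1.0 / math.sqrt(n)] * n          # dense starting vector
    for _ in range(n_iter):
        u = sparse_unit(matvec(x, v), c_u)
        v = sparse_unit(matvec(xt, u), c_v)
    d = sum(a * b for a, b in zip(u, matvec(x, v)))
    return d, u, v

# Hypothetical keyword lists, one list per article.
docs = [
    ["library", "sns", "microblog"],
    ["library", "sns", "service"],
    ["library", "sns", "microblog"],
    ["library", "sns"],
    ["catalog", "metadata"],
    ["service", "reader"],
]
vocab = ["library", "sns", "microblog", "service",
         "catalog", "metadata", "reader"]

X = cooccurrence(docs, vocab)
d, u, v = pmd_rank1(X, c_u=1.5, c_v=1.5)

# Words whose loading survived the shrinkage act as core characteristic words.
core_words = [w for w, weight in zip(vocab, v) if abs(weight) > 1e-12]
print(core_words)
```

In this sketch the bound c controls sparsity: values close to 1 keep very few words, while values of sqrt(n) or more leave the L1 bound inactive, so the iteration reduces to an ordinary power iteration for the dense leading singular vectors.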
Received: 2013-09-10      Published: 2014-04-15
CLC number: G250
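Funding:

This article is one of the research outcomes of the Qufu Normal University university-level fund project "Research on Advanced Modeling Methods for Multivariable Control" (project No. XJ200947).

Corresponding author: Yu Xianzi     E-mail: yuxianzi2010@163.com
Author contributions: Yu Xianzi: collected, cleaned, and analyzed the data and drafted the paper; Gao Yinglian: analyzed the data and revised the paper; Ma Chunxia: debugged the experiments; Liu Jinxing: proposed the research idea, designed the research plan, and revised the paper.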
Cite this article:
Yu Xianzi, Gao Yinglian, Ma Chunxia, Liu Jinxing. The Penalized Matrix Decomposition Method of Extracting Core Characteristic Words: Taking Co-word Analysis as an Example. New Technology of Library and Information Service, 2014, 30(3): 88-95.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2014.03.13      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2014/V30/I3/88
