Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 151-166    DOI: 10.11925/infotech.2096-3467.2021.0947
An Analysis Framework for Job Demands from Job Postings
Yue Tieqi, Fu Youfei, Xu Jian
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China

[Objective] This paper proposes a complete and systematic framework for analyzing the qualifications required in online job postings, and applies the framework to examine the requirements of Internet-related jobs. [Methods] First, we retrieved recruitment advertisements for the Internet industry. Then, we built an LDA model for topic mining and classification of job descriptions. Finally, we used the Word2Vec model and dependency parsing to obtain the topic-word and degree-word lists, from which we constructed the topic ontology. [Results] The empirical analysis revealed the current state of Internet industry positions, including their regional and category distributions and the qualifications required for different types of positions. [Limitations] Few data samples were available for campus recruitment, which caused the analysis results to deviate from the actual situation. Word segmentation for the LDA model was imperfect, and some topics were not representative. [Conclusions] The proposed framework can effectively analyze job postings.

Key words: Recruitment Advertisement; Job Demand Analysis; LDA Topic Model; Ontology
Received: 31 August 2021      Published: 14 April 2022
ZTFLH:  TP274  
Fund: Undergraduate Teaching Quality Project of Sun Yat-Sen University (20000-31911130)
Corresponding Author: Xu Jian, ORCID: 0000-0003-4886-4708

Cite this article:

Yue Tieqi, Fu Youfei, Xu Jian. An Analysis Framework for Job Demands from Job Postings. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 151-166.


Job Demand Analysis Framework
Topic Coherence Scores for Different Numbers of Topics
Topic Model Visualization with 6 Topics
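| Topic | Subtopic | Topic word |
| --- | --- | --- |
| Topic 2: Personal quality and competence | Mental qualities | Innovation ability |
| | Work ability | Execution ability |
Topic Words for Personal Quality and Competence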
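| Topic | Subtopic | Topic word | General degree words | Stronger degree words | Strong degree words |
| --- | --- | --- | --- | --- | --- |
| Topic 1: Business skill requirements | Marketing operations | Operations | know about (了解, 6), comprehend (理解, 1), understand (懂, 1) | familiar with (熟悉, 16), have done (做过, 2), like (喜欢, 2) | love (热爱, 16) |
| | | Promotion | know about (了解, 7) | familiar with (熟悉, 14), do well (做好, 3), have a command of (掌握, 1), like (喜欢, 1) | love (热爱, 2), proficient in (精通, 1) |
| | | Research | know about (了解, 1) | | |
| | Sales and customer management | Product sales | | | |
| | | Customer relations | | do well (做好, 8) | |
Topic Word and Degree Word List (Partial)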
Computer Technology Topic Ontology
Share of Positions in the Top 5 Provinces in the Two Time Periods
Percentage of Job Types in the Two Time Periods
Word Frequency Share of Each Job Demand Topic in the Two Time Periods
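| November 2015 to April 2016: topic word | Share | October 2019 to November 2019: topic word | Share |
| --- | --- | --- | --- |
| Promotion | 7.06% | Bachelor's degree | 9.22% |
| Operations | 6.71% | Operations | 7.74% |
| Sense of responsibility | 6.45% | Learning ability | 5.13% |
| Communication skills | 5.52% | Communication skills | 5.03% |
| Learning ability | 4.27% | Algorithms | 4.63% |
| Junior college diploma | 3.69% | Sense of responsibility | 4.36% |
| Teamwork spirit | 3.63% | Python | 3.69% |
| Bachelor's degree | 2.97% | C++ | 3.49% |
| Collection | 2.83% | Data analysis | 3.33% |
| Execution ability | 2.23% | Java | 2.97% |
Top 10 Topic Words by Share in the Two Time Periods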
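As a rough illustration of how shares like those in the table could be computed, the sketch below counts topic-word occurrences across tokenized postings and divides each count by the total. The denominator (all topic-word occurrences), the function name `topic_word_shares`, and the toy postings are assumptions made for illustration; the page does not state the exact formula the authors used.

```python
from collections import Counter


def topic_word_shares(postings, topic_words):
    """Return each topic word's share of all topic-word occurrences.

    postings: iterable of token lists (one list per job posting).
    topic_words: set of words that count as topic words.
    Assumption: share = word count / total topic-word count.
    """
    counts = Counter()
    for tokens in postings:
        for token in tokens:
            if token in topic_words:
                counts[token] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() yields words in descending order of frequency.
    return {word: count / total for word, count in counts.most_common()}


# Hypothetical toy data, not taken from the study's corpus.
postings = [
    ["operations", "promotion", "communication skills"],
    ["operations", "Python", "bachelor's degree"],
    ["operations", "communication skills"],
]
vocabulary = {"operations", "promotion", "communication skills",
              "Python", "bachelor's degree"}
shares = topic_word_shares(postings, vocabulary)
for word, share in shares.items():
    print(f"{word}: {share:.2%}")
```

Running the same function on the postings of each time period separately would give two rankings that can be compared side by side.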
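| Job category | Topic 1: | | | | | |
| --- | --- | --- | --- | --- | --- | --- |
| Technical | 0.340 | 0.888 | 1.970 | 0.770 | 0.537 | 0.936 |
| Operations | 3.065 | 1.115 | 0.045 | 1.010 | 2.092 | 0.842 |
| Marketing and sales | 1.670 | 1.142 | 0.032 | 0.978 | 0.801 | 0.756 |
| Functional | 0.764 | 1.201 | 0.016 | 0.762 | 0.662 | 1.038 |
| Design | 0.278 | 0.705 | 0.081 | 2.446 | 1.284 | 2.002 |
| Product | 1.852 | 1.386 | 0.108 | 1.066 | 2.710 | 0.877 |
| Finance | 0.784 | 1.189 | 0.187 | 1.720 | 0.502 | 1.028 |
Relevance Between Topics and Positions from October 2019 to November 2019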
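| Rank | November 2015 to April 2016: topic word node | Degree centrality | October 2019 to November 2019: topic word node | Degree centrality |
| --- | --- | --- | --- | --- |
| 1 | Sense of responsibility | 103 | Bachelor's degree | 87 |
| 2 | Communication skills | 99 | Teamwork spirit | 81 |
| 3 | Teamwork spirit | 97 | C++ | 81 |
| 4 | Learning ability | 96 | Python | 80 |
| 5 | Javascript | 94 | Learning ability | 78 |
| 6 | Bachelor's degree | 91 | Java | 77 |
| 7 | HTML | 90 | Communication skills | 76 |
| 8 | Database | 89 | Sense of responsibility | 75 |
| 9 | CSS | 88 | Algorithms | 74 |
| 10 | Operations | 87 | Operations | 67 |
Top 10 Topic Word Nodes by Degree Centrality in the Two Time Periods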
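Degree centrality in a co-occurrence network can be sketched as follows. The example assumes that two topic words are linked when they appear in the same posting and that centrality is the raw number of distinct neighbors, which would be consistent with the integer values in the table; the page does not describe how the authors built their network. The function name and toy postings are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(postings):
    """Count, for each topic word, how many distinct words it co-occurs with.

    postings: iterable of collections of topic words, one per job posting.
    Assumption: an undirected edge joins every pair of words in a posting.
    """
    neighbors = defaultdict(set)
    for words in postings:
        # Deduplicate within a posting, then link every unordered pair.
        for a, b in combinations(sorted(set(words)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {word: len(adjacent) for word, adjacent in neighbors.items()}


# Hypothetical toy data, not taken from the study's corpus.
postings = [
    {"Python", "algorithms", "bachelor's degree"},
    {"Python", "Java", "teamwork spirit"},
    {"Java", "bachelor's degree"},
]
degrees = degree_centrality(postings)
# Rank by descending degree, breaking ties alphabetically.
top = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
for word, degree in top:
    print(word, degree)
```

A normalized variant would divide each count by the number of other nodes in the network; the raw count is used here only to match the scale of the table.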
Co-occurrence Network Diagram of Topic Words for Technical Positions
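| Group 1: topic word node | Degree centrality | Group 2: topic word node | Degree centrality | Group 3: topic word node | Degree centrality |
| --- | --- | --- | --- | --- | --- |
| Sense of responsibility | 103 | Communication skills | 99 | Database | 89 |
| Learning ability | 96 | Teamwork spirit | 97 | Operations | 87 |
| Bachelor's degree | 91 | Javascript | 94 | Java | 85 |
| Android | 82 | HTML | 90 | Linux | 83 |
| Expression skills | 81 | CSS | 88 | Python | 77 |
| Mobile Internet | 81 | Jquery | 87 | Operating systems | 70 |
| Coordination ability | 80 | Ajax | 83 | Research | 69 |
| Analytical ability | 79 | Junior college diploma | 80 | Mathematics | 60 |
| Algorithms | 78 | Execution ability | 70 | Communications | 58 |
| C++ | 78 | Product design | 69 | Data analysis | 56 |
Top 10 Topic Words by Degree Centrality in Each Group for Technical Positions from November 2015 to April 2016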
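| Group 1: topic word node | Degree centrality | Group 2: topic word node | Degree centrality | Group 3: topic word node | Degree centrality |
| --- | --- | --- | --- | --- | --- |
| Javascript | 60 | C++ | 81 | Bachelor's degree | 87 |
| HTML | 54 | Python | 80 | Teamwork spirit | 81 |
| CSS | 47 | Java | 77 | Learning ability | 78 |
| Ajax | 34 | Algorithms | 74 | Communication skills | 76 |
| Jquery | 25 | Mathematics | 66 | Sense of responsibility | 75 |
| XHTML | 22 | Linux | 60 | Operations | 67 |
| Interaction design | 21 | Master's degree | 59 | Database | 63 |
| dom | 20 | Computer science major | 57 | Communications | 63 |
| Flash | 14 | Software engineering | 56 | Expression skills | 55 |
| | | Machine learning | 55 | Curiosity | 52 |
Top 10 Topic Words by Degree Centrality in Each Group for Technical Positions from October 2019 to November 2019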
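| Company | Position | Job description |
| --- | --- | --- |
| Yonyou Network | Software test engineer | 1. Responsible for routine product testing; use automation tools to record, debug, and replay scripts; |
| NetEase | Content operations | 1. Coordinate with related business lines, such as marketing and operations, to explore how content can help the business attract followers and promote brand strength; |
The Recruitment Text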
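| Company and position | Topic word or scoring item | Matched degree words or scoring topic words | Final score of the topic word or scoring item |
| --- | --- | --- | --- |
| Yonyou Network: Software test engineer | Learning ability | ['strong (强)'] | 2 |
| | Teamwork spirit | ['possesses (具有)'] | 1 |
| | Sense of responsibility | ['has (有)'] | 1 |
| | Python | ['have a command of (掌握)'] | 2 |
| | Test cases | | 1 |
| | Test reports | | 1 |
| | Education | ['bachelor's degree (本科)'] | 2 |
| | Communication skills | ['strong (强)', 'has (有)'] | 2 |
| NetEase: Content operations | Learning ability | ['excellent (优秀)', 'possesses (具有)'] | 3 |
| | Coordination ability | ['has (有)'] | 1 |
| | Education | ['bachelor's degree (本科)'] | 2 |
| | Operations | | 1 |
| | Promotion | | 1 |
| | Data analysis | | 1 |
| | New media | | 1 |
Topic Words and Scoring Items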
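The scores in the table are consistent with a simple rule: each matched degree word maps to a level from 1 to 3, the item takes the highest level among its matched words, and an item with no matched degree word scores 1. The sketch below implements that rule. It is inferred from the table rows only and is not confirmed as the authors' method; `DEGREE_LEVELS` and `score_item` are hypothetical names, and the lexicon contains only the words that appear in the table.

```python
# Hypothetical degree-word lexicon, inferred from the scores in the table.
DEGREE_LEVELS = {
    "有": 1,    # "has"
    "具有": 1,  # "possesses"
    "强": 2,    # "strong"
    "掌握": 2,  # "have a command of"
    "本科": 2,  # "bachelor's degree" (education level)
    "优秀": 3,  # "excellent"
}


def score_item(matched_degree_words, default=1):
    """Score one topic word from the degree words matched alongside it.

    Assumption: the item takes the highest level among its matched degree
    words; with no recognized degree word it falls back to `default`.
    """
    levels = [DEGREE_LEVELS[word]
              for word in matched_degree_words
              if word in DEGREE_LEVELS]
    return max(levels, default=default)


# Rows reproduced from the table above.
print(score_item(["强", "有"]))      # Communication skills
print(score_item(["优秀", "具有"]))  # Learning ability (NetEase)
print(score_item([]))                # Test cases (no degree word)
```

Taking the maximum rather than the sum keeps a posting that repeats weak qualifiers from outscoring one that states a single strong requirement.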
Job Demands Topic Scoring Radar Chart