Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (10): 144-155    DOI: 10.11925/infotech.2096-3467.2022.1262
Constructing Multimodal Corpus of Chinese Vocabulary for Sign Language Linguistics
Zhang Yanqiong1,2,3, Zhu Zhaosong1,2, Zhao Xiaochi2
1School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
2Braille and Sign Language Research Center, Nanjing Normal University of Special Education, Nanjing 210038, China
3Jiangsu Provincial Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
Abstract  

[Objective] This paper extracts and organizes knowledge from multimodal sign language resources and constructs a corpus for related research, meeting the public's urgent demand for access to sign language knowledge. [Context] The new multimodal corpus is suitable for mining sign language knowledge and addresses the low level of informatization, disordered resource organization, and poor usability of existing sign language knowledge resources. [Methods] First, we constructed a multimodal feature annotation system for sign language vocabulary. Second, we formulated a feature coding scheme for the vocabulary and implemented multi-level annotation. Finally, we established a graph model of sign language vocabulary and a Neo4j database to store and visualize the annotated data. [Results] With vocabulary drawn from the national sign language vocabulary corpus, multimodal annotation of more than 10,000 sign language words has been completed, realizing the whole workflow of constructing a multimodal corpus. [Conclusions] The new corpus adds knowledge retrieval by handshape, movement, expression, and posture, which greatly improves the usability of the sign language corpus.
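As a rough sketch of the multi-level annotation described in the abstract, the Python fragment below shows one possible in-memory representation of a sign word's multimodal features before they are loaded into the graph database. It is illustrative only: the class names, field names, and the example handshape code (DEZ101, taken from the handshape-search figure) are assumptions, not the authors' actual coding scheme.

# Illustrative sketch of a multimodal annotation record for one sign word.
# Field names and example code values are assumptions; the paper defines its
# own feature annotation system and coding scheme.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BasicGesture:
    dominant_handshape: str                  # handshape code, e.g. "DEZ101"
    nondominant_handshape: Optional[str]     # None for one-handed signs
    orientation: str                         # palm/finger orientation code
    location: str                            # articulation location code
    path_movement: Optional[str] = None      # path action, if any
    local_movement: Optional[str] = None     # local action (e.g. finger wiggle)
    nonmanual: Optional[str] = None          # facial expression / posture feature

@dataclass
class SignWordAnnotation:
    gloss: str                                                    # e.g. "teacher"
    gestures: List[BasicGesture] = field(default_factory=list)    # segmented basic gestures, in order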

Key words: Chinese Sign Language; Vocabulary; Corpus; Knowledge Organization; Multimodal
Received: 27 November 2022      Published: 30 March 2023
ZTFLH: TP391; G250
Fund: National Social Sciences Fund of China (20BTQ065)
Corresponding Author: Zhang Yanqiong, ORCID: 0000-0003-4372-1003, E-mail: zhangyanqiong@njts.edu.cn

Cite this article:

Zhang Yanqiong, Zhu Zhaosong, Zhao Xiaochi. Constructing Multimodal Corpus of Chinese Vocabulary for Sign Language Linguistics. Data Analysis and Knowledge Discovery, 2023, 7(10): 144-155.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1262     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I10/144

Construction Framework of Multimodal Corpus of Chinese Sign Language Vocabulary
Description Scheme of Phonological Features of Hong Kong Sign Language Vocabulary
Multimodal Feature Representation Model of Sign Language Vocabulary
Sign Language Multimodal Feature Labeling Content
Sign Language Type
Location Code
Sign Word “Painter” and Multimodal Information Annotation
Entity 1 | Relation Type | Entity 2
Sign word | Gesture segmentation | Basic gesture
Basic gesture | Dominant-hand handshape | Handshape
Basic gesture | Non-dominant-hand handshape | Handshape
Basic gesture | Handshape change | Handshape
Basic gesture | Dominant-hand selected part | Selected part of hand
Basic gesture | Non-dominant-hand selected part | Selected part of hand
Basic gesture | Dominant-hand orientation | Orientation
Basic gesture | Non-dominant-hand orientation | Orientation
Basic gesture | Orientation change | Orientation
Basic gesture | Articulation location | Location
Basic gesture | Location change | Location
Basic gesture | Non-manual feature | Non-manual information
Basic gesture | Path movement | Path action
Basic gesture | Local movement | Local action
Path action | Temporal relation | Path action
Local action | Temporal relation | Local action
Entity and Entity Relationship in the Multimodal Corpus of Sign Language Vocabulary
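To indicate how the entity and relationship types in the table above could be stored and retrieved, the sketch below uses the official Neo4j Python driver: it merges one sign word, one basic gesture, and two feature nodes, then looks up all sign words whose dominant handshape matches a given code (the handshape-search example below uses code DEZ101). The node labels, relationship types, property keys, and connection settings are illustrative English renderings, not the authors' exact schema.

# Sketch: storing and querying sign-word annotations in Neo4j, following the
# Entity 1 / Relation Type / Entity 2 table above. Labels, relationship types,
# property names, and connection settings are illustrative assumptions.
from neo4j import GraphDatabase

CREATE_SIGN = """
MERGE (w:SignWord {gloss: $gloss})
MERGE (g:BasicGesture {id: $gesture_id})
MERGE (h:Handshape {code: $handshape})
MERGE (l:Location {code: $location})
MERGE (w)-[:GESTURE_SEGMENTATION]->(g)
MERGE (g)-[:DOMINANT_HANDSHAPE]->(h)
MERGE (g)-[:ARTICULATION_LOCATION]->(l)
"""

FIND_BY_HANDSHAPE = """
MATCH (w:SignWord)-[:GESTURE_SEGMENTATION]->(g:BasicGesture),
      (g)-[:DOMINANT_HANDSHAPE]->(h:Handshape {code: $handshape})
RETURN DISTINCT w.gloss AS gloss
"""

def main() -> None:
    # Placeholder connection details.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    with driver.session() as session:
        session.run(CREATE_SIGN, gloss="teacher", gesture_id="teacher-1",
                    handshape="DEZ101", location="LOC-HEAD")
        for record in session.run(FIND_BY_HANDSHAPE, handshape="DEZ101"):
            print(record["gloss"])   # sign words with dominant handshape DEZ101
    driver.close()

if __name__ == "__main__":
    main()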
Example of Sign Language Annotation
Sign Word “Teacher”
Visualization of Sign Language Vocabulary Annotation Information Based on Neo4j
Example of Handshape Search (No. DEZ101)
Example of Entity Relationship