Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (10): 144-155    DOI: 10.11925/infotech.2096-3467.2022.1262
Constructing Multimodal Corpus of Chinese Vocabulary for Sign Language Linguistics
Zhang Yanqiong1,2,3, Zhu Zhaosong1,2, Zhao Xiaochi2
1School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
2Braille and Sign Language Research Center, Nanjing Normal University of Special Education, Nanjing 210038, China
3Jiangsu Provincial Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
Abstract  

[Objective] This paper extracts and organizes knowledge from multimodal sign language resources and constructs a corpus for related research, meeting the public's urgent demand for access to sign language knowledge. [Context] The new multimodal corpus is suitable for mining sign language knowledge and addresses the low level of informatization, disordered resource organization, and poor usability of existing sign language knowledge resources. [Methods] First, we constructed a multimodal feature annotation system for sign language vocabulary. Second, we formulated a feature coding scheme for the vocabulary and implemented multi-level annotation. Finally, we established a graph model of sign language vocabulary and a Neo4j database to store and visualize the annotated data. [Results] With vocabulary drawn from the national sign language vocabulary corpus, multimodal annotation of more than 10,000 sign language words has been completed, realizing the whole workflow of constructing a multimodal corpus. [Conclusions] The new corpus adds knowledge retrieval by handshape, movement, expression, and posture, which greatly improves the usability of the sign language corpus.
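As a rough sketch of the multi-level annotation described in the abstract, the Python fragment below shows one possible in-memory representation of a sign word's multimodal features before they are loaded into the graph database. It is illustrative only: the class names, field names, and the example handshape code (DEZ101, taken from the handshape-search figure) are assumptions, not the authors' actual coding scheme.

# Illustrative sketch of a multimodal annotation record for one sign word.
# Field names and example code values are assumptions; the paper defines its
# own feature annotation system and coding scheme.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BasicGesture:
    dominant_handshape: str                  # handshape code, e.g. "DEZ101"
    nondominant_handshape: Optional[str]     # None for one-handed signs
    orientation: str                         # palm/finger orientation code
    location: str                            # articulation location code
    path_movement: Optional[str] = None      # path action, if any
    local_movement: Optional[str] = None     # local action (e.g. finger wiggle)
    nonmanual: Optional[str] = None          # facial expression / posture feature

@dataclass
class SignWordAnnotation:
    gloss: str                                                    # e.g. "teacher"
    gestures: List[BasicGesture] = field(default_factory=list)    # segmented basic gestures, in order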

Key words: Chinese Sign Language; Vocabulary; Corpus; Knowledge Organization; Multimodal
Received: 27 November 2022      Published: 30 March 2023
ZTFLH: TP391; G250
Fund: National Social Sciences Fund of China (20BTQ065)
Corresponding Author: Zhang Yanqiong, ORCID: 0000-0003-4372-1003, E-mail: zhangyanqiong@njts.edu.cn

Cite this article:

Zhang Yanqiong, Zhu Zhaosong, Zhao Xiaochi. Constructing Multimodal Corpus of Chinese Vocabulary for Sign Language Linguistics. Data Analysis and Knowledge Discovery, 2023, 7(10): 144-155.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1262     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I10/144

Construction Framework of Multimodal Corpus of Chinese Sign Language Vocabulary
Description Scheme of Phonological Features of Hong Kong Sign Language Vocabulary
Multimodal Feature Representation Model of Sign Language Vocabulary
Sign Language Multimodal Feature Labeling Content
Sign Language Type
Location Code
Sign Word “Painter” and Multimodal Information Annotation
Entity 1 | Relation Type | Entity 2
Sign word | Gesture segmentation | Basic gesture
Basic gesture | Dominant-hand handshape | Handshape
Basic gesture | Non-dominant-hand handshape | Handshape
Basic gesture | Handshape change | Handshape
Basic gesture | Dominant-hand selected part | Selected part of hand
Basic gesture | Non-dominant-hand selected part | Selected part of hand
Basic gesture | Dominant-hand orientation | Orientation
Basic gesture | Non-dominant-hand orientation | Orientation
Basic gesture | Orientation change | Orientation
Basic gesture | Articulation location | Location
Basic gesture | Location change | Location
Basic gesture | Non-manual feature | Non-manual information
Basic gesture | Path movement | Path action
Basic gesture | Local movement | Local action
Path action | Temporal relation | Path action
Local action | Temporal relation | Local action
Entity and Entity Relationship in the Multimodal Corpus of Sign Language Vocabulary
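To indicate how the entity and relationship types in the table above could be stored and retrieved, the sketch below uses the official Neo4j Python driver: it merges one sign word, one basic gesture, and two feature nodes, then looks up all sign words whose dominant handshape matches a given code (the handshape-search example below uses code DEZ101). The node labels, relationship types, property keys, and connection settings are illustrative English renderings, not the authors' exact schema.

# Sketch: storing and querying sign-word annotations in Neo4j, following the
# Entity 1 / Relation Type / Entity 2 table above. Labels, relationship types,
# property names, and connection settings are illustrative assumptions.
from neo4j import GraphDatabase

CREATE_SIGN = """
MERGE (w:SignWord {gloss: $gloss})
MERGE (g:BasicGesture {id: $gesture_id})
MERGE (h:Handshape {code: $handshape})
MERGE (l:Location {code: $location})
MERGE (w)-[:GESTURE_SEGMENTATION]->(g)
MERGE (g)-[:DOMINANT_HANDSHAPE]->(h)
MERGE (g)-[:ARTICULATION_LOCATION]->(l)
"""

FIND_BY_HANDSHAPE = """
MATCH (w:SignWord)-[:GESTURE_SEGMENTATION]->(g:BasicGesture),
      (g)-[:DOMINANT_HANDSHAPE]->(h:Handshape {code: $handshape})
RETURN DISTINCT w.gloss AS gloss
"""

def main() -> None:
    # Placeholder connection details.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    with driver.session() as session:
        session.run(CREATE_SIGN, gloss="teacher", gesture_id="teacher-1",
                    handshape="DEZ101", location="LOC-HEAD")
        for record in session.run(FIND_BY_HANDSHAPE, handshape="DEZ101"):
            print(record["gloss"])   # sign words with dominant handshape DEZ101
    driver.close()

if __name__ == "__main__":
    main()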
Example of Sign Language Annotation
Sign Word “Teacher”
Visualization of Sign Language Vocabulary Annotation Information Based on Neo4j
Example of Handshape Search (No. DEZ101)
Example of Entity Relationship