New Technology of Library and Information Service  2007, Vol. 2 Issue (12): 1-5    DOI: 10.11925/infotech.1003-3513.2007.12.01
Review on the Application Research of ETL in Digital Library
Huang Yongwen1, Li Guangjian2
1(National Science Library,Chinese Academy of Sciences,Beijing  100080,China)
2(School of Management,Beijing Normal University, Beijing  100875,China)
Abstract  

This paper reviews research on ETL applications in digital libraries and analyzes the classification and application fields of ETL in resource construction, user services, resource sharing, and system interoperability.

Key words: Digital library; ETL application; Information extraction; Data cleaning
Received: 15 October 2007      Published: 25 December 2007
CLC number: G250.76
Corresponding author: Huang Yongwen     E-mail: hyongwen@mail.las.ac.cn
About the authors: Huang Yongwen, Li Guangjian

Cite this article:

Huang Yongwen, Li Guangjian. Review on the Application Research of ETL in Digital Library. New Technology of Library and Information Service, 2007, 2(12): 1-5.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.12.01     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I12/1

