Towards Context Query Information Extraction Based on Single Document*

Hang Yueqin (杭月芹)¹, Yao Ying (姚滢)¹, Shen Jie (沈洁)²

¹(Institute of Computer Science and Technology, Nantong University, Nantong 226006, China)
²(Department of Computer Science and Engineering, College of Information Engineering, Yangzhou University, Yangzhou 225009, China)

New Technology of Library and Information Service (现代图书情报技术), 2006, Vol. 1, Issue (10): 30-33
Section: Information Retrieval Technology
https://doi.org/10.11925/infotech.1003-3513.2006.10.07


Abstract

This paper proposes an algorithm that combines global analysis with local analysis to extract query information from a single document. In the global analysis step, keywords are extracted from the whole document to reflect the user's query interest. In the local analysis step, the query is disambiguated by extracting keywords from the text surrounding the marked query. Experimental results show that the method reflects the contextual information of the user's query more comprehensively and improves the relevance of the retrieved information.

Key words: Search preference, Information extraction, Context information, Query expansion
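To make the two steps in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. This page does not give the scoring the paper uses, so everything here is an assumption for illustration: plain term frequency stands in for keyword scoring in both steps, the query is a single word, and the names `STOP_WORDS`, `tokenize`, `global_keywords`, `local_keywords` and `expand_query`, as well as the `window` and `k` parameters, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for",
              "by", "with", "that", "it", "as", "are", "was", "at", "or"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def global_keywords(document, k=5):
    """Global analysis (assumed scoring): the k most frequent terms in the whole document."""
    return [term for term, _ in Counter(tokenize(document)).most_common(k)]


def local_keywords(document, query, window=5, k=3):
    """Local analysis (assumed scoring): the k most frequent terms found within
    `window` tokens of each occurrence of the single-word query."""
    tokens = tokenize(document)
    q = query.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == q:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(t for t in tokens[lo:hi] if t != q)
    return [term for term, _ in counts.most_common(k)]


def expand_query(document, query, k_global=5, k_local=3):
    """Return the query followed by local, then global keywords, without duplicates."""
    expanded = [query.lower()]
    candidates = local_keywords(document, query, k=k_local) + global_keywords(document, k=k_global)
    for term in candidates:
        if term not in expanded:
            expanded.append(term)
    return expanded
```

Called as `expand_query(document_text, "jaguar")`, the sketch returns the query word followed by terms that co-occur near it and terms that dominate the document as a whole. Placing local terms first is a design choice of this sketch, on the assumption that terms near the marked query disambiguate it most directly.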

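Received: 2006-05-24    Published: 2006-10-25

TP301.6

Funding:

*This work is one of the research results of the Natural Science Foundation of Jiangsu Higher Education Institutions project "Intelligent Distributed Web Information Processing Research" (Project No. 02KJB520013).

Corresponding author: Hang Yueqin    E-mail: yueqinhang@163.com

Authors: Hang Yueqin, Yao Ying, Shen Jie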

Cite this article:

Hang Yueqin, Yao Ying, Shen Jie. Towards Context Query Information Extraction Based on Single Document. New Technology of Library and Information Service, 2006, 1(10): 30-33.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2006.10.07 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2006/V1/I10/30

