Research on an Automatic Indexing Method for Scientific and Technical Literature Based on a Multiple-Filtering Strategy

Source: 情报理论与实践 (Information Studies: Theory & Application)
This paper proposes an automatic indexing method for scientific and technical literature based on a multiple-filtering strategy. The method does not depend on a large-scale training corpus and can easily be embedded as a processing module in other text-processing pipelines; experimental results confirm its feasibility. The paper also proposes a method for evaluating index terms based on secondary literature. Although this evaluation method depends heavily on the quality of the abstracts and keywords provided in the secondary literature, it is valuable when human and material resources are insufficient to build a high-quality test set. Developing a more reasonable and effective evaluation scheme remains imperative.