In processing the massive volume of information on the Internet, the main function of an Internet science and technology information collection and organization system is to use directed tracking to collect online science and technology information, organize it, and incorporate it into a system framework. This paper discusses the design and implementation of a science and technology information collection system based on web page segmentation. It examines in detail the implementation strategies for key techniques such as web page segmentation and data deduplication, and it discusses the advantages of the system and the convenience it could bring to scientific and technical research once widely adopted.