[Purpose/Significance] This paper presents a theoretical study of data cleaning in the process of identifying research hotspots in a field with the keyword co-occurrence method, with the aim of helping researchers identify those hotspots accurately. [Method/Process] Building on a literature review, it sets out the definition and objects of data cleaning, analyzes the causes and effects of dirty data, and then formulates the steps and a scheme for data cleaning. An empirical study is used to verify the effect of the cleaning and the feasibility of the scheme. [Result/Conclusion] The results show that the data cleaning scheme improves the accuracy of research hotspot identification, which demonstrates the feasibility of the scheme.
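To illustrate why dirty keyword data can distort co-occurrence results, the following is a minimal sketch in Python. It is not the cleaning scheme proposed in the paper, whose concrete steps are not given in this abstract. The synonym table, the function names `clean_keywords` and `cooccurrence_counts`, and the sample records are all hypothetical and serve only to show the general idea: variant spellings, inconsistent case, and empty entries split one keyword pair into several low-frequency pairs, while a simple normalization merges them again.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym table (an assumption for illustration only):
# maps variant terms to one canonical keyword.
SYNONYMS = {
    "co-word analysis": "keyword co-occurrence",
    "cooccurrence": "keyword co-occurrence",
}


def clean_keywords(keywords, synonyms=SYNONYMS):
    """Normalize case and whitespace, merge synonyms, drop empties and duplicates."""
    cleaned = []
    for kw in keywords:
        # Lowercase and collapse runs of whitespace into single spaces.
        norm = " ".join(kw.strip().lower().split())
        # Replace a known variant with its canonical form.
        norm = synonyms.get(norm, norm)
        if norm and norm not in cleaned:
            cleaned.append(norm)
    return cleaned


def cooccurrence_counts(records):
    """Count unordered keyword pairs that appear in the same record."""
    counts = Counter()
    for keywords in records:
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


# Hypothetical keyword lists from three papers, containing typical dirty data.
records = [
    ["Data Cleaning", "Co-word analysis"],
    ["data  cleaning ", "keyword co-occurrence"],
    ["data cleaning", "Keyword Co-occurrence", ""],
]

raw = cooccurrence_counts(records)
clean = cooccurrence_counts([clean_keywords(r) for r in records])

print("raw pairs:", len(raw), "highest count:", max(raw.values()))
print("clean pairs:", len(clean), "highest count:", max(clean.values()))
```

On the uncleaned records, every pair occurs only once, so no pair stands out as a hotspot. After cleaning, the three records agree on a single pair, which is the kind of effect on hotspot identification that the abstract attributes to data cleaning.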