To discretize continuous attributes as a preprocessing step when applying rough set theory to data, a method is adopted that uses the k-means algorithm for discretization, clustering each attribute without supervision into two classes. Experiments were carried out on four data sets selected from the UCI database: the data were first discretized, then reduced using rough sets, and finally classified with a kNN classifier (k = 10), and the results were compared with those of two other discretization methods. The results show that the method improves the efficiency of discretization, reduces the complexity of the experiment, and effectively reduces the number of breakpoints.
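The discretization step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the centroids are initialized or how the breakpoint is derived from the two clusters. The sketch assumes the centroids start at the attribute's minimum and maximum and that the breakpoint is the midpoint between the two final centroids. The names `two_means_cut` and `discretize` are hypothetical. The rough set reduction and the kNN classification are outside its scope.

```python
def two_means_cut(values, max_iter=100):
    """Run 1-D k-means with k = 2 and return a single breakpoint.

    Assumptions (not stated in the abstract): centroids start at the
    minimum and maximum, and the breakpoint is the midpoint between
    the two final centroids.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # constant attribute: every value falls in one class
    c0, c1 = lo, hi
    for _ in range(max_iter):
        mid = (c0 + c1) / 2.0
        # Assign each value to the nearer centroid (ties go to the left).
        left = [v for v in values if v <= mid]
        right = [v for v in values if v > mid]
        new_c0 = sum(left) / len(left)
        new_c1 = sum(right) / len(right)
        if new_c0 == c0 and new_c1 == c1:
            break  # centroids stopped moving
        c0, c1 = new_c0, new_c1
    return (c0 + c1) / 2.0


def discretize(rows):
    """Discretize every column of a numeric table into the codes 0 and 1.

    Returns the coded table and the list of breakpoints, one per attribute.
    """
    columns = list(zip(*rows))
    cuts = [two_means_cut(list(col)) for col in columns]
    coded = [[0 if v <= cut else 1 for v, cut in zip(row, cuts)]
             for row in rows]
    return coded, cuts


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    table = [[1.0, 100.0], [2.0, 110.0], [10.0, 300.0], [11.0, 310.0]]
    coded, cuts = discretize(table)
    print(cuts)
    print(coded)
```

Because each attribute is split into exactly two classes, this scheme yields one breakpoint per attribute, which is consistent with the abstract's claim of a small number of breakpoints.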