This study aimed to establish a random forest approach for identifying and classifying different brands of Xiasangju (夏桑菊) granules, and to provide a useful reference for discriminating complex, multi-index fingerprints. Fingerprints of 83 batches of Xiasangju granules from different brands were acquired by high-performance liquid chromatography (HPLC), and principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and random forest were compared on how they handle complex data from samples of different classes. HPLC fingerprints were successfully established for all 83 batches. In the comparison of pattern recognition methods, PCA explained only 56.52% of the variance and could not fully separate the samples; PLS-DA outperformed PCA, achieving partial separation and explaining 63.43% of the total variance; random forest classified the samples cleanly into three classes, with a 10-fold cross-validation accuracy of 96.5% across the three classes. Random forest combined with HPLC fingerprinting can therefore be used effectively to build a quality control and analysis system for traditional Chinese medicine.
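To make the evaluation step concrete, the following is a minimal sketch of a random forest scored by 10-fold cross-validation, written in plain Python. It is not the authors' implementation: the data are synthetic stand-ins for peak areas (60 samples in 3 classes rather than the 83 real batches), and every name and setting here (`make_synthetic`, `fit_forest`, `cross_val_accuracy`, 15 trees, depth 4, unstratified folds) is an assumption chosen for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, depth=0, max_depth=4):
    """Grow one tree; a leaf is a class label, a node is (feature, threshold, left, right)."""
    if depth >= max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    n_feat = len(X[0])
    # Random feature subset at each split (about sqrt of the feature count).
    feats = rng.sample(range(n_feat), max(1, int(n_feat ** 0.5)))
    best = None
    for f in feats:
        # Dropping the largest value guarantees both sides of the split are non-empty.
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # every sampled feature was constant in this node
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=15, seed=0):
    """Train each tree on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

def cross_val_accuracy(X, y, k=10, seed=0):
    """Plain (unstratified) k-fold cross-validation accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        forest = fit_forest([X[i] for i in train], [y[i] for i in train], seed=seed)
        correct += sum(predict_forest(forest, X[i]) == y[i] for i in fold)
    return correct / len(X)

def make_synthetic(n_per_class=20, n_peaks=8, seed=1):
    """Synthetic 'peak areas': the first 4 peaks shift with the class, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            row = []
            for p in range(n_peaks):
                mean = 10.0 + 4.0 * label if p < 4 else 10.0
                row.append(rng.gauss(mean, 1.0))
            X.append(row)
            y.append(label)
    return X, y

X, y = make_synthetic()
acc = cross_val_accuracy(X, y)
print(f"10-fold CV accuracy on synthetic data: {acc:.3f}")
```

With real fingerprints, each row of `X` would hold the peak areas of one batch and `y` the brand label. The accuracy printed here reflects only the synthetic data and says nothing about the 96.5% reported above.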