Paclitaxel is a natural anticancer drug substance extracted from yew trees (Taxus species) and has a unique anticancer mechanism. Because paclitaxel has a number of limitations, developing paclitaxel-like drugs with higher anticancer activity is a promising direction. Taxane diterpenoids are second-generation paclitaxel-type compounds obtained by successive structural modification of the paclitaxel parent structure. In this work, 30 structurally diverse taxane diterpenoids were used as the data set; 24 were randomly selected as the training set and the remaining 6 served as the test set. Multiple linear regression (MLR) and principal component regression (regression on components from principal component analysis, PCA) were applied to 195 molecular descriptors computed for each compound, and an optimal quantitative structure-activity relationship (QSAR) prediction model was established with each method. The test set was then used to assess the predictive ability of the models. Comparing the MLR models with the principal component regression models showed that stepwise selection was the best modeling method. The resulting model gave good statistics on the training set (R = 0.782, SEE = 0.202) and satisfactory results on the test set (R = 0.764, SEP = 0.114), indicating that the model is reliable and predictive. The model and the main influencing factors it identifies can help guide the screening and development of new paclitaxel analogues.
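The workflow summarized above (random 24/6 split, descriptor selection with MLR, principal component regression, then R, SEE and SEP) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the descriptor matrix and activities are synthetic random numbers, not the taxane data; greedy forward selection is a simplified stand-in for the stepwise procedure; the function names are hypothetical; and the SEE and SEP formulas are common textbook definitions, since the abstract does not state which definitions were used.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def corr_r(y, y_hat):
    """Correlation coefficient R between observed and predicted activity."""
    return float(np.corrcoef(y, y_hat)[0, 1])

def see(y, y_hat, n_terms):
    """Standard error of estimate, assuming n - p - 1 degrees of freedom."""
    return float(np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - n_terms - 1)))

def sep(y, y_hat):
    """Standard error of prediction, assumed here to be the test-set RMSE."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def forward_select(X, y, max_terms=3):
    """Greedy forward selection: repeatedly add the descriptor that gives the
    lowest residual sum of squares on the training set."""
    chosen = []
    for _ in range(max_terms):
        best_j, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]
            coef = fit_ols(X[:, cols], y)
            rss = float(np.sum((y - predict(coef, X[:, cols])) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
    return chosen

def pcr_fit(X, y, n_components=3):
    """Principal component regression: OLS on the leading PCA scores."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    V = Vt[:n_components].T              # loadings, shape (descriptors, components)
    coef = fit_ols((X - mean) @ V, y)
    return mean, V, coef

def pcr_predict(model, X):
    mean, V, coef = model
    return predict(coef, (X - mean) @ V)

# Synthetic stand-in data: 30 "compounds", 195 "descriptors" (NOT the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 195))
y = 1.5 * X[:, 3] - 1.0 * X[:, 50] + 0.5 * X[:, 120] + rng.normal(scale=0.2, size=30)

# Random split into 24 training and 6 test compounds.
perm = rng.permutation(30)
train, test = perm[:24], perm[24:]

# MLR on forward-selected descriptors.
selected = forward_select(X[train], y[train], max_terms=3)
mlr_coef = fit_ols(X[train][:, selected], y[train])
mlr_train = predict(mlr_coef, X[train][:, selected])
mlr_test = predict(mlr_coef, X[test][:, selected])

# PCR with the same number of terms, for comparison.
pcr_model = pcr_fit(X[train], y[train], n_components=3)
pcr_train = pcr_predict(pcr_model, X[train])
pcr_test = pcr_predict(pcr_model, X[test])

for name, tr, te in [("MLR", mlr_train, mlr_test), ("PCR", pcr_train, pcr_test)]:
    print(name,
          "train R =", round(corr_r(y[train], tr), 3),
          "SEE =", round(see(y[train], tr, 3), 3),
          "| test R =", round(corr_r(y[test], te), 3),
          "SEP =", round(sep(y[test], te), 3))
```

With 195 descriptors and only 24 training compounds, selection can easily pick up chance correlations, which is why the held-out test set statistics (R and SEP) matter more than the training fit when comparing the two approaches.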