The human uridine diphosphate (UDP)-glucuronosyltransferase (UGT) enzyme family catalyzes glucuronidation, the transfer of the glycosyl group of a nucleotide sugar to an acceptor compound (substrate). This is the most common conjugation pathway and serves to protect the organism from the potential toxicity of xenobiotics. Glucuronidation can also affect the pharmacological profile of a drug, so it is important to identify the metabolically labile sites for glucuronidation. In the present study, four types of statistical learning models were constructed to predict sites of glucuronidation based on 1,377 in vitro human UGT-catalyzed reactions, one for each of the four major site of metabolism (SOM) functional groups: aliphatic hydroxyl, aromatic hydroxyl, carboxylic acid, and amino nitrogen. Guided by the mechanism of glucuronidation, a series of "local" and "global" molecular descriptors characterizing atomic reactivity, bonding strength, or molecular geometrical shape were calculated and then selected with a genetic algorithm-based feature selection approach. The resulting support vector machine (SVM) classification models show good prediction performance, with balanced accuracy ranging from 0.88 to 0.96 on the test set. In further validation, the models correctly identified 84% of the experimentally observed SOMs in an external test set of 25 molecules.
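The abstract reports balanced accuracy without defining it. The sketch below assumes the standard definition for a binary classifier, the mean of sensitivity and specificity, which is commonly used when non-SOM atoms greatly outnumber SOM atoms. The function name and the toy labels are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed encoding (illustrative only): 1 = SOM, 0 = non-SOM.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # SOMs found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # SOMs missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-SOMs rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy labels for six candidate atoms, not data from the study.
toy_true = [1, 1, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
toy_score = balanced_accuracy(toy_true, toy_pred)
```

Unlike plain accuracy, this score does not improve when a model simply labels every atom as non-SOM, which is why it suits imbalanced site-of-metabolism data.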