WEDeepT3: predicting type Ⅲ secreted effectors based on word embedding and deep learning

Source: Quantitative Biology (English edition)
Background: Type Ⅲ secreted effectors (T3SEs) are indispensable proteins in the growth and reproduction of Gram-negative bacteria. In particular, the pathogenesis of Gram-negative bacteria depends on T3SEs: by injecting them into a host cell, the bacteria can destroy the host cell's immunity. The high diversity of T3SE sequences and the lack of defined secretion signals make T3SEs difficult to identify and predict. Moreover, the study of the pathological systems associated with T3SEs remains a hot topic in bioinformatics. Some computational tools have been developed to meet the growing demand for recognizing T3SEs and studying type Ⅲ secretion systems (T3SS). Although these tools can help biological experiments in certain procedures, there is still room for improvement, even for the current best model, because the existing methods rely on hand-designed features and traditional machine learning methods.

Methods: In this study, we propose a powerful predictor based on deep learning methods, called WEDeepT3. Our work consists mainly of three key steps. First, we train word embedding vectors for protein sequences on a large-scale amino acid sequence database. Second, we combine the word vectors with traditional features extracted from protein sequences, such as PSSM, to construct a more comprehensive feature representation. Finally, we construct a deep neural network model for the prediction of type Ⅲ secreted effectors.

Results: The feature representation of WEDeepT3 consists of both word embedding and position-specific features. Working together with convolutional neural networks, the new model achieves performance superior to the state-of-the-art methods, demonstrating the effectiveness of the new feature representation and the powerful learning ability of deep models.

Conclusion: WEDeepT3 exploits both the semantic information of k-mer fragments and the evolutionary information of protein sequences to accurately differentiate between T3SEs and non-T3SEs. WEDeepT3 is available at bcmi.sjtu.edu.cn/~yangyang/WEDeepT3.html.
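The first two steps of the Methods (k-mer "words" and their combination with position-specific features) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The k-mer length `k=3`, the embedding dimension, the random vectors standing in for trained embeddings, the all-zero placeholder PSSM, and the choice to pair each k-mer with the PSSM row of its first residue are all assumptions made for illustration, as are the function names.

```python
import random


def kmers(seq, k=3):
    """Split a protein sequence into overlapping k-mer 'words' (k is an assumed value)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_embedding_table(words, dim=4, seed=0):
    """Hypothetical stand-in for trained word embeddings: one random vector per distinct k-mer."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in sorted(set(words))}


def combine_features(seq, embeddings, pssm, k=3):
    """Concatenate each k-mer embedding with the PSSM row of the k-mer's first residue.

    The alignment rule is an assumption; the abstract does not say how the two
    feature types are paired.
    """
    words = kmers(seq, k)
    return [embeddings[w] + pssm[i] for i, w in enumerate(words)]


seq = "MKTLLIA"                       # toy sequence, not real data
words = kmers(seq)
emb = build_embedding_table(words)
pssm = [[0.0] * 20 for _ in seq]      # placeholder PSSM: one 20-column row per residue
features = combine_features(seq, emb, pssm)
```

Each row of `features` is one position of the combined representation (here 4 embedding values followed by 20 PSSM values). In a real pipeline, a matrix like this would be the input to the convolutional network described in the third step.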