Current challenges and solutions of de novo assembly

Source: Quantitative Biology (English edition)

Background: Next-generation sequencing (NGS) technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. However, numerous technical and computational challenges in de novo assembly remain, although many new ideas and solutions have been proposed to tackle them in both experimental and computational settings.

Results: In this review, we first briefly introduce some of the major challenges faced by NGS sequence assembly. We then analyze the characteristics of various sequencing platforms and their impact on assembly results. Next, we classify de novo assemblers according to their frameworks (overlap graph-based, de Bruijn graph-based and string graph-based), and describe the characteristics of each assembly tool and the scenarios to which it is suited. We then present in detail the solutions to the main challenges of de novo assembly of next-generation sequencing data, single-cell sequencing data and single-molecule sequencing (SMS) data. Finally, we discuss the application of SMS long reads in solving problems encountered in NGS assembly.

Conclusions: This review not only gives an overview of the latest methods and developments in assembly algorithms, but also provides guidelines for choosing the optimal assembly algorithm for a given type of input sequencing data.
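
As a rough illustration of the de Bruijn graph-based framework named above, the following minimal sketch splits short reads into k-mers and links each k-mer's (k-1)-base prefix to its (k-1)-base suffix. It is not taken from the review or from any particular assembler: the function name `build_de_bruijn`, the toy reads and the choice k = 3 are assumptions made purely for illustration, and real assemblers add error correction, graph simplification and repeat handling on top of this step.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a toy de Bruijn graph (hypothetical helper, for illustration only).

    Nodes are (k-1)-mers; every k-mer found in a read adds one edge from
    its prefix (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


# Two overlapping toy reads (made-up data).
reads = ["ACGTC", "CGTCA"]
graph = build_de_bruijn(reads, k=3)

for node in sorted(graph):
    print(node, "->", graph[node])
```

Repeated edges, such as the two copies of the edge from `CG` to `GT` here, record how often a k-mer was observed. In this simplified picture, a contig corresponds to a walk through the graph that follows these edges.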