Comparative genome analysis using sample-specific string detection in accurate long reads

Parsoa Khorsand, Luca Denti, Human Genome Structural Variant Consortium, Paola Bonizzoni, Rayan Chikhi, Fereydoun Hormozdiari

Bioinformatics Advances

Motivation Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in populations, case-control analysis in common diseases and diagnosing rare disorders. With the current progress of accurate long-read sequencing technologies (e.g. circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g. segmental duplications) and hard-to-detect variants (e.g. complex structural variants).

Results We propose a novel framework for comparative genome analysis through the discovery of strings that are specific to one genome (‘sample-specific’ strings). We have developed a novel, accurate and efficient computational method for the discovery of sample-specific strings between two groups of WGS samples. The proposed approach enables comparative genome analysis without mapping the reads, and it is not hindered by shortcomings of the reference genome or of mapping algorithms. We show that the proposed approach accurately finds sample-specific strings representing nearly all variation (>98%) reported across pairs or trios of WGS samples using accurate long reads (e.g. PacBio HiFi data).
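To make the notion of a sample-specific string concrete, here is a minimal Python sketch. It is a deliberately simplified illustration and not the method described in the paper: it only considers substrings of a fixed length k (k-mers), holds everything in memory, and assumes error-free reads. The function names (`reverse_complement`, `kmers`, `sample_specific_kmers`) and the toy reads are hypothetical. A k-mer is reported as specific to the target sample if it occurs in a target read but in no control read on either strand.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_specific_kmers(target_reads, control_reads, k):
    """Toy illustration: k-mers seen in target reads but absent from control reads.

    Control k-mers are collected from both strands, so a target k-mer that
    matches the control only as a reverse complement is not reported.
    """
    control = set()
    for read in control_reads:
        control |= kmers(read, k)
        control |= kmers(reverse_complement(read), k)

    specific = set()
    for read in target_reads:
        specific |= kmers(read, k) - control
    return specific


if __name__ == "__main__":
    # Hypothetical reads: the target differs from the control by one substitution.
    control_reads = ["ACGTACGT"]
    target_reads = ["ACGTTCGT"]
    print(sorted(sample_specific_kmers(target_reads, control_reads, 4)))
```

In this toy case only the k-mers overlapping the substituted base are reported, which is the intuition behind using sample-specific strings as signatures of variation without mapping reads to a reference.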

Differential stress responsiveness determines intraspecies virulence heterogeneity and host adaptation in Listeria monocytogenes

Nature Microbiology Lukas Hafner, Enzo Gadin, Lei Huang, Arthur Frouin, Fabien Laporte, Charlotte Gaultier, Afonso Vieira, Claire Maudet,...

Assessing the effect of model specification and prior sensitivity on Bayesian tests of temporal signal

PLOS COMPUTATIONAL BIOLOGY John H. Tay, Arthur Kocher, Sebastian Duchene* Abstract Our understanding of the evolution of many microbes...

Expanding the diversity of origin of transfer-containing sequences in mobilizable plasmids

Nature Microbiology Manuel Ares-Arroyo, Amandine Nucci & Eduardo P. C. Rocha Abstract Conjugative plasmids are important drivers of...
