Small but with a huge genome: the challenge of sequencing the Apennine yellow-bellied toad (part 4 of 4) 04.08.2020

As you may have noticed in posts 1-3 on Bombina pachypus sequencing, the protocols use short-read sequencing approaches to genotype the fragments produced by the library step. These approaches are useful because they provide a large number of very accurate sequences, but the short read length does not reveal the order of the fragments in the genome, nor the phase of different fragments (the OMNI-C approach provides some information in this sense, but at a wider scale). It is therefore fundamental to add a third sequencing effort that provides longer reads, to be used as a reference onto which the shorter reads can be anchored. For this purpose, we will use PacBio technology, which exploits a novel “real-time” monitoring of nucleotide incorporation to produce reads 50+ kb long.

Representation of the PacBio Single Molecule Real Time (SMRT) protocol used for highly accurate long-read sequencing (figure from Rhoads and Au 2015).

At the end of the sequencing steps, all the datasets produced by these different approaches have to be combined. This is where the bioinformatician starts to work, assembling (putting together in the correct order) billions of pieces. Luckily, many programs and assembly techniques have been developed in the last decade, and the computational power of the cluster we can use for this project will allow us to manage this outstanding amount of information properly.
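To give a feel for what "putting the pieces together in the correct order" means, here is a toy sketch of a greedy overlap merge. It is only an illustration: the function names, the made-up reads and the `min_len` threshold are assumptions for this example, and it is not the software that will be used for the Bombina pachypus genome, where real assemblers have to cope with sequencing errors, repeats and billions of reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: the remaining pieces stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up reads, given in the wrong order
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

With only three short reads the order can be recovered from the overlaps alone. In a real genome, repeated regions make many overlaps ambiguous, which is one reason why the long reads are needed as a backbone.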

References
– Rhoads A, Au KF. PacBio sequencing and its applications. Genomics, Proteomics & Bioinformatics. 2015 Oct 1;13(5):278-89.
