As you may have noticed in the previous posts (1-3) on Bombina pachypus sequencing, the protocols rely on short-read sequencing to genotype the fragments produced in the library step. These approaches are useful because they yield a large number of highly accurate sequences, but the short read length does not tell us the order of the fragments in the genome, nor the phase of different fragments (the OMNI-C approach provides some information of this kind, but at a wider scale). It is therefore essential to add a third sequencing effort that produces longer reads, which serve as a reference onto which the shorter reads can be anchored. For this purpose, we will use PacBio technology, which exploits a novel “real-time” monitoring of nucleotide incorporation to produce reads 50+ kb long.

At the end of the sequencing steps, the datasets produced by these different approaches have to be combined. This is where the bioinformatician starts to work, assembling (putting together in the correct order) billions of pieces. Luckily, many programmes and assembly techniques have been developed in the last decade, and the computational power of the cluster available for this project will allow us to handle this huge amount of information properly.
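
To give a feeling for what "putting pieces together in the correct order" means, here is a toy sketch in Python. It is not the pipeline used in this project: the function names (`overlap`, `greedy_assemble`), the example reads and the `min_len` threshold are all made up for illustration, and the greedy overlap strategy is only one simple idea. Real assemblers work on billions of reads and have to deal with sequencing errors, repeats and both DNA strands.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the longest exact overlap.

    Toy assumption: reads are error-free and all come from the same strand.
    """
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        # Compare every ordered pair of reads and remember the best overlap.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to merge: the remaining pieces stay separate
        # Join the two reads, writing the shared bases only once.
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three made-up short reads that overlap by three bases each.
print(greedy_assemble(["ACCTGA", "ATGCGT", "CGTACC"]))
```

With these three invented reads the pieces are chained into a single longer sequence. When reads are too short to span a repeated region, several orders become equally plausible, which is exactly the ambiguity that longer reads help to resolve.
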
References
– Rhoads A, Au KF. PacBio sequencing and its applications. Genomics, proteomics & bioinformatics. 2015 Oct 1;13(5):278-89.