A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.



Genotype concordance between Illumina, Complete Genomics and BGI-seq platforms

less than 1 minute read


A PCA-based overview of the genotype concordance between Illumina ultra-high-coverage (~200x), Illumina high-coverage (~30x), Illumina low-coverage (~7x), Complete Genomics high-coverage (versions 2.0 and 2.2, from ~30x to ~87x) and BGIseq-500 high-coverage (PE100, ~37x) platforms and sequencing runs. For this purpose I used 12 reference samples for which data are available from at least two of these technologies.
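
As a rough illustration of what "genotype concordance" between two platforms means, here is a minimal sketch, not the analysis from the post. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with -1 for a missing call; the function name `genotype_concordance` and the toy call sets are made up for the example.

```python
def genotype_concordance(calls_a, calls_b, missing=-1):
    """Fraction of identical genotypes among sites called in both sets."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call sets must cover the same sites in the same order")
    compared = 0
    matched = 0
    for a, b in zip(calls_a, calls_b):
        # Skip sites where either platform has no call.
        if a == missing or b == missing:
            continue
        compared += 1
        if a == b:
            matched += 1
    return matched / compared if compared else float("nan")


# Toy genotypes for one sample on two platforms (invented values).
platform_a = [0, 1, 2, 1, 0, -1, 2, 0]
platform_b = [0, 1, 2, 0, 0, 1, 2, 0]

concordance = genotype_concordance(platform_a, platform_b)
print(f"concordance: {concordance:.3f}")
```

Restricting the comparison to sites called on both platforms is one possible choice; counting missing calls as discordant would give a stricter measure.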

Differences between GATK best practices with and without using free-of-reference bias priors

3 minute read


In supplementary section 1 of the Simons Genome Diversity Project paper, Mallick et al. introduce the use of non-standard SNP calling priors (not part of the GATK best practices or default settings). The default SNP calling priors in GATK, they write, “have built-in priors for Bayesian SNP calling that assume that the site is more likely to be homozygous for the reference allele than homozygous for the variant allele. When there is ambiguity in a heterozygous, GATK prefers the reference homozygous. This is a reference bias, and while this bias is not typically problematic for medical studies*, it can complicate interpretation of population genetics signals …”
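
To make the effect of such priors concrete, here is a minimal sketch of Bayesian genotype calling at a single biallelic site. It is a toy model, not GATK's implementation: the binomial read model, the 1% error rate, the heterozygosity value `theta` and the flat comparison prior are all assumptions chosen for illustration, and the function names are hypothetical.

```python
GENOTYPES = ("hom_ref", "het", "hom_alt")


def genotype_posteriors(n_ref, n_alt, priors, error_rate=0.01):
    """Posterior over (hom_ref, het, hom_alt) given ref/alt read counts."""
    # Probability that a single read shows the alt allele under each genotype.
    p_alt = (error_rate, 0.5, 1.0 - error_rate)
    # The binomial coefficient is identical for all genotypes, so it is omitted.
    unnormalised = [
        prior * (1.0 - p) ** n_ref * p ** n_alt
        for prior, p in zip(priors, p_alt)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def call_genotype(n_ref, n_alt, priors, error_rate=0.01):
    """Return the genotype with the highest posterior probability."""
    post = genotype_posteriors(n_ref, n_alt, priors, error_rate)
    return GENOTYPES[post.index(max(post))]


theta = 0.001  # assumed heterozygosity, for illustration only
reference_biased = (1.0 - 1.5 * theta, theta, theta / 2.0)
flat = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)

# An ambiguous site: 6 reads support the reference allele, 2 the alternate.
biased_call = call_genotype(6, 2, reference_biased)
flat_call = call_genotype(6, 2, flat)
print(biased_call, flat_call)
```

With these assumed numbers the reference-biased prior resolves the ambiguous site as homozygous reference, while the flat prior calls it heterozygous, which is the kind of asymmetry the quoted passage describes.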