Boquila: NGS read simulator to eliminate read nucleotide bias in sequence analysis
Sequence content is heterogeneous throughout genomes. Consequently, when next-generation sequencing (NGS) reads are biased towards specific nucleotide profiles, their genome-wide distribution is confounded by this heterogeneous nucleotide composition. Boquila generates simulated reads that mimic the nucleotide profile of the true reads, and these can be used to correct the nucleotide-based bias in the genome-wide distribution of NGS reads. Boquila can be configured to generate reads only from specified regions of the reference genome. It can also use input DNA sequencing data to correct for bias due to copy number variations in the genome. Boquila uses standard file formats for input and output data and can be easily integrated into workflows for high-throughput sequencing applications.
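To make the idea of profile-matched simulation concrete, the following is a minimal sketch in Python. It is not Boquila's implementation, and the algorithm shown is an assumption chosen for illustration. The function names (`nucleotide_profile`, `profile_distance`, `simulate_reads`), the greedy candidate selection, and the `n_candidates` parameter are all hypothetical. The sketch assumes an uppercase reference containing only A, C, G, and T that is at least as long as every read.

```python
import random
from collections import Counter

BASES = "ACGT"

def nucleotide_profile(seqs):
    """Return the fraction of A, C, G and T pooled over all sequences."""
    counts = Counter()
    for s in seqs:
        counts.update(s.upper())
    total = sum(counts[b] for b in BASES)
    return {b: counts[b] / total for b in BASES}

def profile_distance(p, q):
    """L1 distance between two nucleotide profiles."""
    return sum(abs(p[b] - q[b]) for b in BASES)

def simulate_reads(reads, reference, n_candidates=20, seed=0):
    """Illustrative greedy sampler (hypothetical, not Boquila's method).

    For each true read, draw `n_candidates` random windows of the same
    length from the reference and keep the one that brings the pooled
    profile of the simulated reads closest to the profile of the true reads.
    """
    rng = random.Random(seed)
    target = nucleotide_profile(reads)
    running = Counter()  # pooled base counts of the reads chosen so far
    simulated = []
    for read in reads:
        k = len(read)
        best, best_d = None, None
        for _ in range(n_candidates):
            start = rng.randrange(len(reference) - k + 1)
            cand = reference[start:start + k]
            trial = running + Counter(cand)
            total = sum(trial[b] for b in BASES)
            prof = {b: trial[b] / total for b in BASES}
            d = profile_distance(prof, target)
            if best_d is None or d < best_d:
                best, best_d = cand, d
        simulated.append(best)
        running.update(best)
    return simulated
```

Calling `simulate_reads(reads, reference)` returns one simulated read per true read, each drawn from the reference and with the same length as its counterpart. Restricting simulation to specified regions could be approximated in this sketch by passing only the sequence of those regions as `reference`.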