K Mer Counter Calculator
Count k-mers and measure sequence complexity with our free science calculator. Uses standard bioinformatics formulas, with an explanation of each term.
Formula
Total k-mers = L - k + 1; Complexity = Unique k-mers / 4^k
Where L is the sequence length, k is the k-mer size, and 4^k represents all possible DNA k-mers of length k. Complexity measures the fraction of possible k-mers that actually appear in the sequence.
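The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `kmer_stats` is hypothetical, and it assumes the input contains only A, C, G, and T (ambiguous bases such as N are not handled specially).

```python
def kmer_stats(sequence, k):
    """Return (total, unique, complexity) for the k-mers of a DNA sequence."""
    seq = sequence.upper()
    if k < 1 or k > len(seq):
        raise ValueError("k must be between 1 and the sequence length")
    total = len(seq) - k + 1                        # Total k-mers = L - k + 1
    unique = len({seq[i:i + k] for i in range(total)})
    complexity = unique / 4 ** k                    # Unique k-mers / 4^k
    return total, unique, complexity


total, unique, complexity = kmer_stats("ATGCGATATGC", 3)
print(total, unique, f"{complexity:.2%}")  # 9 7 10.94%
```

Here L = 11 and k = 3, so there are 11 - 3 + 1 = 9 k-mers in total. Seven of them are distinct (ATG and TGC each occur twice), giving a complexity of 7/64.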
Frequently Asked Questions
How do I choose the right k-mer size for my analysis?
The optimal k-mer size depends on your application and organism complexity. Smaller k values (k=15-21) are useful for error correction and work well with low-coverage data, but may produce many false overlaps in repetitive genomes. Larger k values (k=31-127) improve specificity and resolve repeats better but require higher coverage and more memory. For bacterial genomes, k=21-31 often works well. For human genome assembly, k=51-101 is common. Many modern tools like SPAdes run several k-mer sizes in succession and combine the results to balance sensitivity and specificity.
What does k-mer complexity or linguistic complexity mean?
K-mer complexity, as used here, is the ratio of observed unique k-mers to the total number of possible k-mers (4^k for DNA). It is a simplified relative of the measure usually called linguistic complexity. A complexity of 100% means every possible k-mer of that size appears at least once, which requires a sequence of at least 4^k + k - 1 bases, because a sequence of length L contains only L - k + 1 k-mers. Low complexity indicates a repetitive or biased sequence. For instance, the sequence AAAAAAA has only one 3-mer (AAA), giving a complexity of 1/64 = 1.56%. This metric helps identify low-complexity regions that may confound analyses. Masking tools such as DUST and RepeatMasker target the same kind of region, although they use their own scoring methods rather than this exact ratio.
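One way to use this ratio for spotting low-complexity regions is to compute it over sliding windows. The sketch below is an illustration only: `window_complexity` and its `window` and `step` parameters are hypothetical names and defaults, and this is not how DUST or RepeatMasker score sequences.

```python
def window_complexity(sequence, k=3, window=20, step=10):
    """Yield (start, complexity) per window, using Unique k-mers / 4^k."""
    seq = sequence.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # Each window holds window - k + 1 overlapping k-mers
        unique = len({chunk[i:i + k] for i in range(window - k + 1)})
        yield start, unique / 4 ** k


# A mixed 20-base stretch followed by a 20-base poly-A run (toy data)
seq = "ATGCGTACGTTAGCCATGCA" + "A" * 20
results = list(window_complexity(seq))
for start, score in results:
    print(start, f"{score:.2%}")
```

Because a window of length w contains at most w - k + 1 k-mers, scores are only comparable between windows of the same size. With w = 20 and k = 3 the ceiling is 18/64, not 100%.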
How is k-mer counting used in genome size estimation?
K-mer frequency histograms from whole-genome sequencing data can estimate genome size without assembly. The principle is: Genome Size = Total k-mers / Peak k-mer coverage. You plot a histogram of k-mer frequencies, identify the main peak (representing single-copy regions), and divide the total number of k-mers by that peak depth. For example, if you have 3 billion total 21-mers and the coverage peak is at 30x, the estimated genome size is ~100 Mb. Tools like GenomeScope and KmerGenie automate this process and can also estimate heterozygosity and repeat content from the histogram shape.
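The division described above can be sketched as follows. This is a simplified illustration, not how GenomeScope or KmerGenie work internally. The function name `estimate_genome_size` and the `min_depth` cutoff are assumptions: the cutoff is there so that the low-depth spike typically caused by sequencing errors is not mistaken for the main peak.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer histogram.

    histogram maps depth -> number of distinct k-mers observed at that depth.
    Returns (estimated_size, peak_depth).
    """
    # Look for the main peak only at or above min_depth (assumed cutoff)
    candidates = {d: n for d, n in histogram.items() if d >= min_depth}
    if not candidates:
        raise ValueError("no histogram entries at or above min_depth")
    peak_depth = max(candidates, key=candidates.get)
    # Total k-mers = sum over depths of (depth * distinct k-mers at that depth)
    total_kmers = sum(d * n for d, n in histogram.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram: an error spike at depth 1 and a main peak at depth 30
toy = {1: 500, 29: 200, 30: 1000, 31: 200}
size, peak = estimate_genome_size(toy)
print(f"peak depth {peak}, estimated size {size:.0f} bp")

# The worked example from the text: 3 billion 21-mers, peak at 30x
print(3_000_000_000 / 30 / 1e6, "Mb")
```

This sketch counts every k-mer in the total, as the formula states. Some workflows drop the low-depth error tail first, which lowers the estimate slightly.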
Is my data stored or sent to a server?
No. All calculations run entirely in your browser using JavaScript. No data you enter is ever transmitted to any server or stored anywhere. Your inputs remain completely private.
What formula does K Mer Counter Calculator use?
The calculator uses the two formulas given in the Formula section on this page: Total k-mers = L - k + 1, and Complexity = Unique k-mers / 4^k. Both are standard definitions in sequence analysis. If you need a specific reference or citation, the References section provides links to authoritative sources.
How do I interpret the result?
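Results are displayed with a label to help you understand the output. Total k-mers is a count of the overlapping windows of length k in your sequence. Complexity is a fraction between 0 and 1, often shown as a percentage: values near 0 indicate a repetitive or biased sequence, while higher values mean more of the possible k-mers are present. Refer to the worked examples section on this page for real-world context.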