Unique Word Counter

Count unique words and calculate vocabulary richness (type-token ratio) in any text. Analyze word frequency, hapax legomena, and writing complexity.

Last updated: December 2025


Formula

TTR = Unique Words (Types) / Total Words (Tokens)

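The Type-Token Ratio (TTR) divides the count of distinct words (types) by the total word count (tokens). A ratio closer to 1.0 indicates higher vocabulary diversity. The hapax legomena count is the number of words that appear exactly once. Words are matched case-insensitively, so 'The' and 'the' count as one type. Because longer texts inevitably repeat common words, TTR tends to fall as text length grows, so it is most meaningful when comparing texts of similar length.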
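
The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function name `analyze` and the tokenization rule (lowercase runs of letters, with an optional internal apostrophe) are assumptions made for this example.

```python
import re
from collections import Counter

def analyze(text: str) -> dict:
    """Return total words, unique words, TTR, and hapax count for a text."""
    # Assumed tokenization: lowercase, letters only, contractions kept whole.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(tokens)
    total = len(tokens)                                  # tokens
    unique = len(counts)                                 # types
    hapax = sum(1 for n in counts.values() if n == 1)    # words seen exactly once
    ttr = unique / total if total else 0.0               # avoid division by zero
    return {"total": total, "unique": unique, "ttr": ttr, "hapax": hapax}

result = analyze("The quick brown fox jumps over the lazy dog. The dog barked at the fox.")
print(f"TTR = {result['unique']}/{result['total']} = {result['ttr']:.3f}, hapax = {result['hapax']}")
```

A different tokenization rule (for example, treating hyphenated words or numbers as tokens) would change the counts, so results from different tools are only comparable when they split words the same way.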

Last reviewed: December 2025

Worked Examples

Example 1: Analyzing a Short Paragraph

Analyze: 'The quick brown fox jumps over the lazy dog. The dog barked at the fox.'
Solution:
Total words: 15
Unique words: 10 (the, quick, brown, fox, jumps, over, lazy, dog, barked, at)
Repeated: 'the' (4x), 'fox' (2x), 'dog' (2x)
Type-Token Ratio = 10/15 = 0.667
Hapax Legomena = 7 (quick, brown, jumps, over, lazy, barked, at)
Vocabulary richness: High
Result: 10 unique words out of 15 total, TTR = 0.667 (High richness)
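
The per-word breakdown in this example can be reproduced with a frequency table. The sketch below is illustrative only: the helper name `word_frequencies` and its tokenization rule are assumptions, not part of the calculator.

```python
import re
from collections import Counter

def word_frequencies(text: str) -> Counter:
    """Count case-insensitive word occurrences (assumed tokenization)."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))

freq = word_frequencies("The quick brown fox jumps over the lazy dog. The dog barked at the fox.")

# Words that occur more than once, and words that occur exactly once.
repeated = {word: n for word, n in freq.items() if n > 1}
hapax = sorted(word for word, n in freq.items() if n == 1)

print("Repeated:", repeated)
print("Hapax legomena:", hapax)
```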

Example 2: Comparing Two Writing Samples

Sample A: 'I like cats. I like dogs. I like birds.' vs Sample B: 'Felines prowl gracefully while canines frolic and songbirds chirp melodiously.'
Solution:
Sample A: 9 total, 5 unique (I, like, cats, dogs, birds), TTR = 5/9 = 0.556
Sample B: 10 total, 10 unique, TTR = 10/10 = 1.000
Sample A repeats 'I' and 'like' three times each, which pulls its TTR down. Sample B has zero repetition, so it shows greater vocabulary diversity as well as more sophisticated word choices.
Result: Sample B has a perfect TTR of 1.000 vs Sample A at 0.556
Expert Insights

Background & Theory

The Unique Word Counter applies the following established principles and formulas. Language and writing calculators quantify the clarity, complexity, and accessibility of text through formulas derived from empirical studies of reading comprehension.

The Flesch-Kincaid Grade Level formula, one of the most widely adopted readability metrics, is calculated as 0.39 multiplied by average sentence length in words, plus 11.8 multiplied by average syllables per word, minus 15.59. The result approximates the US school grade level required to understand the text comfortably. A score of 8 indicates eighth-grade readability; most major newspapers target a score between 7 and 9 for broad audience accessibility. The related Flesch Reading Ease score inverts the scale: higher scores indicate easier reading, with 60-70 corresponding to plain English, while scores below 30 characterise academic and professional texts. The Gunning Fog Index offers an alternative by counting the percentage of words with three or more syllables (complex words) and weighting them more heavily, using the formula 0.4 multiplied by the sum of average sentence length and the percentage of polysyllabic words.

Reading time estimation assumes an average adult silent reading speed of 200-250 words per minute, though skilled readers reach 300 wpm and speed reading techniques claim 500 or more. Practical calculators often use 238 wpm as a typical figure, dividing total word count by it to produce minutes of reading time.

Zipf's Law describes an empirical regularity of natural language: the frequency of a word is roughly inversely proportional to its rank in the frequency table. The most common word in English (the) appears roughly twice as often as the second most common word, three times as often as the third, and so on. This power-law distribution informs corpus analysis, text generation models, and translation cost estimation.

Professional translation is priced per source word with rates varying by language pair, subject matter, and turnaround time, typically ranging from $0.07 to $0.25 per word. Plagiarism detection tools compute similarity percentages by identifying matching text sequences against indexed sources.
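
The readability, reading-time, and Zipf formulas described in this section can be written directly as functions. The sketch below assumes the word, sentence, syllable, and complex-word counts have already been obtained by some other means (syllable counting in particular is not shown), and all function names are hypothetical.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """0.4 * (words per sentence + percentage of words with 3+ syllables)."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def reading_time_minutes(words: int, wpm: float = 238) -> float:
    """Total word count divided by an assumed reading speed."""
    return words / wpm

def zipf_expected_count(top_count: float, rank: int) -> float:
    """Idealised Zipf estimate: the rank-r word occurs top_count / r times."""
    return top_count / rank

# Hypothetical sample: 100 words, 5 sentences, 150 syllables, 10 complex words.
print(round(flesch_kincaid_grade(100, 5, 150), 2))
print(round(gunning_fog(100, 5, 10), 2))
print(reading_time_minutes(1190))
print(zipf_expected_count(6000, 3))
```

Taking the counts as inputs keeps the formulas separate from tokenization and syllable estimation, which are the parts where real tools differ most.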

History

The history behind the Unique Word Counter traces back through the following developments. Writing systems emerged independently in multiple civilisations. The Phoenician alphabet, developed around 1050 BCE on the eastern Mediterranean coast, is the ancestor of Greek, Latin, Arabic, and Hebrew scripts, and through them virtually all modern alphabetic writing systems. Its innovation was the reduction of writing to a small set of consonantal symbols representing sounds rather than words or syllables, dramatically lowering the literacy acquisition barrier.

Johannes Gutenberg's development of movable type printing around 1440 in Mainz made text reproduction economically practical in Europe for the first time, reducing the cost of books by an estimated 80% over the following century. The resulting explosion in text production created a demand for standardised spelling and grammar that had not previously existed, since manuscript copyists had freely varied orthography.

Dictionary standardisation arrived in the 18th century. Samuel Johnson's Dictionary of the English Language (1755) was an influential, comprehensive attempt to record and stabilise English vocabulary. Noah Webster's An American Dictionary of the English Language (1828) extended this project to American English while deliberately introducing spelling differences that distinguished American from British usage.

Ludwig Lazarus Zamenhof published the first grammar of Esperanto in 1887 under the pseudonym Doktoro Esperanto, attempting to create a politically neutral international auxiliary language. Esperanto remains the most widely spoken constructed language with an estimated one to two million speakers. The University of Chicago Press published the first edition of the Chicago Manual of Style in 1906, providing editorial and citation standards that became authoritative across American academic and publishing industries.

Corpus linguistics developed through the mid-20th century as researchers compiled large text databases to study language statistically rather than through idealised introspection. Computational spell-checkers became commercially available in the late 1970s. Grammar checkers followed in the 1980s. The transformer architecture introduced in the 2017 paper Attention Is All You Need enabled large language models that by 2022 could generate fluent text, check grammar, estimate readability, and assist with writing at a level that fundamentally altered assumptions about writing assistance tools.

Frequently Asked Questions

You may use the results for reference and educational purposes. For professional reports, academic papers, or critical decisions, we recommend verifying outputs against peer-reviewed sources or consulting a qualified expert in the relevant field.
Your input is not uploaded. All calculations run entirely in your browser using JavaScript; no data you enter is transmitted to any server or stored anywhere, so your inputs remain private.
Educational Note: This calculator is provided for educational and informational purposes. Results are based on the formulas and inputs provided. Always verify important calculations independently. NovaCalculator processes calculator inputs client-side; optional analytics follow visitor consent settings. © 2024–2026 NovaCalculator.

What is a unique word count and why does it matter?

A unique word count measures the number of distinct words in a text, regardless of how many times each word appears. For example, the sentence 'the cat sat on the mat' has 6 total words but only 5 unique words because 'the' appears twice. This metric is essential for writers, linguists, and content creators because it reveals vocabulary diversity and writing complexity. A higher unique word count relative to total words indicates richer vocabulary usage. Academic papers typically have higher vocabulary richness than casual blog posts. Content marketers use this metric to ensure their copy is varied and engaging rather than repetitive and monotonous.

How is speech time calculated from word count?

Divide word count by your speaking rate. Average conversational speech: 130–150 wpm. Presentations and public speaking: 120–150 wpm. Fast speaking: 160–180 wpm. A 10-minute speech at 130 wpm needs about 1,300 words; at 150 wpm, about 1,500 words. Practice delivery at your natural pace and measure actual time to calibrate.
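
The same arithmetic can be expressed as two small helpers. This is an illustrative sketch; the function names and the default rate of 130 wpm are assumptions chosen to match the figures above.

```python
def speech_minutes(word_count: int, wpm: float = 130) -> float:
    """Estimated speaking time: words divided by words per minute."""
    return word_count / wpm

def words_needed(minutes: float, wpm: float = 130) -> int:
    """Approximate script length for a target duration."""
    return round(minutes * wpm)

print(words_needed(10, wpm=130))       # script length for 10 minutes at 130 wpm
print(words_needed(10, wpm=150))       # script length for 10 minutes at 150 wpm
print(speech_minutes(1300, wpm=130))   # minutes to deliver 1,300 words at 130 wpm
```

Replacing the default `wpm` with a rate measured from your own practice run gives a more reliable estimate than any published average.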

What are standard word count requirements for academic writing?

Academic word count conventions vary by institution and level: undergraduate essays typically run 1,500–3,000 words, final-year dissertations 8,000–12,000 words, and master's theses 15,000–25,000 words. Many UK universities cap a PhD thesis at around 80,000 words (excluding references); US doctoral dissertations commonly run 60,000–100,000 words. Abstracts are typically 150–300 words, and conference papers 5,000–8,000 words. When a word limit is given, a commonly applied tolerance is ±10%, but policies differ, so check your institution's rules before relying on it.

How accurate are the results from Unique Word Counter?

All calculations use established mathematical formulas and are performed with high-precision arithmetic. Results are accurate to the precision shown. For critical decisions in finance, medicine, or engineering, always verify results with a qualified professional.

How do I interpret the result?

Results are displayed with a label and unit to help you understand the output. Many calculators include a short explanation or classification below the result (for example, a BMI category or risk level). Refer to the worked examples section on this page for real-world context.

Can I use Unique Word Counter on a mobile device?

Yes. All calculators on NovaCalculator are fully responsive and work on smartphones, tablets, and desktops. The layout adapts automatically to your screen size.

Reviewed by Daniel Agrici, Founder & Lead Developer