AI Watermark Detector Probability Calculator

Author: Daniel Agrici

Estimate the probability that a watermark in AI-generated text is detected, given text length, watermark strength, green list fraction, and temperature.

Reviewed by Daniel Agrici, Founder & Lead Developer

Formula

P(detect) = Phi(z), where z = (n*p_w - n*gamma) / sqrt(n*gamma*(1-gamma))

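Detection probability is computed with a z-test, where Phi is the standard normal cumulative distribution function and n is the number of tokens. The boosted green-list probability is p_w = gamma + (1-gamma)(1-e^(-delta/T)), where gamma is the green list fraction, delta is the watermark strength, and T is the sampling temperature. The z-score measures how many standard deviations the expected green token count of watermarked text (n*p_w) lies above the baseline expectation for unwatermarked text (n*gamma).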
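The formula can be sketched in a few lines of Python. This is a minimal illustration of the equations stated on this page, not the calculator's actual source code; the function names are hypothetical, and Phi is evaluated through the standard library's error function.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Phi(z): the standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def boosted_green_probability(gamma: float, delta: float, temperature: float = 1.0) -> float:
    """p_w = gamma + (1 - gamma) * (1 - e^(-delta / T)), as stated on this page."""
    return gamma + (1.0 - gamma) * (1.0 - math.exp(-delta / temperature))


def z_score(n: int, gamma: float, delta: float, temperature: float = 1.0) -> float:
    """z = (n*p_w - n*gamma) / sqrt(n * gamma * (1 - gamma))."""
    p_w = boosted_green_probability(gamma, delta, temperature)
    return (n * p_w - n * gamma) / math.sqrt(n * gamma * (1.0 - gamma))


def detection_probability(n: int, gamma: float, delta: float, temperature: float = 1.0) -> float:
    """P(detect) = Phi(z), following the page's formula."""
    return standard_normal_cdf(z_score(n, gamma, delta, temperature))


if __name__ == "__main__":
    # Inputs taken from the first worked example: 100 tokens, gamma=0.5, delta=2.0, T=1.0.
    z = z_score(100, gamma=0.5, delta=2.0, temperature=1.0)
    p = detection_probability(100, gamma=0.5, delta=2.0, temperature=1.0)
    print(f"z = {z:.2f}, P(detect) = {p:.4f}")
```

Note that under this formula a text with no watermark (delta = 0) yields z = 0 and therefore Phi(0) = 0.5, so the output is best read as a monotone score of the z-value rather than a calibrated false-positive rate.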

Worked Examples

Example 1: Short Email Detection

Problem: An AI-generated email contains 100 tokens with watermark strength delta=2.0, gamma=0.5, and temperature=1.0. What is the detection probability?

Solution:
Green list fraction (gamma): 0.5
Boosted probability: 0.5 + 0.5 x (1 - e^(-2.0)) = 0.5 + 0.5 x 0.8647 = 0.9323
Expected green tokens (watermarked): 100 x 0.9323 = 93.23
Baseline green tokens: 100 x 0.5 = 50
Z-score: (93.23 - 50) / sqrt(100 x 0.5 x 0.5) = 43.23 / 5 = 8.65
Detection probability: ~100%

Result: Detection: ~100% | Z-score: 8.65 | Very high confidence even for short text

Example 2: Weak Watermark on Long Essay

Problem: A 2000-token essay has a weak watermark (delta=0.5, gamma=0.5, temperature=1.0). Can it still be detected?

Solution:
Boosted probability: 0.5 + 0.5 x (1 - e^(-0.5)) = 0.5 + 0.5 x 0.3935 = 0.6967
Expected green tokens (watermarked): 2000 x 0.6967 = 1393.5
Baseline green tokens: 2000 x 0.5 = 1000
Z-score: (1393.5 - 1000) / sqrt(2000 x 0.25) = 393.5 / 22.36 = 17.60
Detection probability: ~100%

Result: Detection: ~100% | Z-score: 17.60 | Long text compensates for weak watermark
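The trade-off between text length and watermark strength can be made explicit by inverting the formula. Since z = sqrt(n) x (p_w - gamma) / sqrt(gamma x (1 - gamma)), the smallest n that reaches a chosen z-score follows directly. The sketch below assumes a target of z = 4, which is an illustrative threshold chosen here and not a value taken from this page; the function name is likewise hypothetical.

```python
import math


def min_tokens_for_z(target_z: float, gamma: float, delta: float, temperature: float = 1.0) -> int:
    """Smallest token count n whose expected z-score reaches target_z.

    Derived by solving z = sqrt(n) * (p_w - gamma) / sqrt(gamma * (1 - gamma)) for n.
    """
    p_w = gamma + (1.0 - gamma) * (1.0 - math.exp(-delta / temperature))
    if p_w <= gamma:
        raise ValueError("delta must be positive for the watermark to add any signal")
    root_n = target_z * math.sqrt(gamma * (1.0 - gamma)) / (p_w - gamma)
    return math.ceil(root_n ** 2)


if __name__ == "__main__":
    # target_z=4.0 is an assumed threshold for illustration only.
    for delta in (0.5, 1.0, 2.0):
        n = min_tokens_for_z(4.0, gamma=0.5, delta=delta)
        print(f"delta={delta}: at least {n} tokens")
```

Because n scales with the inverse square of (p_w - gamma), halving the boost roughly quadruples the number of tokens needed for the same z-score.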

Frequently Asked Questions

What factors affect AI watermark detection probability?

Several factors determine how reliably an AI watermark can be detected. Text length matters most: longer texts provide more tokens for statistical analysis, and the z-score grows with the square root of the token count. The watermark strength parameter (delta) controls how aggressively green list tokens are boosted; higher values make detection easier but can degrade text quality. The green list fraction (gamma) determines what portion of the vocabulary receives the bias, with 0.5 being typical. Temperature during generation also matters. In the formula above, delta is divided by T, so a lower temperature raises the boosted probability; in practice, however, low temperatures also concentrate probability mass on fewer tokens, which can leave the watermark less room to shift the choice of token. Finally, any post-generation editing, paraphrasing, or translation weakens the watermark signal, roughly in proportion to how many tokens are modified.
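The effect of editing can be illustrated with a simple dilution model. The sketch below assumes that a fraction of the tokens is replaced by unwatermarked tokens, each landing on the green list with probability gamma, while the total token count stays the same. This model and the function name are assumptions made for illustration; they are not part of the calculator described on this page.

```python
import math


def z_after_editing(n: int, gamma: float, delta: float,
                    edited_fraction: float, temperature: float = 1.0) -> float:
    """Expected z-score when a fraction of tokens has been replaced.

    Assumption: replaced tokens are green with probability gamma (no watermark),
    untouched tokens are green with the boosted probability p_w.
    """
    if not 0.0 <= edited_fraction <= 1.0:
        raise ValueError("edited_fraction must lie between 0 and 1")
    p_w = gamma + (1.0 - gamma) * (1.0 - math.exp(-delta / temperature))
    p_mixed = (1.0 - edited_fraction) * p_w + edited_fraction * gamma
    return n * (p_mixed - gamma) / math.sqrt(n * gamma * (1.0 - gamma))


if __name__ == "__main__":
    # Same inputs as the first worked example, with growing amounts of editing.
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        z = z_after_editing(100, gamma=0.5, delta=2.0, edited_fraction=fraction)
        print(f"{fraction:.0%} edited: z = {z:.2f}")
```

Under this assumption the z-score shrinks linearly with the edited fraction and reaches zero once every token has been replaced.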
