
A/B Test Significance Calculator

Enter the visitors and conversions for your control and variant to see conversion rates, relative uplift, and whether the result is statistically significant. The calculator uses a two-proportion z-test and asks for no email.

What is A/B test significance?

Statistical significance tells you whether the difference between your control and variant is likely to be real rather than random noise. An A/B test can show a higher conversion rate for the variant purely by chance; significance testing estimates how unlikely the observed lift would be if the two versions actually performed the same. The common bar is 95% confidence, meaning a difference this large would appear by chance less than 5% of the time if there were no real effect.

How this calculator works

The method (two-proportion z-test):

n1, n2 = control visitors, variant visitors
c1, c2 = control conversions, variant conversions
p1 = c1 ÷ n1
p2 = c2 ÷ n2
pooled p = (c1 + c2) ÷ (n1 + n2)
standard error = √( pooled p × (1 − pooled p) × (1/n1 + 1/n2) )
z = (p2 − p1) ÷ standard error
confidence = 1 − two-tailed p-value of z under the standard normal distribution

Relative uplift is (variant rate − control rate) ÷ control rate. A result is called significant when confidence reaches 95% or more.
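The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `ab_significance` and the returned dictionary keys are assumptions made for the example, and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def ab_significance(n1, c1, n2, c2):
    """Two-proportion z-test (hypothetical helper, not the page's own code).

    n1, c1 = control visitors and conversions
    n2, c2 = variant visitors and conversions
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("visitor counts must be positive")

    p1 = c1 / n1                      # control conversion rate
    p2 = c2 / n2                      # variant conversion rate
    pooled = (c1 + c2) / (n1 + n2)    # pooled conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")

    z = (p2 - p1) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Relative uplift is undefined when the control rate is zero.
    uplift = (p2 - p1) / p1 if p1 > 0 else float("nan")

    return {
        "control_rate": p1,
        "variant_rate": p2,
        "relative_uplift": uplift,
        "z": z,
        "confidence": 1 - p_value,
    }


result = ab_significance(n1=10_000, c1=500, n2=10_000, c2=570)
print(f"uplift {result['relative_uplift']:.1%}, confidence {result['confidence']:.1%}")
```

With these illustrative inputs (5.0% versus 5.7%), the relative uplift is 14% and the confidence comes out at about 97%, so the result clears the 95% bar.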

How to read the result

Confidence | Read
Under 90% | Not significant; keep the test running
90 to 95% | Promising, not conclusive
95% or higher | Statistically significant at the standard bar
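The same reading bands can be written as a small lookup. This is a sketch: the function name `read_confidence` is hypothetical, and treating exactly 90% as "promising" and exactly 95% as "significant" is an assumption about how the boundaries are handled.

```python
def read_confidence(confidence):
    """Map a confidence level between 0 and 1 to the reading bands above."""
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.95:
        return "Statistically significant at the standard bar"
    if confidence >= 0.90:  # assumption: exactly 90% counts as "promising"
        return "Promising, not conclusive"
    return "Not significant; keep the test running"


print(read_confidence(0.93))
```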

Common A/B testing mistakes

  • Stopping early. Confidence fluctuates; peeking and stopping at the first 95% inflates false positives.
  • Too-small samples. Low-conversion B2B pages often need thousands of visitors per variant.
  • Ignoring the uplift. A significant but tiny lift may not be worth shipping. Read confidence with the effect size.
  • Short windows. Run at least one to two full business cycles to avoid day-of-week bias.
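To see why stopping early is risky, the sketch below simulates A/A tests, where control and variant share the same true conversion rate, so every "significant" result is a false positive. It compares checking once at the end with peeking after every batch of visitors. All parameters (300 runs, 10 looks, 200 visitors per look, a 10% rate, the seed) are illustrative assumptions, not recommendations.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-tailed 95% threshold, about 1.96


def is_significant(n1, c1, n2, c2):
    """Two-proportion z-test at the 95% bar."""
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return False
    return abs((c2 / n2 - c1 / n1) / se) >= Z_CRIT


def simulate_peeking(runs=300, looks=10, visitors_per_look=200, rate=0.10, seed=1):
    """Return (false-positive rate with peeking, false-positive rate at the end)."""
    rng = random.Random(seed)
    crossed_any_look = 0
    significant_at_end = 0
    for _run in range(runs):
        n = c1 = c2 = 0
        crossed = False
        hit = False
        for _look in range(looks):
            n += visitors_per_look
            # Both arms convert at the same true rate: there is no real lift.
            c1 += sum(rng.random() < rate for _ in range(visitors_per_look))
            c2 += sum(rng.random() < rate for _ in range(visitors_per_look))
            hit = is_significant(n, c1, n, c2)
            crossed = crossed or hit
        crossed_any_look += crossed   # would have stopped at the first 95%
        significant_at_end += hit     # verdict from the final look only
    return crossed_any_look / runs, significant_at_end / runs


peeking_rate, final_rate = simulate_peeking()
print(f"false positives with peeking: {peeking_rate:.1%}")
print(f"false positives at the planned end: {final_rate:.1%}")
```

Under these assumptions the single final check should stay near the nominal 5%, while stopping at the first look that crosses 95% typically produces a noticeably higher false-positive rate.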

Frequently asked questions

How do you calculate A/B test significance?

With a two-proportion z-test: compute each variant's rate, the pooled standard error, and a z-score for the difference, then convert to a confidence level via the normal distribution. 95% confidence (p ≤ 0.05) is the common significance threshold.

What does statistically significant mean?

The difference is unlikely to be random. At 95% confidence, a difference this large would occur by luck less than 5% of the time if the variant truly had no effect. It does not mean the effect is large, so read it alongside the uplift.

How big a sample size do I need?

It depends on baseline rate and the uplift you want to detect; smaller effects need much larger samples. Low-conversion B2B pages often need thousands per variant, run over full business cycles.
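As a rough guide, the required sample can be estimated with the standard normal-approximation formula for comparing two proportions. The sketch below is an illustration, not part of the calculator: the function name `sample_size_per_variant` is hypothetical, and the 80% statistical power default is an assumption (a common convention that this page does not specify).

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_uplift, confidence=0.95, power=0.80):
    """Approximate visitors needed per variant for a two-tailed test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be between 0 and 1 and must differ")

    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    z_beta = NormalDist().inv_cdf(power)                      # about 0.84 at 80% power
    p_bar = (p1 + p2) / 2

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


print(sample_size_per_variant(baseline_rate=0.02, relative_uplift=0.20))
```

For a 2% baseline and a 20% relative uplift, this estimate comes to roughly 21,000 visitors per variant, which shows why low-conversion pages need large samples. Halving the uplift you want to detect roughly quadruples the requirement.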

Why should I not stop a test early?

Confidence fluctuates, so a test can cross 95% by chance then fall back. Decide sample size and duration in advance and let it run to avoid false positives.