A/B testing is the gold standard for making data-driven decisions in product development and marketing. However, a common pitfall in the business world is “peeking”—checking results too early and stopping a test as soon as it looks significant. This leads to false positives and decisions based on noise rather than signal.
To address this, I developed a Tableau Public dashboard that not only tracks the performance of control vs. treatment groups but also rigorously calculates statistical significance and minimum sample size natively within Tableau.
Unlike many Tableau solutions that rely on external R or Python scripts, this dashboard performs all statistical calculations using native Tableau Calculated Fields. This ensures the dashboard is fast, portable, and easy to maintain.
Before analyzing the results, it is crucial to determine how much data is needed. If you stop a test before reaching this threshold, your results may not be reliable. I implemented the sample size formula for two independent proportions based on the methodology from Select Statistical Consultants.
The formula requires:
- The baseline conversion rate of the control group
- The minimum detectable effect (MDE), the smallest lift worth detecting
- The desired confidence level (significance level)
- The desired statistical power
The dashboard dynamically calculates the required n (sample size per group) to ensure the test has enough power to detect the MDE.
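To make the calculation concrete, here is a minimal sketch of one common form of the two-proportion sample size formula written as a native calculated field (call it [Required Sample Size]). The parameter names [Baseline Rate], [MDE], [Z Alpha], and [Z Beta] are illustrative rather than the exact names in the dashboard; [Z Alpha] and [Z Beta] hold the standard normal values for the chosen confidence and power (for example, 1.96 and 0.84 for 95% confidence and 80% power).

```
// Required Sample Size (per group) for two independent proportions:
// n = (p1*(1-p1) + p2*(1-p2)) * ((z_alpha + z_beta) / (p2 - p1))^2, where p2 = p1 + MDE
CEILING(
    (
        [Baseline Rate] * (1 - [Baseline Rate])
        + ([Baseline Rate] + [MDE]) * (1 - ([Baseline Rate] + [MDE]))
    )
    * POWER(([Z Alpha] + [Z Beta]) / [MDE], 2)
)
```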
To determine whether the difference between the Control (A) and Treatment (B) groups is real, I employed a Z-test for two independent proportions.
Following the statistical framework from Penn State University, the dashboard calculates:
- The conversion rate of each group and the pooled conversion rate across both groups
- The standard error of the difference between the two proportions
- The resulting Z-score for the observed difference
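As a sketch, those pieces map onto calculated fields along these lines (the measure names [Conversions A], [Visitors A], [Conversions B], and [Visitors B] are placeholders for however the groups are split in the actual data source):

```
// Pooled Rate: overall conversion rate with both groups combined
(SUM([Conversions A]) + SUM([Conversions B]))
/ (SUM([Visitors A]) + SUM([Visitors B]))
```

```
// Z-Score: difference in conversion rates divided by the pooled standard error
(
    SUM([Conversions B]) / SUM([Visitors B])
    - SUM([Conversions A]) / SUM([Visitors A])
)
/ SQRT(
    [Pooled Rate] * (1 - [Pooled Rate])
    * (1 / SUM([Visitors A]) + 1 / SUM([Visitors B]))
)
```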
If the absolute Z-score exceeds the critical value (e.g., 1.96 for 95% confidence), the dashboard flags the result as statistically significant.
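The flag itself can then be a simple calculated field; 1.96 is the two-sided critical value at 95% confidence, and [Z Score] refers to the field sketched above:

```
// Flag the result once the absolute Z-score clears the critical value
IF ABS([Z Score]) >= 1.96 THEN "Statistically Significant"
ELSE "Not Significant"
END
```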
While resources like Playfair Data’s Guide offer excellent introductions to Z-tests in Tableau, my approach adds critical safeguards for business users:
- A minimum sample size gate that warns against calling a result before the test has enough data, directly addressing the peeking problem (sketched below)
- An explicit significance flag tied to the chosen confidence level, rather than a visual comparison of the two groups
- Fully native calculated fields, so no external R or Python integration is required
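For instance, the anti-peeking safeguard can be sketched as a single status field that refuses to declare a winner until both groups have reached the required sample size (again using the illustrative field names from the sketches above):

```
// Only evaluate significance once both groups have enough data;
// MIN() keeps the parameter-based threshold at the aggregate level
IF SUM([Visitors A]) < MIN([Required Sample Size])
   OR SUM([Visitors B]) < MIN([Required Sample Size])
THEN "Keep Testing - Sample Size Not Reached"
ELSEIF ABS([Z Score]) >= 1.96
THEN "Statistically Significant"
ELSE "No Significant Difference"
END
```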
For businesses, this tool bridges the gap between raw data and statistical rigor. It provides accurate A/B testing by:
- Calculating the required sample size up front, so results are not called on noise
- Applying a proper Z-test for two independent proportions to the observed conversion rates
- Keeping every calculation native to Tableau, so the dashboard stays fast, portable, and easy to maintain