ABX Listening Test Resources
Below is supplemental information for our AES paper entitled
"Statistical Analysis of ABX Results Using Signal Detection Theory"
- Binomial Tables
- Examples of binomial vs. SDT analysis
- Example 1
- Example 2
- Matlab Code
- Errata
- Binomial probability mass functions
- Given a specified number of trials, this table tells you the probability of randomly getting exactly a specific number of correct answers. (For example, given 21 trials, there is a 16.82% chance of randomly getting exactly 10 correct answers.)
- Yellow entries are significant at a 95% confidence level
- Note: you can also create your own tables in a spreadsheet program like Excel, using the function BINOMDIST(NumCorrect,NumTrials,0.5,FALSE)
- Binomial cumulative distribution functions
- Given a specified number of trials, this table tells you the probability of randomly getting less than or equal to a specific number of correct answers. (For example, given 21 trials, there is a 50% chance of randomly getting 10 or fewer correct answers.)
- Yellow entries are significant at a 95% confidence level
- Note: you can also create your own tables in a spreadsheet program like Excel, using the function BINOMDIST(NumCorrect,NumTrials,0.5,TRUE)
- Results Required for 95% Confidence Level

| Trials | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Correct answers ≥ | 8 | 14 | 19 | 25 | 31 | 36 | 42 | 47 | 53 | 58 |
| or correct answers ≤ | 1 | 5 | 10 | 14 | 18 | 23 | 27 | 32 | 36 | 41 |

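The table lookups described above can also be reproduced in a few lines of Python. This is a minimal sketch, not the authors' code: the function names are made up for illustration, and the threshold rule (smallest count whose cumulative probability reaches 95%, largest count whose cumulative probability is at most 5%) is the ">95%" reading of the CDF table used in the examples on this page.

```python
from math import comb

def binom_pmf(k, n):
    """P(exactly k correct in n trials) when guessing with p = 0.5."""
    return comb(n, k) / 2**n

def binom_cdf(k, n):
    """P(k or fewer correct in n trials) when guessing with p = 0.5."""
    return sum(comb(n, i) for i in range(k + 1)) / 2**n

def thresholds(n, level=0.95):
    """Return (upper, lower) counts for n trials.

    upper: smallest count whose CDF is >= level
    lower: largest count whose CDF is <= 1 - level
    """
    upper = next(k for k in range(n + 1) if binom_cdf(k, n) >= level)
    lower = max(k for k in range(n + 1) if binom_cdf(k, n) <= 1 - level)
    return upper, lower

print(round(binom_pmf(10, 21), 4))  # 0.1682, the PMF example above
print(binom_cdf(10, 21))            # 0.5, the CDF example above

# Rebuild the "Results Required" rows for 10, 20, ..., 100 trials.
for n in range(10, 101, 10):
    print(n, thresholds(n))
```

`binom_pmf` and `binom_cdf` correspond to the spreadsheet calls BINOMDIST(k,n,0.5,FALSE) and BINOMDIST(k,n,0.5,TRUE).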
Example 1:
N = 100 trials

| Stimulus | Response: X=A | Response: X=B |
| --- | --- | --- |
| X=A | 29 | 11 |
| X=B | 31 | 29 |

(29+29)/100 = 58% correct
100 trials, 58 correct -> (lookup in binomial cdf table) -> 95.57% chance of randomly getting 58 or fewer correct answers. Because this value is >95%, this suggests that our result is statistically significant.
However...
H_A = 29/(29+11) = 0.73;
H_B = 29/(31+29) = 0.48;
F_A = 31/(31+29) = 0.52;
F_B = 11/(29+11) = 0.28;
d' = 1.17
bias = 0.32
d' variance = 1.44
95% confidence interval = 1.17 ± 2.35 = (-1.18, 3.52) -> Because this confidence interval intersects with zero, the perceptual difference is not statistically significant.
Note: in this case A was presented 40 times, and B was presented 60 times... resulting in a bias that affected the conclusion. (Although both stimuli are equally likely, the random assignments do not necessarily come out equal.)
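The hit and false-alarm rates in Example 1 follow directly from the four cells of the response table, as the sketch below shows. The function and argument names are hypothetical. The bias formula is an assumption: (z(H_A) + z(F_A)) / 2 is simply a convention that reproduces the reported 0.32, and this page does not state the formula the paper uses. The d' value and its variance are not derived here.

```python
from statistics import NormalDist

def abx_rates(a_as_a, a_as_b, b_as_a, b_as_b):
    """Rates from a 2x2 ABX table; argument names are hypothetical.

    a_as_a means X=A was presented and the listener answered X=A, and so on.
    """
    n_a = a_as_a + a_as_b  # times X=A was presented
    n_b = b_as_a + b_as_b  # times X=B was presented
    return {
        "n_a": n_a,
        "n_b": n_b,
        "percent_correct": 100 * (a_as_a + b_as_b) / (n_a + n_b),
        "H_A": a_as_a / n_a,  # hit rate for A
        "H_B": b_as_b / n_b,  # hit rate for B
        "F_A": b_as_a / n_b,  # false alarms for A (answered A when X=B)
        "F_B": a_as_b / n_a,  # false alarms for B (answered B when X=A)
    }

def response_bias(h, f):
    """ASSUMED convention (z(H) + z(F)) / 2, inferred from the reported values."""
    z = NormalDist().inv_cdf
    return (z(h) + z(f)) / 2

example_1 = abx_rates(29, 11, 31, 29)
print(example_1["n_a"], example_1["n_b"])                       # 40 60
print(round(example_1["H_A"], 3), round(example_1["F_A"], 3))   # 0.725 0.517
print(round(response_bias(example_1["H_A"], example_1["F_A"]), 2))
```

The unequal presentation counts (40 versus 60) show up immediately in `n_a` and `n_b`, which is the imbalance the note above refers to.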
Example 2:
N = 100 trials

| Stimulus | Response: X=A | Response: X=B |
| --- | --- | --- |
| X=A | 27 | 23 |
| X=B | 16 | 34 |

(27+34)/100 = 61% correct
100 trials, 61 correct -> (lookup in binomial cdf table) -> 98.95% chance of randomly getting 61 or fewer correct answers. Because this value is >95%, this suggests that our result is statistically significant.
However...
H_A = 27/(27+23) = 0.54;
H_B = 34/(16+34) = 0.68;
F_A = 16/(16+34) = 0.32;
F_B = 23/(27+23) = 0.46;
d' = 1.18
bias = -0.18
d' variance = 0.50
95% confidence interval = 1.18 ± 1.39 = (-0.21, 2.57) -> Because this confidence interval intersects with zero, the perceptual difference is not statistically significant.
Note: in this case, the listener was not consistent in responses to the X=A and X=B stimuli. This resulted in a large variance in our estimate of d', thus affecting the conclusion.
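The confidence intervals quoted in both examples are consistent with the usual normal approximation, d' ± 1.96 × sqrt(variance). The sketch below takes the d' and variance values from the examples as given inputs; it does not derive them, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def dprime_interval(d_prime, variance, level=0.95):
    """Normal-approximation confidence interval for d'."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sqrt(variance)
    return d_prime - half_width, d_prime + half_width

def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0

# d' and variance as reported in Example 1 and Example 2.
for label, d_prime, variance in [("Example 1", 1.17, 1.44),
                                 ("Example 2", 1.18, 0.50)]:
    interval = dprime_interval(d_prime, variance)
    print(label, [round(x, 2) for x in interval], excludes_zero(interval))
```

In both cases the interval contains zero, so `excludes_zero` returns `False`, matching the conclusions stated above.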
- Matlab Code (zip)
- This code provides a simple user interface for conducting ABX tests in Matlab. It analyzes the results in terms of both a binomial distribution and signal detection theory.
- Errata
- The equation for F_B (at the top of page 6) should be F_B = 20/50 = 0.4 (case 2)
- Equation 2 assumes that X=A and X=B are presented an equal number of times (N_A = N_B = N/2).
If we don't make this assumption, the equation is:

Alternatively, you could use the tables in this paper (for the duo-trio method):
Bi, J., Ennis, D. M., & O'Mahony, M. How to estimate and use the variance of d' from difference tests. Journal of Sensory Studies, 12, 87-104. 1997.
LSB Audio, LLC © 2008-2009