ABX Listening Test Resources

Below is supplemental information for our AES paper entitled
"Statistical Analysis of ABX Results Using Signal Detection Theory"

  1. Binomial Tables
  2. Examples of binomial vs. SDT analysis
    1. Example 1
    2. Example 2
  3. Matlab Code
  4. Errata

Binomial Tables

Binomial probability mass functions
Given a specified number of trials, this table gives the probability of getting exactly a given number of correct answers by guessing. (For example, given 21 trials, there is a 16.82% chance of getting exactly 10 correct answers by guessing.)
Yellow entries are significant at a 95% confidence level
Note: you can also create your own tables in a spreadsheet program like Excel, using the function BINOMDIST(NumCorrect,NumTrials,0.5,FALSE)
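The same entries can be computed outside a spreadsheet. The following is a minimal Python sketch, assuming a guessing probability of 0.5 per trial as in the BINOMDIST call above. The function name binom_pmf is hypothetical and chosen only for illustration.

```python
from math import comb

def binom_pmf(num_correct: int, num_trials: int, p: float = 0.5) -> float:
    """Probability of exactly num_correct correct answers in num_trials guesses.

    p is the chance of a correct guess on one trial (assumed 0.5 here).
    """
    return comb(num_trials, num_correct) * p**num_correct * (1 - p)**(num_trials - num_correct)

# The worked example from the text: 21 trials, exactly 10 correct.
print(f"{binom_pmf(10, 21):.4f}")  # 0.1682
```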
Binomial cumulative distribution functions
Given a specified number of trials, this table gives the probability of getting a given number of correct answers or fewer by guessing. (For example, given 21 trials, there is a 50% chance of getting 10 or fewer correct answers by guessing.)
Yellow entries are significant at a 95% confidence level
Note: you can also create your own tables in a spreadsheet program like Excel, using the function BINOMDIST(NumCorrect,NumTrials,0.5,TRUE)
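A cumulative entry is the sum of the point probabilities up to and including the given number of correct answers. The sketch below shows one way to compute it in Python, again assuming a guessing probability of 0.5. The function name binom_cdf is hypothetical.

```python
from math import comb

def binom_cdf(num_correct: int, num_trials: int, p: float = 0.5) -> float:
    """Probability of num_correct or fewer correct answers in num_trials guesses."""
    return sum(
        comb(num_trials, k) * p**k * (1 - p)**(num_trials - k)
        for k in range(num_correct + 1)
    )

# The worked example from the text: 21 trials, 10 or fewer correct.
print(f"{binom_cdf(10, 21):.4f}")  # 0.5000
```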
Results Required for 95% Confidence Level
Trials       10   20   30   40   50   60   70   80   90   100
Correct ≥     8   14   19   25   31   36   42   47   53    58
or ≤          1    5   10   14   18   23   27   32   36    41
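The thresholds in this table appear to follow from the cumulative table. The sketch below reproduces them under an assumed rule inferred from the numbers, not stated on this page: the upper threshold is the smallest count whose cumulative probability reaches 95%, and the lower threshold is the largest count whose cumulative probability is at most 5%. The function names are hypothetical.

```python
from math import comb

def binom_cdf(num_correct: int, num_trials: int) -> float:
    """P(X <= num_correct) for num_trials guesses with p = 0.5."""
    total = sum(comb(num_trials, k) for k in range(num_correct + 1))
    return total / 2**num_trials

def thresholds(num_trials: int, level: float = 0.95):
    """Return (upper, lower) counts under the assumed cumulative-probability rule."""
    cdf = [binom_cdf(k, num_trials) for k in range(num_trials + 1)]
    # Assumed rule: smallest count whose cumulative probability reaches `level`.
    upper = next(k for k, c in enumerate(cdf) if c >= level)
    # Assumed rule: largest count whose cumulative probability is at most 1 - level.
    lower = max((k for k, c in enumerate(cdf) if c <= 1 - level), default=None)
    return upper, lower

for n in range(10, 101, 10):
    upper, lower = thresholds(n)
    print(f"{n:4d} trials: {upper:3d} or more, or {lower:3d} or fewer")
```

For very small numbers of trials no count may satisfy the lower rule, in which case this sketch returns None for the lower threshold.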


Examples of binomial vs. SDT analysis

Example 1:
N = 100 trials
                  Responses
                  X=A   X=B
Stimuli   X=A      29    11
          X=B      31    29

(29+29)/100 = 58% correct
100 trials, 58 correct -> (look up in the binomial CDF table) -> 95.57% chance of getting 58 or fewer correct answers by guessing. Because this value exceeds 95%, the binomial analysis suggests that the result is statistically significant.
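The binomial step for this example can be sketched in Python as follows. The dictionary layout and variable names are illustrative assumptions. The counts are those of the response table for Example 1.

```python
from math import comb

# Response counts keyed by (actual X, listener's response).
responses = {
    ("A", "A"): 29, ("A", "B"): 11,
    ("B", "A"): 31, ("B", "B"): 29,
}

num_trials = sum(responses.values())
# A trial is correct when the response matches the actual identity of X.
num_correct = responses[("A", "A")] + responses[("B", "B")]

# Cumulative probability of num_correct or fewer correct answers by guessing (p = 0.5).
cdf = sum(comb(num_trials, k) for k in range(num_correct + 1)) / 2**num_trials

print(f"{num_correct}/{num_trials} correct, cumulative probability {cdf:.4f}")
```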

However...
HA = 29/(29+11) = 0.73;
HB = 29/(31+29) = 0.48;
FA = 31/(31+29) = 0.52;
FB = 11/(29+11) = 0.28;
d' = 1.17
bias = 0.32
d' variance = 1.44
95% confidence interval = 1.17 ± 2.35 = (-1.18, 3.52) -> Because this confidence interval includes zero, the perceptual difference is not statistically significant.
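The rate and interval arithmetic above can be sketched in Python. Two assumptions apply: d' and its variance are copied from the text and not recomputed, since their derivation is in the paper and not on this page. The 1.96 multiplier (the usual normal-approximation value for a 95% interval) is inferred because it reproduces the ± 2.35 quoted above. Variable names are illustrative.

```python
from math import sqrt

# Counts from the Example 1 response table.
a_said_a, a_said_b = 29, 11   # trials where X=A
b_said_a, b_said_b = 31, 29   # trials where X=B

HA = a_said_a / (a_said_a + a_said_b)   # hit rate for A
HB = b_said_b / (b_said_a + b_said_b)   # hit rate for B
FA = b_said_a / (b_said_a + b_said_b)   # false-alarm rate for A
FB = a_said_b / (a_said_a + a_said_b)   # false-alarm rate for B
print(f"HA={HA:.3f} HB={HB:.3f} FA={FA:.3f} FB={FB:.3f}")

# Values taken from the text (not derived here).
d_prime, variance = 1.17, 1.44

# Assumed: normal-approximation 95% interval, d' +/- 1.96 * sqrt(variance).
half_width = 1.96 * sqrt(variance)
low, high = d_prime - half_width, d_prime + half_width
significant = not (low <= 0 <= high)

print(f"95% CI = ({low:.2f}, {high:.2f}), significant: {significant}")
```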

Note: in this case A was presented 40 times and B was presented 60 times, resulting in a bias that affected the conclusion. (Although both stimuli are equally likely on each trial, the random assignments do not necessarily come out equal.)

Example 2:
N = 100 trials
                  Responses
                  X=A   X=B
Stimuli   X=A      27    23
          X=B      16    34

(27+34)/100 = 61% correct
100 trials, 61 correct -> (look up in the binomial CDF table) -> 98.95% chance of getting 61 or fewer correct answers by guessing. Because this value exceeds 95%, the binomial analysis suggests that the result is statistically significant.

However...
HA = 27/(27+23) = 0.54;
HB = 34/(16+34) = 0.68;
FA = 16/(16+34) = 0.32;
FB = 23/(27+23) = 0.46;
d' = 1.18
bias = -0.18
d' variance = 0.50
95% confidence interval = 1.18 ± 1.39 = (-0.21, 2.57) -> Because this confidence interval includes zero, the perceptual difference is not statistically significant.
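This page does not state how the bias figure is computed. One formula that reproduces the quoted values for both examples is half the sum of the z-transformed rates HA and FA. The sketch below uses that formula strictly as an assumption. The paper may define bias differently, and some signal detection texts use the opposite sign. The function names are hypothetical. The ABX d' value is not recomputed here because it depends on the decision model described in the paper.

```python
from statistics import NormalDist

def z(p: float) -> float:
    """Inverse of the standard normal cumulative distribution."""
    return NormalDist().inv_cdf(p)

def bias_guess(hit_rate: float, false_alarm_rate: float) -> float:
    """Assumed formula: 0.5 * (z(H) + z(F)). Not confirmed by the source."""
    return 0.5 * (z(hit_rate) + z(false_alarm_rate))

# Example 1: HA = 29/40, FA = 31/60.  Example 2: HA = 27/50, FA = 16/50.
bias_ex1 = bias_guess(29 / 40, 31 / 60)
bias_ex2 = bias_guess(27 / 50, 16 / 50)

print(f"Example 1 bias: {bias_ex1:.2f}")
print(f"Example 2 bias: {bias_ex2:.2f}")
```

Under this assumed formula the results round to 0.32 and -0.18, matching the bias values listed for the two examples.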

Note: in this case, the listener was not consistent in responses to the X=A and X=B stimuli. This resulted in a large variance in our estimate of d', thus affecting the conclusion.

ABX Screenshot

Matlab Code

Matlab Code (zip)
This code provides a simple user interface for conducting ABX tests in Matlab. It analyzes the results in terms of both a binomial distribution and signal detection theory.



Errata

LSB Audio, LLC © 2008-2009