The past, present and future (if any) of speech intelligibility metrics: A review and analysis
Seventy years ago, French and Steinberg (1947) presented a computational method that employed ‘the intensities of speech and unwanted sounds received by the ear’ to predict speech intelligibility in noise. Over the years, their articulation index has been fine-tuned, adapted to more listening conditions, and crystallized into a still frequently reaffirmed ANSI standard; it has also inspired scholars to develop new types of speech intelligibility metrics. Recent literature in particular shows an avalanche of such metrics, each reporting higher correlations with intelligibility scores than its predecessors. As a consequence, users now find it hard to choose the most appropriate metric for their application and become lost in the maze created by the overwhelming number of possible variants.
This presentation will put forward a taxonomy to structure the current types of speech intelligibility metrics. A genealogical tree of types helps identify similarities among and differences between various metrics. A resulting analysis suggests a way to find the best ‘fit for purpose’ metrics and illuminate their current limitations. Because these shortcomings provide challenges and opportunities for new metrics, they may give direction to future developments.