You might not be surprised to learn that most speech recognition
systems are designed with adult speakers in mind. To date, the nuances
and idiosyncrasies of children’s speech have rarely been built into
speech-driven applications intended for children, leaving those applications
unable to reliably process interactions with a younger audience.
For one leading multinational technology company, this was precisely the
situation that needed to be addressed. The business had discovered
that its speech recognition system, originally trained on adult speech
data, did not account for the differences in how children
speak, making it ineffective in applications designed for children.
Children typically speak with a higher pitch and higher formant frequencies,
and with greater temporal and spectral variability. Irregularities, hesitations, and
mispronunciations abound (for example “uh”, “um” and “fwoggy” instead