I Speak, therefore I am
by Hatim Kantawalla
Test Process
Voice recognition software has always come under close scrutiny,
since a lot is demanded and expected of it. Accuracy is of utmost
importance, along with the time taken for recognition. Only
continuous, narrative speech recognition is considered useful. The
days of discrete speech recognition programs are over, and hence all
the tests were carried out without any pauses between words.
Added to these challenges is the enormous variance that already
exists among individual human speech patterns, pitch, rate and inflection.
These variations are an extraordinary test of the flexibility of
any program. Naturally, the products were put through a stringent
test that covered all their features.
Test Bench
The test setup consisted of an average system comprising an Intel
Pentium III 500 MHz coupled with 128 MB of PC-100 SDRAM and an 8.4
GB Seagate U8 hard disk drive on an Intel SE-440BX2 platform. The
sound card used was the Creative Sound Blaster Live! Value.
Accuracy
Accuracy in recognising words, and the way the software learns the
speaker's style, accent and phonetics, was tested through the narration
of a carefully prepared paragraph. This paragraph contained a fair
amount of technical jargon as well as common and widely used words.
While the software was decoding the speech into text, we observed
the CPU utilisation and also the total amount of time the software
took initially to train itself.
After the first reading of the narrative, the technical jargon that
the software couldn't recognise was added to its vocabulary. The
software was then tested again with the same narrative to see whether
it could now detect all of these words. We used the same narrative a
third time to eliminate problems due to improper narration, and
measured the accuracy of the program.
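
The article does not say how the accuracy figure was calculated. As an
illustration only, one common way to score a dictation pass is to count
the word substitutions, insertions and deletions needed to turn the
recognised text into the prepared paragraph, and divide by the length
of that paragraph. The sketch below assumes this word-level scoring;
the function names and sample sentences are hypothetical and are not
taken from the test.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Smallest number of word substitutions, insertions and deletions
    needed to turn the recognised text into the reference text."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j recognised words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missed by the software
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of the reference paragraph recognised correctly (assumed metric)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference text must contain at least one word")
    return max(0.0, 1.0 - word_errors(reference, hypothesis) / ref_len)


if __name__ == "__main__":
    # Hypothetical sample: one word of four is misrecognised.
    prepared = "the hard disk drive"
    recognised = "the hard this drive"
    print(f"accuracy: {word_accuracy(prepared, recognised):.0%}")
```

Scoring each of the three readings this way would give one comparable
number per pass, so the effect of adding jargon to the vocabulary shows
up as the difference between the first and second scores.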