Speech Recognition API
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Speech technology has the potential to change the way in which we interact with computers and other technical devices. This paper describes the work involved in creating a speech recognition API.
A speech recognition API is an interface for computer programmers who want to create applications employing speech recognition. One application in particular, the so-called AudioBrowser, is referred to throughout this report. The AudioBrowser is an application that enables remote access to the World Wide Web by using speech technology. To show how an AudioBrowser could interact with the user, and to evaluate the performance of the speech recognition system, two demos were implemented.
An appropriate platform had to be deployed in order to run speech technology applications on a PC or a server. The hardware and software used for this platform are described, as well as the problems encountered when building it.
This paper also gives a general background to speech recognition and to various speech recognition systems and software solutions.
The considerations that went into designing the API are discussed, as well as its software implementation.
The API was intended to support applications running on the VoiceServer, a server implementing various telephony and speech technology services. For this reason, the API had to conform to the general software architecture of the VoiceServer. This software architecture and the final API are described at the end of this report.
Place, publisher, year, edition, pages
1997. 68 p.
Identifiers
URN: urn:nbn:se:kth:diva-96663
OAI: oai:DiVA.org:kth-96663
DiVA: diva2:531917
Subject / course
Master of Science in Engineering - Electrical Engineering
1997-08-27, Seminar room "Telegrafen", Isafjordsgatan 22, Kista, 09:00 (English)
Maguire Jr., Gerald Q., Professor
Mellor, Paul
Maguire Jr., Gerald Q., professor