Sunday, 17 January 2010

Model adaptation

In keeping with the last couple of blog posts, which were all breakthroughs in their own right, this one is definitely up there as well.

Since revision 1117, simon supports using static models or adapting speaker-independent models to your own voice, in addition to building a new, speaker-dependent model from scratch (which is obviously still the default). This means that new users can set up a complete, working speech recognition system literally in seconds: pick the scenarios you want, point simon to the voxforge speech model, press "Connect" and start talking.

Of course, this only works if you have a fairly "standard" voice, and the voxforge model is still not perfect. So if you want a somewhat higher recognition rate, go ahead, train a few samples and tell simon to adapt the voxforge base model with them. As little as one minute of speech will yield noticeable results (you will still need to install the HTK for this, though).

The only user interaction needed is to click a radio button - simon will do all the work for you.



While I was at it, I also improved the julius error reporting: the recognition process now writes a log file (~/.kde/share/apps/simond/models/<user>/active/julius.log), so you can easily debug low recognition rates, mic troubles etc. When the recognition fails completely, simon will display the log along with a short description of what simon thinks happened.
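If you would rather peek at that log from a terminal, something along these lines should do. This is only a rough sketch: the helper names `julius_log_path` and `tail` are made up for illustration and are not part of simon, the user name "default" is a placeholder, and the KDE directory may differ on your distribution (some use ~/.kde4 instead of ~/.kde).

```python
from collections import deque
from pathlib import Path


def julius_log_path(user, home=None):
    """Build the julius.log path quoted in the post for one simond user.

    `home` defaults to the current user's home directory; pass another
    directory if your KDE data lives elsewhere (assumption, adjust as needed).
    """
    base = Path(home) if home is not None else Path.home()
    return (base / ".kde" / "share" / "apps" / "simond"
            / "models" / user / "active" / "julius.log")


def tail(path, lines=20):
    """Return the last `lines` lines of a text file, or [] if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final `lines` entries in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


if __name__ == "__main__":
    # "default" is a placeholder user name, not something simond guarantees.
    for entry in tail(julius_log_path("default")):
        print(entry)
```

Replace "default" with the user name your simond account actually uses; if the file does not exist yet, the script simply prints nothing.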

Of course, all of this is completely untested and will most likely contain bugs, so try it at your own risk. By the way: current trunk needs KDE 4.4 to compile.

1 comment:

digitaler datenraum said…

A good idea! I think speech resources are always lacking in technology. I am currently thinking about patenting a new tool in this area. I do that with VDRs.