In .NET on a Windows 7 machine, I know that the System.Speech.Recognition namespace has classes that can recognize what was said, either by matching against a list of acceptable terms or by dictation. Can it also determine who said it? If so, how?
If it can’t, I’m open to other .NET libraries that can recognize both what was said and who said it.
As far as I know, it cannot. The “training” you do when you set up speech recognition is specific to the Windows user. The resulting profiles are referenced in the registry under HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles.
That is the recognition profile loaded when you start Microsoft speech recognition. Only one profile is loaded at any given time, and which one depends on the state of the registry at that moment (that is, on the user who is logged in). It cannot load all the different profiles at once. Even if it could, the profiles are made as generic as possible. A profile distinguishes a person mainly by accent, so if two people have similar accents, it will not be able to tell them apart.
I know of no libraries that do what you want. Such a system would require extensive training, potentially hundreds of hours for each voice you want it to identify.
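To make the limitation concrete, here is a minimal C# sketch of list-based recognition with System.Speech.Recognition. The command words are hypothetical and chosen only for illustration, and the sketch assumes a project reference to System.Speech.dll and a working default microphone. The thing to notice is what the recognition result exposes: the recognized text and a confidence score, but nothing that identifies the speaker.

```csharp
using System;
using System.Speech.Recognition; // requires a reference to System.Speech.dll

class WhatWasSaid
{
    static void Main()
    {
        // Hypothetical list of acceptable terms, for illustration only.
        var terms = new Choices("open", "close", "save");
        var grammar = new Grammar(new GrammarBuilder(terms));

        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(grammar);
            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (sender, e) =>
            {
                // The result carries the recognized text and a confidence
                // score; there is no property for the speaker's identity.
                Console.WriteLine("{0} (confidence {1:F2})",
                    e.Result.Text, e.Result.Confidence);
            };

            // Keep recognizing until the user presses Enter.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening; press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

Whoever speaks one of the listed words into the microphone produces the same kind of result, which is why this API on its own cannot answer the "who" part of the question.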