The Archive Base

Editorial Team
Asked: May 28, 2026

I have a problem in my application: I am writing an app where someone will write

I have a “problem” in my application. I am writing an app where the user enters text, SAPI TTS converts it to speech, and I then work with the output WAV.
What I need is information about the phonemes: where each phoneme is in the output WAV, how long the voice takes to say it, and so on.
I used SpVoice.Phoneme() and added a handler for phonemes, so I can now get the duration and so on. But SpVoice.Phoneme() also has a StreamPosition attribute, and I have no idea what it means.

From MSDN:

StreamPosition
The character position in the output stream at which the phoneme begins.

I don't understand whether they mean the byte position in the output WAV (the byte at which the phoneme starts), a time in milliseconds in the output WAV, or something else.

For example, for text:

This is high. This is low. This is fast. This is slow.

I get the StreamPositions values:

Position:0
Position:120
Position:2562
….
Position:143798
Position:147874
Position:151950

The output WAV file is 5.377098 seconds long, and the last phoneme “ow” is spoken at about 4.734 s.
The output WAV file is 237,568 bytes, so the StreamPosition value “147874” is probably not the byte at which the phoneme begins. The same goes for a time in ms: the WAV is only about 5.3 s long, but 151950 ms would be 151.950 s, so that is ruled out.

So what is StreamPosition? What does its value mean?

I really need to capture the exact time at which each phoneme begins. I tried it with DateTime.Now.Ticks/10000: when the user clicks the button that starts the TTS conversion I save this value, and when the handler catches a phoneme I read the value again and compute currTime-startTime. But this method is not very exact; there is always some deviation. Does SpVoice.Phoneme() offer a method or something similar to get the exact time at which the phoneme began?
If not, is there a better way to get a more exact time in ms?

Sorry for my English, and many thanks for all answers and advice.

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 6:04 pm

    OK, I will answer my own question. My bachelor's professor sent me some C++ code that he wrote. I spent the last two days reading it, and now I see how stupid I was.

    So, the answer:

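    The StreamPosition attribute really is the byte position in the output stream (probably the WAV).

    If you want to know the time position in the output stream, you need to write something like:

    (int)StreamPosition/(double)wavFileFormat_samplesPerSec/((double)wavFileFormat_BitsPerSample/8)

    This expression divides a byte count by the bytes per second of a mono stream, so the result is in seconds; multiply it by 1000 to get milliseconds (for multichannel audio, also divide by the number of channels). So you need to find the format information of the output stream, such as BitsPerSample and SamplesPerSec, and then you can compute the timing.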
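
The conversion described in this answer can be sketched in C++ roughly as follows. This is only a minimal sketch, assuming that StreamPosition is a byte offset into uncompressed PCM audio. `WavFormat` and `streamPositionToMs` are hypothetical names made up for this example and are not part of SAPI; the format values must come from whatever output stream format you actually configured. Dividing bytes by bytes per second gives seconds, so the sketch multiplies by 1000, and it also takes the channel count into account.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the PCM format of the output stream.
// The fields mirror the usual WAV header values; the names are assumptions.
struct WavFormat {
    std::uint32_t samplesPerSec;  // sample rate, e.g. 22050
    std::uint16_t bitsPerSample;  // e.g. 16
    std::uint16_t channels;       // 1 = mono, 2 = stereo
};

// Convert a byte offset in uncompressed PCM audio to milliseconds.
// Assumes streamPosition counts audio bytes only (no file header).
double streamPositionToMs(std::uint64_t streamPosition, const WavFormat& fmt) {
    // Bytes of audio produced per second of playback.
    const double bytesPerSecond = static_cast<double>(fmt.samplesPerSec)
                                * (fmt.bitsPerSample / 8.0)
                                * fmt.channels;
    assert(bytesPerSecond > 0.0);  // guard against an uninitialised format

    // bytes / (bytes per second) = seconds; scale to milliseconds.
    return 1000.0 * static_cast<double>(streamPosition) / bytesPerSecond;
}
```

For example, with an assumed format of 22050 Hz, 16-bit, mono (44100 bytes per second), `streamPositionToMs(44100, fmt)` returns 1000.0, i.e. one second into the stream.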
