The Archive Base

Editorial Team
Asked: May 25, 2026

I am working on a time series of numeric values, such as those produced

I am working on a time series of numeric values, such as those produced by a temperature sensor. I’d like to filter those values, roughly selecting those samples that form e.g. the top 10% of the received values.

The obvious solution of recording all samples and using any well-known algorithm for the extraction of the k-highest values is not possible in my case for two reasons:

  • The series may be infinite, memory is definitely not.

  • I’d like this algorithm to be usable in real-time, or at least in a streaming mode with predetermined latency.

The distribution of the values is not normal, nor is it consistent with any well-known distribution that I know of. Metrics that I already have available at any time include the mean, the variance and the skewness of the values that have already been received.

Unlike this question, I do not need perfect accuracy, although I would like to be able to tune the parameters of the selection algorithm.

I believe something similar is used in single-pass variable bit-rate (VBR) media codecs to allocate the available bandwidth to each frame, by determining the number of available bits. Unfortunately all the VBR algorithms I studied are too focused on DSP and media streams for me to understand and/or implement.

Are there any known algorithms that could help me deal with this issue? Any hints that would orient me towards the right direction would be greatly appreciated.

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 1:06 pm

    If you decide you are only interested in the last 10N items, you can use two heaps, one holding N items and one holding 9N, to keep track of the N highest items among the last 10N. The small heap is ordered so that its smallest item is on top, the large heap so that its largest item is on top. When a new item arrives and the window is already full, first remove the oldest item. If it came from the small heap, move the largest item from the large heap into the small heap to refill it. Then look at the new item: if it is no larger than the smallest item in the small heap, put it straight into the large heap; otherwise, transfer the smallest item from the small heap to the large heap and put the new item in the small heap.

    At any time you have a small heap full of high items and a large heap full of low items, and you know whether the latest item was in the top 10% of those 10N.
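
    A minimal sketch of the windowed idea above, in Python. To keep it short it swaps the two heaps for a single sorted list maintained with `bisect`, so each update costs O(window) rather than O(log window). The class name `SlidingTopFraction` and its parameters `window` and `fraction` are made up for illustration.

```python
from bisect import bisect_left, insort
from collections import deque


class SlidingTopFraction:
    """Reports whether each new sample is in the top `fraction` of the last `window` samples."""

    def __init__(self, window=1000, fraction=0.10):
        self.window = window      # plays the role of 10N
        self.fraction = fraction  # 0.10 means "top 10%"
        self.order = deque()      # samples in arrival order
        self.sorted = []          # the same samples, kept sorted

    def push(self, x):
        # Evict the oldest sample once the window is full.
        if len(self.order) == self.window:
            old = self.order.popleft()
            del self.sorted[bisect_left(self.sorted, old)]
        self.order.append(x)
        insort(self.sorted, x)
        # Number of slots that count as "top" for the current window size.
        k = max(1, int(len(self.sorted) * self.fraction))
        threshold = self.sorted[-k]
        # Ties with the threshold count as being in the top group.
        return x >= threshold
```

    Calling `push(sample)` for every incoming value returns `True` when that value belongs to the current top group. Until the window fills up, the comparison is made against however many samples have arrived so far.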

    But is this really what you want? Note that if your samples rise steadily and then fall steadily over a period of time much larger than your 10N samples then nearly half the time the latest item will be in the top 10% – in fact it will be the largest item seen in the memory of 10N items.
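
    The caveat above can be checked numerically. This sketch (the helper `flagged_fraction` and the ramp sizes are assumptions chosen for illustration) feeds a long rise followed by a long fall through a 100-sample window and counts how often the newest sample ranks among the top 10.

```python
from collections import deque


def flagged_fraction(samples, window=100, k=10):
    """Fraction of samples that rank in the top k of the last `window` samples on arrival."""
    recent = deque(maxlen=window)  # the oldest sample is dropped automatically
    flagged = 0
    for x in samples:
        recent.append(x)
        # Count window members strictly larger than the newest sample.
        larger = sum(1 for y in recent if y > x)
        if larger < k:
            flagged += 1
    return flagged / len(samples)


# A steady rise followed by a steady fall, each much longer than the window.
ramp = list(range(5000)) + list(range(5000, 0, -1))
print(flagged_fraction(ramp))
```

    On this input the flagged fraction comes out just above one half, because every sample on the rising side is the largest value in the window.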

    There have been academic papers on finding approximate quantiles of streaming data. One such is “Effective Computation of Biased Quantiles over Data Streams”, by Cormode, Korn, Muthukrishnan, and Srivastava.
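
    For a constant-memory alternative, here is a minimal sketch of a stochastic-approximation quantile tracker. It is not the algorithm from the paper above, just a much simpler estimator in the same spirit. The class `QuantileTracker` and the parameters `p` and `step` are hypothetical names. `step` is the tuning knob: larger values adapt faster but make the threshold noisier, and it should be scaled to the units of the data.

```python
class QuantileTracker:
    """Tracks a running estimate of the p-quantile using O(1) memory."""

    def __init__(self, p=0.90, step=0.01):
        self.p = p        # target quantile; 0.90 selects roughly the top 10%
        self.step = step  # adjustment size, in the units of the data
        self.q = None     # current threshold estimate

    def update(self, x):
        """Feed one sample; return True if it lies above the current threshold."""
        if self.q is None:
            # The first sample only seeds the estimate.
            self.q = x
            return False
        is_top = x > self.q
        if is_top:
            # Nudge the threshold up by a large step (this should happen rarely).
            self.q += self.step * self.p
        else:
            # Nudge it down by a small step (this should happen often).
            self.q -= self.step * (1.0 - self.p)
        return is_top
```

    The two nudges balance out when a fraction `1 - p` of the samples lands above the threshold, which is why the estimate drifts toward the p-quantile. Because it keeps adapting, it also follows slow changes in the distribution, with the same trade-off between responsiveness and stability that the window size controls in the heap approach.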
