The Archive Base

Editorial Team
Asked: May 25, 2026

I have an application in which I am analyzing a system where there are

I have an application that analyzes a system with a large number of interactions, and I need to make certain choices based on how often unique items occur in the system. For example, if you had this list of letters:

A, B, F, G, A, T, S, B, S, B, S, Q, Z, B, Q, S

Here is a list showing how often each letter occurs (occurrences):

A - 2
B - 4
F - 1
G - 1
Q - 2
T - 1
S - 4
Z - 1

So the frequencies of those occurrence counts are as follows (occurrence occurrences):

4 - 2
2 - 2
1 - 4
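The two tallies above can be reproduced with a short sketch. This is only an illustration using Python's `collections.Counter`; the variable names are assumptions, not part of the original application.

```python
from collections import Counter

letters = "A, B, F, G, A, T, S, B, S, B, S, Q, Z, B, Q, S".split(", ")

# How often each letter occurs (occurrences).
occurrences = Counter(letters)

# How many letters share each occurrence count (occurrence frequency).
occurrence_frequency = Counter(occurrences.values())

# (occurrence, occurrence frequency) pairs, sorted descending on occurrence.
pairs = sorted(occurrence_frequency.items(), reverse=True)
print(pairs)  # [(4, 2), (2, 2), (1, 4)]
```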

The above is a tiny example, but I’ve attached an image which is a simple line graph of a larger system:

[Image: series graph]

In this graph the numbers along the bottom aren’t really important. They are just marking the number of unique frequencies. And the Y-axis marks the value of that frequency.

What I’m looking for is a mathematical/programmatic way to find the point where that line begins to break upwards. My searches haven’t yielded what I’m looking for as I’m not really sure what the proper terminology is, or the name of the concept.

Right now, we have to manually choose that point based on a human looking at the numbers and saying “here”. But I want to, at the very least, already have a “recommended” value chosen, and at the most, be able to remove the human component completely.

For clarification, my current algorithm produces a list of number pairs: an occurrence count and the frequency of that occurrence count. My use of the word “frequency” in no way relates to electromagnetic signals, but rather to how often an occurrence occurs. I thought that saying “occurrence occurrences” would be more confusing!

In this system, the general trend is that a few entities will show up in a large number of interactions, more entities will show up in a medium number of interactions, but the greatest number of entities will show up in just a few, or even no, interactions. It would be tough to imagine a scenario where it was different than that… worst case would probably be a plateau. But there could definitely be a dip after a jump at any point from the beginning to the end. The illustration above just doesn’t show that. We cannot assume that there will be a point where it will begin to rise with no drops afterwards.

Here is my data. (The simple graph above was produced with the Occurrence Frequency column data only):

[Image: data for graph]

This list, as you can see, is sorted in descending order on the occurrence column. This is from a small system with 904 unique entities. Those entities have 38 unique occurrence rates. If you started at the top of this list, you could say:

"2 entities occur 309 times"
"1 entity occurs 130 times"
etc.

Ultimately, what I’m trying to determine is the importance of an entity based on how often it occurs in the system. I need to be able to flag certain items as “important”, but not all items can be important. The method/algorithm I’m looking for would help identify the point in that list at which I should stop considering items important.

If you look at the list, you can see where the lower occurrences start becoming more frequent. I don’t think that I can sort on the right column because the left column is really the key data. Greater occurrences = more importance.

But I still need to figure out how to determine that.


1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 11:49 pm

    Is there any reason the larger example isn’t sorted? If you sort it by increasing Y values, then you can take the slope of each consecutive pair, and call the breakpoint where the slope changes significantly.

    You can tweak the rules for “changes significantly” to meet your exact needs. It might be as simple as “the slope that increases most compared to the previous one”, or “the first slope that varies more than X% from the running average slope”. Or maybe the largest root sum of squares (RSS) of the differences between the slope at the test point and the slopes just before and just after it.
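As a rough sketch of the first rule (“the slope that increases most compared to the previous one”), the following assumes the values are evenly spaced once sorted, so each slope is just the difference between neighbours. The function name and the sample values are hypothetical; they are not the data from the question.

```python
def breakpoint_by_slope_increase(values):
    """Return the index (in sorted order) where the slope increases most."""
    ys = sorted(values)
    if len(ys) < 3:
        raise ValueError("need at least three points to compare two slopes")
    # slopes[i] is the slope of the segment from ys[i] to ys[i + 1].
    slopes = [b - a for a, b in zip(ys, ys[1:])]
    # increases[i] compares the segment leaving point i + 1
    # with the segment entering it.
    increases = [after - before for before, after in zip(slopes, slopes[1:])]
    return increases.index(max(increases)) + 1


# Hypothetical occurrence-frequency values, for illustration only.
sample = [1, 1, 2, 2, 3, 3, 4, 30, 40, 52]
idx = breakpoint_by_slope_increase(sample)
print(idx, sample[idx])  # 6 4
```

On a curve that keeps accelerating, this rule tends to pick a point near the end, so one of the other rules (a percentage deviation from the running average, for instance) may suit such data better; only the scoring of `increases` would need to change.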


    After the edit, I think it may be as simple as taking a percentage. Multiply the X and Y values in each row, and sum the products over all rows; that is the total number of events observed. Now start from the bottom of your table and subtract each row’s product from the total until less than a chosen percentage of the original total remains. What you are left with is the “significant” events that contributed most to the total.
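A minimal sketch of that percentage cutoff, using the small letter example from the question as input. The function name and the `keep_fraction` parameter are assumptions made for illustration, and the stopping rule follows the description literally: rows are removed from the bottom until less than the chosen fraction of the total remains.

```python
def significant_rows(pairs, keep_fraction):
    """Drop rows from the bottom until < keep_fraction of all events remain.

    `pairs` holds (occurrence, occurrence_frequency) rows sorted in
    descending order of occurrence. Returns the rows that are left.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    rows = list(pairs)
    # Total number of events observed: sum of occurrence * frequency.
    total = sum(occ * freq for occ, freq in rows)
    remaining = total
    while rows and remaining >= keep_fraction * total:
        occ, freq = rows.pop()  # remove the current bottom row
        remaining -= occ * freq
    return rows


pairs = [(4, 2), (2, 2), (1, 4)]  # 16 events in total
print(significant_rows(pairs, 0.6))  # [(4, 2)]
```

The lowest occurrence value left in the result can then serve as the recommended cutoff, with `keep_fraction` as the single knob a human would still tune.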

    I have a feeling this is a common problem in statistics, but I don’t have enough background to say what the proper terminology is, although standard deviations come to mind.

© 2021 The Archive Base. All Rights Reserved