Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9030669
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 16, 20262026-06-16T07:32:52+00:00 2026-06-16T07:32:52+00:00

I am probably thinking about this wrong but here goes. A computer starts spitting

  • 0

I am probably thinking about this wrong but here goes.

A computer starts spitting out a gazillion random numbers between 11111111111111111111 and 99999999999999999999, in a linear row:

  • Sometimes the computer adds a number to one end of the line.
  • Sometimes the computer adds a number to the other end of the line.
  • Each number has a number that comes, or will come, before.
  • Each number has a number that comes, or will come, after.
  • Not all numbers are unique, many, but not most, are repeated.
  • The computer never stops spitting out numbers.

As I record all of these numbers, I need to be able to make an educated guess, at any given time:

  • If this is the second time I have seen a number I must know what number preceded it in line last time.

  • If it has appeared more than two times, I must know the probability/frequency of numbers preceding it.

  • If this is the second time I have seen a number, I must also know what number came after it in line last time.

  • If it has appeared more than two times, I must know the probability/frequency of numbers coming after it.


How the heck do I structure the tables in a MySQL database to store all these numbers? Which engine do I use and why? How do I formulate my queries? I need to know fast, but capacity is also important because when will the thing stop spitting them out?

My ill-conceived plan:

2 Tables:

1. Unique ID/#
2. #/ID/#

My thoughts:

Unique ID’s are almost always going to be shorter than the number = faster match.
Numbers repeat = fewer ID rows = faster match initially.

Select * in table2 where id=(select id in table1 where #=?)

OR:

3 Tables:

1. Unique ID/#
2. #/ID
3. ID/#

My thoughts:

If I only need left/before, or only need after/right, im shrinking the size of the second query.

SELECT # IN table2(or 3) WHERE id=(SELECT id IN table1 WHERE #=?)

OR

1 Table:

1. #/#/#

Thoughts:

Less queries = less time.

SELECT * IN table WHERE col2=#.

I’m lost…. 🙁 Each number has four attributes, that which comes before+frequency and that which comes after+frequency.

Would I be better off thinking of it in that way? If I store and increment frequency in the table, I do away with repetition and thus speed up my queries? I was initially thinking that if I store every occurrence, it would be faster to figure the frequency programmatically…….

Such simple data, but I just don’t have the knowledge of how databases function to know which is more efficient.


In light of a recent comment, I would like to add a bit of information about the actual problem: I have a string of indefinite length. I am trying to store a Markov chain frequency table of the various characters, or chunks of characters, in this string.

Given any point in the string I need to know the probability of the next state, and the probability of the previous state.

I am anticipating user input, based on a corpus of text and past user input. A major difference compared to other applications I have seen is that I am going farther down the chain, more states, at a given time and I need the frequency data to provide multiple possibilities.

I hope that clarifies the picture a lot more. I didn’t want to get into the nitty gritty of the problem, because in the past I have created questions that are not specific enough to get a specific answer.


This seems maybe a bit better. My primary question with this solution is: Would providing the “key” (first few characters of the state) increase the speed of the system? i.e query for state_key, then query only the results of that query for the full state?

Table 1:
name: state
col1:state_id - unique, auto incrementing
col2:state_key - the first X characters of the state
col3:state - fixed length string or state

Table 2:
name: occurence
col1:state_id_left - non unique key from table 1
col2:state_id_right - non unique key from table 1
col3:frequency - int, incremented every time the two states occur next to each other.

QUERY TO FIND PREVIOUS STATES:
SELECT * IN occurence WHERE state_id_right=(SELECT state_id IN state WHERE state_key=? AND state=?)

QUERY TO FIND NEXT STATES:
SELECT * IN occurence WHERE state_id_left=(SELECT state_id IN state WHERE state_key=? AND state=?)
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-16T07:32:54+00:00Added an answer on June 16, 2026 at 7:32 am

    I’m not familiar with Markov Chains but here is an attempt to answer the question. Note: To simplify things, let’s call each string of numbers a ‘state’.

    First of all I imagine a table like this

    Table states:
    order : integer autonumeric (add an index here)
    state_id : integer (add an index here)
    state : varchar (?)
    

    order: just use a sequential number (1,2,3,…,n) this will make it easy to search for the previous or next state.

    state_id: a unique number associated to the state. As an example, you can use the number 1 to represent the state ‘1111111111…1’ (whatever the length of the sequence is). What’s important is that a reoccurrence of a state needs to use the same state_id that was used before. You may be able to formulate the state_id based on the string (maybe substracting a number). Of course a state_id only makes sense if the number of possible states fits in a MySQL int field.

    state: that is the string of numbers ‘11111111…1’ to ‘99999999…9’ … I’m guessing this can only be stored as a string but if it fits in an integer/number column you should try it as it may well be that you don’t need the state_id

    The point of state_id is that searching number is quicker than searching texts, but there will always be trade-offs when it comes to performance … profile and identify your bottlenecks to make better design decisions.

    So, how do you look for a previous occurrence of the state S_i ?

    “SELECT order, state_id, state FROM states WHERE state_id = ” and then attach get_state_id(S_i) where get_state_id ideally uses a formula to generate a unique id for the state.

    Now, with order – 1 or order + 1 you can access the neighboring states issuing an additional query.

    Next we need to track the frequency of different occurrences. You can do that in a different table that could look like this:

    Table state_frequencies:
    state_id integer (indexed)
    occurrences integer
    

    And only add records as you get the numbers.

    Finally, you can have tables to track frequency for the neighboring states:

    Table prev_state_frequencies (next_state_frequencies is the same):
    state_id: integer (indexed)
    prev_state_id: integer (indexed)
    occurrences: integer
    

    You will be able to infer probabilities (i guess this is what you are trying to do) by looking at the number of occurrences of a state (in state_frequencies) vs the number of occurrences of it’s predecessor state (in prev_state_frequencies).

    I’m not sure if I got your problem right but if this makes sense I’m guessing I have.

    Hope it helps,
    AH

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Maybe im probably all wrong about this. Here goes.. I have the: gestureScanner =
I'm probably thinking about this all wrong but I have the following db tables:
Pretty noobish question, and I'm probably thinking about this wrong, but... Is there a
Im thinking this is correct, but probably WAY off. I have a string formatted
You're probably thinking this has been asked a million times before but I think
I don't know why I started thinking about this, but now I can't seem
I'm thinking about writing a FLAC to ALAC and/or FLAC to AAC converter, probably
I'm probably over-thinking this/wasting time trying to avoid a bit of conditional code -
This is probably just wishful thinking... Is there any way to check to see
I'm new to DDD and am thinking about using this design technique in my

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.