I’ve got a series of items in a MySQL database. Each item has four


Asked: May 31, 2026 (2026-05-31T06:45:43+00:00)


I’ve got a series of items in a MySQL database.

Each item has four characteristics associated with it:

abs_left (the item's leftmost position)

abs_center (the item's center position)

abs_right (the item's rightmost position)

row (the item's vertical position)

Within a chunk of data I know that the items are aligned in columns, but I do not know how many columns there are. The values of abs_left, abs_center, and abs_right are also imprecise and vary significantly (e.g. the abs_right of one item might slightly overlap the abs_left of another item, even though they are in different columns). An item's row does not vary and should be correct. However, not every row within the chunk has an element in every column, so from any single row I cannot tell how many columns there are.

I would like to determine two things:

1) The number of columns within the chunk of data in question.

2) The approximate bounds of each one of these columns.

I’m pretty sure math can be applied to help me do this, but I’m not sure how to go about it conceptually. I suspect standard deviation could be used, but I’m not sure how to apply it to an unknown number of columns.

Any help you can provide, or ideas on how to attack it, would be greatly appreciated!

[EDITED To Add Sample Data]

Below is summary data from queries that have already been used to attempt to round answers. As summary data, it is less precise, but it should give an idea of what I am running into. The “row” portion is left out of the summary data because rows were combined, but I do have a concept of row within the full dataset.



section_id  abs_left  abs_right  count
1           0         4          144
1           1         4          4
1           8         12         152
1           40        59         4
1           41        57         2
1           41        60         45
1           43        44         2
1           48        63         88
1           50        65         1
1           54        64         11
3           0         15         2
3           1         10         4
3           58        60         1
3           58        69         3
3           63        70         5
3           66        72         10
3           67        73         5
3           82        87         3
3           96        104        6
3           100       104        2
3           114       122        25
3           129       137        15
3           130       137        20
3           133       137        1
3           143       151        38
3           146       151        1
3           165       172        3
3           168       175        36
4           4         10         6
4           4         21         18
4           5         25         9
4           5         30         10
4           5         34         21
4           6         41         7
4           6         43         1
4           55        64         3
4           70        76         3
4           75        83         42
4           76        84         4
4           77        82         11
4           93        100        16
4           95        101        13
4           95        101        7
4           104       110        2
4           108       116        27
4           123       130        37
4           139       143        1
4           139       146        75
4           143       147        2

Section 1 has 3 columns.

Section 3 has 7 columns.

Section 4 has 7 columns.


1 Answer

Editorial Team · Answer 1 · 2026-05-31T06:45:45+00:00

Plan of attack

1) Project each item's range onto the number line, “casting a shadow” wherever the item sits.
2) Analyse the projection to find the edges, then number each contiguous “shadow” as a column.
3) Map each item to its column number, then analyse the map.
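Outside the database, the same plan boils down to merging overlapping ranges. The sketch below is a Python restatement, not the SQL used in this answer; `find_columns` and `sections` are hypothetical names, and the data is the summary rows from the question with the counts dropped. It assumes integer positions, so a range starting at n+1 is treated as contiguous with one ending at n, which is how the shadow on an integer number line behaves.

```python
def find_columns(ranges):
    """Return merged (left, right) bounds, one per contiguous shadow."""
    columns = []
    for left, right in sorted(ranges):
        # Positions n and n+1 are neighbours on the integer line, hence "+ 1".
        if columns and left <= columns[-1][1] + 1:
            columns[-1][1] = max(columns[-1][1], right)
        else:
            columns.append([left, right])
    return [tuple(c) for c in columns]

# Summary rows from the question as (abs_left, abs_right), keyed by section_id.
sections = {
    1: [(0, 4), (1, 4), (8, 12), (40, 59), (41, 57), (41, 60), (43, 44),
        (48, 63), (50, 65), (54, 64)],
    3: [(0, 15), (1, 10), (58, 60), (58, 69), (63, 70), (66, 72), (67, 73),
        (82, 87), (96, 104), (100, 104), (114, 122), (129, 137), (130, 137),
        (133, 137), (143, 151), (146, 151), (165, 172), (168, 175)],
    4: [(4, 10), (4, 21), (5, 25), (5, 30), (5, 34), (6, 41), (6, 43),
        (55, 64), (70, 76), (75, 83), (76, 84), (77, 82), (93, 100),
        (95, 101), (95, 101), (104, 110), (108, 116), (123, 130),
        (139, 143), (139, 146), (143, 147)],
}

column_counts = {sid: len(find_columns(r)) for sid, r in sections.items()}
print(column_counts)  # {1: 3, 3: 8, 4: 7}
```

Because every range is treated equally, a single stray item can either bridge two real columns or create a spurious one; filtering on the count column first would be one way to soften that, though it is not part of the approach here.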

First, create a ‘number line’ of integers that spans the domain of the input data (0-255 below):

create table integers as
select
  bit1.n+bit2.n+bit3.n+bit4.n+bit5.n+bit6.n+bit7.n+bit8.n as n
from
             (select 0 n union all select   1) bit1
  cross join (select 0 n union all select   2) bit2
  cross join (select 0 n union all select   4) bit3
  cross join (select 0 n union all select   8) bit4
  cross join (select 0 n union all select  16) bit5
  cross join (select 0 n union all select  32) bit6
  cross join (select 0 n union all select  64) bit7   
  cross join (select 0 n union all select 128) bit8
;
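The cross join works because each subquery contributes either 0 or one power of two, so every combination sums to a distinct 8-bit integer. A quick Python check of that idea (illustrative only; `bits` and `integers` are names chosen for this sketch):

```python
from itertools import product

# One (0, 2**k) pair per bit, mirroring the eight cross-joined subqueries.
bits = [(0, 1 << k) for k in range(8)]

# Every combination picks one value per pair; the sum is the resulting integer.
integers = sorted(sum(combo) for combo in product(*bits))
```

Adding a ninth pair `(0, 256)` would extend the range to 0-511 if the data needs a wider domain.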

Then, for each section_id, project each range (abs_left, abs_right) onto the number line and store the result in a temporary table:

create table temp_item_distribution (
  is_start_of_column int
, column_number int
, primary key (section_id, n)
) as
select
  section_id
, n
, sum(is_match) as matches
from
  (
    select
      section_id
    , n
    , 0 as is_match
    from 
      (select distinct section_id from items) s
      cross join integers
    union all
    select
      section_id
    , n
    , 1 
    from
      items i
      inner join integers z
        on z.n between i.abs_left and i.abs_right
  ) t
group by
  section_id
, n;
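As a rough illustration of what the projection holds, the Python sketch below counts how many ranges cover each position for section 1 of the question's summary data. The names `items`, `DOMAIN` and `distribution` are assumptions made for the example.

```python
from collections import Counter

# Section 1 summary rows from the question as (abs_left, abs_right).
items = [(0, 4), (1, 4), (8, 12), (40, 59), (41, 57), (41, 60), (43, 44),
         (48, 63), (50, 65), (54, 64)]
DOMAIN = range(256)  # same span as the integers table

matches = Counter()
for left, right in items:
    for n in range(left, right + 1):  # BETWEEN is inclusive at both ends
        matches[n] += 1

# Uncovered positions read as 0, like the "0 as is_match" branch of the union.
distribution = [matches[n] for n in DOMAIN]
```

Here `distribution[4]` is 2 (two ranges end at 4) and `distribution[5]` is 0, which is the gap that separates the first two columns.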

Now, find and label the left-most edge of each column

  update
    temp_item_distribution r
    left join temp_item_distribution l
      on l.section_id = r.section_id
      and l.n = r.n - 1
  set
    r.is_start_of_column = case when coalesce(l.matches, 0) = 0 and r.matches > 0 then 1 else 0 end
  ;
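The edge rule is: a position starts a column when it is covered and its left neighbour is not. A small Python sketch on made-up match counts (the `matches` list is hypothetical, not taken from the question's data):

```python
# Hypothetical per-position match counts (index = position n).
matches = [2, 3, 0, 0, 1, 1, 0, 4]

is_start = []
for n, count in enumerate(matches):
    # No left neighbour at n = 0: treat it as uncovered, like coalesce(..., 0).
    previous = matches[n - 1] if n > 0 else 0
    is_start.append(1 if previous == 0 and count > 0 else 0)

print(is_start)  # [1, 0, 0, 0, 1, 0, 0, 1]
```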

Now, using this label, we can number and label the columns themselves

  update
    temp_item_distribution t
    inner join (
        select 
          r.section_id
        , r.n
        , sum(l.is_start_of_column) as column_number
        from
          temp_item_distribution l
          inner join temp_item_distribution r
            on l.section_id = r.section_id 
             and l.n <= r.n
        group by
          r.section_id
        , r.n
      ) s
      on t.section_id = s.section_id
      and t.n = s.n
  set
    t.column_number = s.column_number
  where
    t.matches > 0
  ;
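The self-join with `l.n <= r.n` is a running total of the start flags: each covered position gets the number of column starts at or to the left of it. In Python that is a cumulative sum; the lists below are the same made-up values as a toy example, not the question's data.

```python
from itertools import accumulate

# Hypothetical inputs: match counts and the start-of-column flags derived from them.
matches = [2, 3, 0, 0, 1, 1, 0, 4]
is_start = [1, 0, 0, 0, 1, 0, 0, 1]

running = list(accumulate(is_start))  # starts seen so far at each position

# Only covered positions get a column number, like "where t.matches > 0".
column_number = [c if m > 0 else None for c, m in zip(running, matches)]

print(column_number)  # [1, 1, None, None, 2, 2, None, 3]
```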

Now, we can map the items back onto the columns

  create table temp_items_in_columns as
  select
    i.section_id
  , i.abs_left
  , i.abs_right
  , t.column_number
  from
    items i
    inner join temp_item_distribution t
      on i.section_id = t.section_id 
      and i.abs_left = t.n
  ;
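The join on `i.abs_left = t.n` looks up the column that covers each item's left edge. If the column bounds are already known, a binary search does the same job. A minimal Python sketch, assuming the section 1 bounds from the edge table in this answer; `column_of` is a hypothetical helper.

```python
from bisect import bisect_right

# Section 1 column bounds (far_left, far_right) from the answer's edge table.
columns = [(0, 4), (8, 12), (40, 65)]
col_lefts = [left for left, _ in columns]

def column_of(abs_left):
    """1-based column whose span contains abs_left, or None if it is in a gap."""
    idx = bisect_right(col_lefts, abs_left) - 1
    if idx >= 0 and abs_left <= columns[idx][1]:
        return idx + 1
    return None

# A few section 1 rows from the question as (abs_left, abs_right).
items = [(0, 4), (1, 4), (8, 12), (43, 44), (54, 64)]
items_in_columns = [(l, r, column_of(l)) for l, r in items]
```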

Now, we can actually answer the question (..!)

  select
    section_id
  , max(column_number) as number_of_columns
  from
    temp_items_in_columns
  group by
    section_id
  ;

+------------+-------------------+
| section_id | number_of_columns |
+------------+-------------------+
|          1 |                 3 |
|          3 |                 8 |
|          4 |                 7 |
+------------+-------------------+

Note that section 3 comes out as 8 columns, not the 7 stated in the question: the range 82-87 (3 items) touches neither neighbour, so it casts a shadow of its own.

And the edges:

  select
    section_id
  , column_number
  , min(abs_left)                                 as far_left
  , round(avg(abs_left) - stddev(abs_left),1)     as 1_sigma_left
  , round(avg(abs_right) + stddev(abs_right),1)   as 1_sigma_right
  , max(abs_right)                                as far_right
  from
    temp_items_in_columns
  group by
    section_id
  , column_number
  ;

+------------+---------------+----------+--------------+---------------+-----------+
| section_id | column_number | far_left | 1_sigma_left | 1_sigma_right | far_right |
+------------+---------------+----------+--------------+---------------+-----------+
|          1 |             1 |        0 |          0.0 |           4.0 |         4 |
|          1 |             2 |        8 |          8.0 |          12.0 |        12 |
|          1 |             3 |       40 |         40.3 |          63.9 |        65 |
|          3 |             1 |        0 |          0.0 |          15.0 |        15 |
|          3 |             2 |       58 |         58.6 |          73.4 |        73 |
|          3 |             3 |       82 |         82.0 |          87.0 |        87 |
|          3 |             4 |       96 |         96.0 |         104.0 |       104 |
|          3 |             5 |      114 |        114.0 |         122.0 |       122 |
|          3 |             6 |      129 |        129.0 |         137.0 |       137 |
|          3 |             7 |      143 |        143.0 |         151.0 |       151 |
|          3 |             8 |      165 |        165.0 |         175.0 |       175 |
|          4 |             1 |        4 |          4.2 |          39.9 |        43 |
|          4 |             2 |       55 |         55.0 |          64.0 |        64 |
|          4 |             3 |       70 |         71.8 |          84.4 |        84 |
|          4 |             4 |       93 |         93.4 |         101.1 |       101 |
|          4 |             5 |      104 |        104.0 |         116.0 |       116 |
|          4 |             6 |      123 |        123.0 |         130.0 |       130 |
|          4 |             7 |      139 |        138.4 |         147.0 |       147 |
+------------+---------------+----------+--------------+---------------+-----------+

(This uses a 1-standard-deviation interval, which covers about 68% of values for a normal distribution. See: http://en.wikipedia.org/wiki/Standard_deviation)
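To sanity-check one row of the edge table by hand, the sketch below recomputes the bounds for section 4, column 1 in Python. It relies on MySQL's `stddev()` being the population standard deviation, so `statistics.pstdev` is the matching call. It also assumes each summary row is counted once and the count column is ignored, as in the query above.

```python
from statistics import mean, pstdev

# Section 4, column 1 summary rows from the question: (abs_left, abs_right).
rows = [(4, 10), (4, 21), (5, 25), (5, 30), (5, 34), (6, 41), (6, 43)]
lefts = [left for left, _ in rows]
rights = [right for _, right in rows]

bounds = {
    "far_left": min(lefts),
    "1_sigma_left": round(mean(lefts) - pstdev(lefts), 1),
    "1_sigma_right": round(mean(rights) + pstdev(rights), 1),
    "far_right": max(rights),
}
print(bounds)
# {'far_left': 4, '1_sigma_left': 4.2, '1_sigma_right': 39.9, 'far_right': 43}
```

Weighting each row by its count would shift these figures toward the heavily populated ranges; whether that is preferable depends on how much the low-count rows should be trusted.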
