The Archive Base

Editorial Team
Asked: June 5, 2026

Let's say we have a database table with two columns, entry_time and value. entry_time is a timestamp, while value can be any other datatype. The records are relatively consistent, entered at roughly x-minute intervals. Occasionally, however, no entry is made for many multiples of x, producing a ‘gap’ in the data.

In terms of efficiency, what is the best way to go about finding these gaps of at least time Y (both new and old) with a query?

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 9:39 pm

    To start with, let us summarize the number of entries by hour in your table. (In the queries here, table is a placeholder; substitute your own table name.)

    SELECT CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME) hour,
           COUNT(*) samplecount
      FROM table
     GROUP BY CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME)
    

    Now, if you log something every six minutes (ten times an hour) all your samplecount values should be ten. This expression: CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME) looks hairy but it simply truncates your timestamps to the hour in which they occur by zeroing out the minute and second.
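The hourly summary can be tried out end to end with a small Python sketch. It is only an illustration under assumptions: it uses an in-memory SQLite database as a stand-in for MySQL, a hypothetical table name `samples`, and SQLite's `strftime` in place of `CAST(DATE_FORMAT(...))` to truncate each timestamp to its hour.

```python
import sqlite3


def hourly_counts(rows):
    """Return [(hour_start, samplecount)] for an iterable of (entry_time, value) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text.
    conn.execute("CREATE TABLE samples (entry_time TEXT, value REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)
    # The fixed ':00:00' suffix zeroes out minute and second, so every
    # timestamp collapses to the start of the hour it falls in.
    result = conn.execute(
        """
        SELECT strftime('%Y-%m-%d %H:00:00', entry_time) AS hour_start,
               COUNT(*) AS samplecount
          FROM samples
         GROUP BY hour_start
         ORDER BY hour_start
        """
    ).fetchall()
    conn.close()
    return result


rows = [
    ("2024-01-01 10:00:00", 1.0),
    ("2024-01-01 10:06:00", 2.0),
    ("2024-01-01 10:12:00", 3.0),
    ("2024-01-01 11:00:00", 4.0),
]
counts = hourly_counts(rows)
print(counts)  # [('2024-01-01 10:00:00', 3), ('2024-01-01 11:00:00', 1)]
```

With one sample expected every six minutes, both hours in this toy data fall well short of ten, which is the signal the summary is meant to surface.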

    This is reasonably efficient, and will get you started. It’s very efficient if you can put an index on your entry_time column and restrict your query to, let’s say, yesterday’s samples as shown here.

    SELECT CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME) hour,
           COUNT(*) samplecount
      FROM table
     WHERE entry_time >= CURRENT_DATE - INTERVAL 1 DAY
       AND entry_time < CURRENT_DATE
     GROUP BY CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME)
    

    But it isn’t much good at detecting whole hours that go by with missing samples, because an hour with no rows produces no row in the summary at all. It’s also a little sensitive to jitter in your sampling. That is, if your top-of-the-hour sample is sometimes half a minute early (10:59:30) and sometimes half a minute late (11:00:30), your hourly summary counts will be off. So, this hour summary thing (or day summary, or minute summary, etc.) is not bulletproof.
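One possible way to cover the missing-hour blind spot is to post-process the summary in application code. This is not part of the answer's method, just a sketch under assumptions: `short_hours` is a hypothetical helper that walks every hour between the first and last summary row and treats an absent hour as a count of zero. It does nothing about jitter.

```python
from datetime import datetime, timedelta


def short_hours(hourly, expected):
    """Return [(hour, count)] for each hour between the first and last summary
    row whose count is below `expected`, including hours that are absent."""
    counts = {datetime.strptime(h, "%Y-%m-%d %H:%M:%S"): n for h, n in hourly}
    if not counts:
        return []
    flagged = []
    hour, last = min(counts), max(counts)
    while hour <= last:
        # An hour with no samples never shows up in GROUP BY output,
        # so a missing key means zero samples.
        n = counts.get(hour, 0)
        if n < expected:
            flagged.append((hour, n))
        hour += timedelta(hours=1)
    return flagged


hourly = [
    ("2024-01-01 10:00:00", 10),
    ("2024-01-01 11:00:00", 7),
    # no row at all for 12:00
    ("2024-01-01 13:00:00", 10),
]
flagged = short_hours(hourly, expected=10)
print(flagged)
```

Here 11:00 is flagged with 7 samples and 12:00 with 0, even though the summary itself never mentions 12:00.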

    You need a self-join query to get stuff perfectly right; it’s a bit more of a hairball and not nearly as efficient.

    Let’s start by creating a virtual table (a subquery) of numbered samples, like this. (This is a pain in older MySQL versions; MySQL 8.0 and some other DBMSs provide ROW_NUMBER(), which makes it easier. No matter.)

      SELECT @sample:=@sample+1 AS entry_num, c.entry_time, c.value
        FROM (
            SELECT entry_time, value
              FROM table
             ORDER BY entry_time
        ) c,
        (SELECT @sample:=0) s
    

    This little virtual table gives entry_num, entry_time, value.

    Next step, we join it to itself.

    SELECT one.entry_num, one.entry_time, one.value,
           TIMEDIFF(two.entry_time, one.entry_time) AS `interval`
      FROM (
         /* virtual table */
      ) one
      JOIN (
         /* same virtual table */
      ) two ON (two.entry_num - 1 = one.entry_num)
    

    This lines up the tables next to each other, offset by a single entry, as governed by the ON clause of the JOIN.

    Finally, we choose the rows from this joined table whose interval is at least your threshold. Their entry_time values are the times of the samples right before the missing ones.

    The overall self-join query is this. I told you it was a hairball.

    SELECT one.entry_num, one.entry_time, one.value,
           TIMEDIFF(two.entry_time, one.entry_time) AS `interval`
      FROM (
        SELECT @sample:=@sample+1 AS entry_num, c.entry_time, c.value
          FROM (
              SELECT entry_time, value
                FROM table
               ORDER BY entry_time
          ) c,
          (SELECT @sample:=0) s
      ) one
      JOIN (
        SELECT @sample2:=@sample2+1 AS entry_num, c.entry_time, c.value
          FROM (
              SELECT entry_time, value
                FROM table
               ORDER BY entry_time
          ) c,
          (SELECT @sample2:=0) s
      ) two ON (two.entry_num - 1 = one.entry_num)
     /* example threshold of 10 minutes; substitute your own Y */
     WHERE two.entry_time >= one.entry_time + INTERVAL 10 MINUTE
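Where window functions are available, the same "pair each sample with the next one" idea can be written without a self-join. The sketch below swaps in `LEAD()` for the row-numbered self-join; it is not the query above. It runs against in-memory SQLite (3.25 or later) as a stand-in, with a hypothetical `samples` table and a hypothetical helper `find_gaps`. MySQL 8.0 and later also provide `LEAD()`, though the date arithmetic there would use MySQL's own functions rather than `strftime`.

```python
import sqlite3


def find_gaps(rows, min_gap_seconds):
    """Return [(entry_time, next_time, gap_seconds)] for consecutive samples
    that are at least `min_gap_seconds` apart."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (entry_time TEXT, value REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)
    gaps = conn.execute(
        """
        WITH paired AS (
            -- line each sample up with the one that follows it in time
            SELECT entry_time,
                   LEAD(entry_time) OVER (ORDER BY entry_time) AS next_time
              FROM samples
        ),
        measured AS (
            SELECT entry_time, next_time,
                   CAST(strftime('%s', next_time) AS INTEGER)
                     - CAST(strftime('%s', entry_time) AS INTEGER) AS gap_seconds
              FROM paired
        )
        -- the last sample has no successor; its NULL gap fails the comparison
        SELECT entry_time, next_time, gap_seconds
          FROM measured
         WHERE gap_seconds >= ?
         ORDER BY entry_time
        """,
        (min_gap_seconds,),
    ).fetchall()
    conn.close()
    return gaps


rows = [
    ("2024-01-01 10:00:00", 1.0),
    ("2024-01-01 10:06:00", 2.0),
    ("2024-01-01 10:12:00", 3.0),
    ("2024-01-01 10:48:00", 4.0),  # 36 minutes after the previous sample
    ("2024-01-01 10:54:00", 5.0),
]
gaps = find_gaps(rows, min_gap_seconds=600)
print(gaps)  # [('2024-01-01 10:12:00', '2024-01-01 10:48:00', 2160)]
```

As in the self-join, the first column of each result is the time of the sample right before the missing ones.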
    

    If you have to do this in production on a large table, you may want to do it for a subset of your data. For example, you could do it each day for the previous two days’ samples. This would be decently efficient, and would also make sure you didn’t overlook any missing samples right at midnight. To do this, your little row-numbered virtual tables would look like this.

      SELECT @sample:=@sample+1 AS entry_num, c.entry_time, c.value
        FROM (
            SELECT entry_time, value
              FROM table
             WHERE entry_time >= CURRENT_DATE - INTERVAL 2 DAY
               AND entry_time < CURRENT_DATE /* the previous two days, excluding today */
             ORDER BY entry_time
        ) c,
        (SELECT @sample:=0) s
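To see why a two-day window catches gaps that straddle midnight while a one-day window can miss them, here is a small pure-Python sketch. The helper `gaps_in_window`, the sample timestamps, and the 10-minute threshold are all made up for illustration; the window is half-open, matching the `>=` and `<` bounds of the query.

```python
from datetime import datetime, timedelta


def gaps_in_window(times, start, end, min_gap):
    """Return (before, after) pairs of consecutive samples inside [start, end)
    that are at least `min_gap` apart."""
    window = sorted(t for t in times if start <= t < end)
    return [(a, b) for a, b in zip(window, window[1:]) if b - a >= min_gap]


today = datetime(2024, 1, 3)  # stands in for CURRENT_DATE
times = [
    datetime(2024, 1, 1, 23, 54),
    datetime(2024, 1, 2, 0, 30),  # 36-minute gap straddling midnight
    datetime(2024, 1, 2, 0, 36),
]
min_gap = timedelta(minutes=10)

one_day = gaps_in_window(times, today - timedelta(days=1), today, min_gap)
two_days = gaps_in_window(times, today - timedelta(days=2), today, min_gap)
print(one_day)   # []
print(two_days)  # one pair: 2024-01-01 23:54 -> 2024-01-02 00:30
```

The one-day window never sees the 23:54 sample, so the gap is invisible to it. The trade-off is that a gap in the middle of the overlap can be reported on two consecutive daily runs.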
    
© 2021 The Archive Base. All Rights Reserved