Editorial Team
Asked: June 6, 2026


I have two data sets in the Google App Engine datastore.

class First_Set(db.Model):
  start_time = db.DateTimeProperty()
  end_time = db.DateTimeProperty()
  data1 = db.FloatProperty()
  ...

class Second_Set(db.Model):
  start_time = db.DateTimeProperty()
  end_time = db.DateTimeProperty()
  data2 = db.FloatProperty()
  ...

(They hold other, different data, which is why they’re in different datasets.)

I’d like to find the datastore IDs of all entities whose start_time and end_time ranges overlap across the two datasets, ideally without pulling the results from one and iterating them over the other.

A great visualization of the initial dataset is from here (it also has the problem solved in SQL):

1     |-----| 
2        |-----| 
3                 |--| 
4                       |-----| 
5                          |-----| 
6                                  |---| 
7                                        |---|  
8                           |---| 
9                                       |-----|

The end result I need is something along the lines of (from the same example):

+----+---------------------+----+---------------------+ 
| id | start               | id | end                 | 
+----+---------------------+----+---------------------+ 
|  2 | 2008-09-01 15:02:00 |  1 | 2008-09-01 15:04:00 | 
|  5 | 2008-09-01 16:19:00 |  4 | 2008-09-01 16:23:00 | 
|  8 | 2008-09-01 16:20:00 |  4 | 2008-09-01 16:22:00 | 
|  8 | 2008-09-01 16:20:00 |  5 | 2008-09-01 16:22:00 | 
|  7 | 2008-09-01 18:18:00 |  9 | 2008-09-01 18:22:00 | 
+----+---------------------+----+---------------------+ 

The SQL solution from the example is below, but I couldn’t do this in the datastore because it lacks JOIN:

SELECT v1.id, v1.start, v2.id, LEAST(v1.end,v2.end) AS end 
FROM visits v1 
JOIN visits v2 ON v1.id <> v2.id and v1.start >= v2.start and v1.start < v2.end  
ORDER BY v1.start;
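
For reference, here is a rough in-memory Python sketch of what that kind of join computes when applied to two separate sets. It is the brute-force nested loop that I would like to avoid, not a datastore query. The function name `overlapping_pairs`, the `(id, start, end)` tuple shape, and the sample values are assumptions made up for illustration, and the overlap test is the usual symmetric one rather than the exact self-join condition in the SQL.

```python
from datetime import datetime


def overlapping_pairs(first, second):
    """Return (id1, id2, overlap_start, overlap_end) for every overlapping pair.

    `first` and `second` are lists of (id, start, end) tuples. This shape is
    an assumption for illustration, not the datastore models themselves.
    """
    pairs = []
    for id1, start1, end1 in first:
        for id2, start2, end2 in second:
            # Two intervals overlap when each one starts before the other ends.
            if start1 < end2 and start2 < end1:
                # min() plays the role of LEAST() in the SQL version.
                pairs.append((id1, id2, max(start1, start2), min(end1, end2)))
    return pairs


# Hypothetical sample data, not taken from the linked example.
first_set = [(1, datetime(2008, 9, 1, 15, 0), datetime(2008, 9, 1, 15, 4))]
second_set = [
    (2, datetime(2008, 9, 1, 15, 2), datetime(2008, 9, 1, 15, 6)),
    (3, datetime(2008, 9, 1, 16, 0), datetime(2008, 9, 1, 16, 5)),
]
for pair in overlapping_pairs(first_set, second_set):
    print(pair)
```

This compares every entity in one set with every entity in the other, so the cost grows with the product of the two set sizes.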

I understand that the one-to-many version of this is rather straightforward using a ListProperty() (from this question).

Can anyone think of a solution to find the overlapping times (ideally in Python)?

1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 7:09 am

    Posting my solution with no JOINs, thanks to Shay’s direction. Should be able to find overlaps over any number of datasets with minor edits (at least that’s the theory).

    My Python isn’t that great, but the code below should give the idea:

    from operator import itemgetter
    
    import webapp2
    from google.appengine.ext import db
    
    class Find_Overlaps(webapp2.RequestHandler):
        def get(self):
            all_dates = []
            first_dates = db.GqlQuery("SELECT * FROM First_Set")
            for date in first_dates:
                row = {'dataset':'First_Set', 'dbkey':date.key(), 'offset':date.start_time, 'type': -1}
                all_dates.append(row)
                row = {'dataset':'First_Set', 'dbkey':date.key(), 'offset':date.end_time, 'type': 1}
                all_dates.append(row)
    
            second_dates = db.GqlQuery("SELECT * FROM Second_Set")
            for date in second_dates:
                row = {'dataset':'Second_Set', 'dbkey':date.key(), 'offset':date.start_time, 'type': -1}
                all_dates.append(row)
                row = {'dataset':'Second_Set', 'dbkey':date.key(), 'offset':date.end_time, 'type': 1}
                all_dates.append(row)
    
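            # Sort by time; at equal times a start (-1) sorts before an end (1),
            # so intervals that merely touch are counted as overlapping.
            newlist = sorted(all_dates, key=itemgetter('offset', 'type'))
            # Goal is to find overlaps in all sets, not only the best overlaps; that's why this is needed.
            # Note: the tally below does not distinguish datasets, so it assumes
            # intervals within a single set never overlap each other.
            number_datasets = 2
            loopcnt = 0
            update_bestend = 0
            overlaps = []
            for row in newlist:  # Below is mostly from Marzullo's algorithm
                loopcnt = loopcnt - row['type']  # running tally of currently open intervals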
                if update_bestend == 1:
                    if loopcnt == (number_datasets - 1):
                        bestend = row['offset']
                        end_set = row['dataset']
                        end_key = row['dbkey']
                        overlaps.append({'start':beststart,'start_set':start_set,'start_key':start_key,'end':bestend,'end_set':end_set,'end_key':end_key})
                        update_bestend = 0
                if loopcnt == number_datasets:
                    beststart = row['offset']
                    start_set = row['dataset']
                    start_key = row['dbkey']
                    update_bestend = 1
    
            for overlap in overlaps: #just to see what the outcome is
                self.response.out.write('start: %s, start_set: %s, end: %s, end_set: %s<br>' % (overlap['start'], overlap['start_set'], overlap['end'], overlap['end_set']))
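
The same sweep idea can be tried outside App Engine on plain tuples. The following is only a sketch under stated assumptions, not the handler above: `sweep_overlaps`, the `(key, start, end)` tuple shape, and the "first"/"second" labels are hypothetical names, every interval is assumed to satisfy start < end, and keys are assumed unique within a set. Unlike the counting version, it tracks open intervals per set, so two intervals from the same set are never reported as a pair.

```python
def sweep_overlaps(first, second):
    """Find overlapping pairs between two lists of (key, start, end) tuples.

    Returns (first_key, second_key, overlap_start, overlap_end) tuples.
    Assumes start < end for every interval and unique keys within each list.
    """
    events = []
    for label, intervals in (("first", first), ("second", second)):
        for key, start, end in intervals:
            # kind 0 = end, kind 1 = start. Ends sort before starts at equal
            # times, so intervals that only touch are not reported.
            events.append((end, 0, label, key, start, end))
            events.append((start, 1, label, key, start, end))
    # Sort on (time, kind) only, so keys never need to be comparable.
    events.sort(key=lambda e: (e[0], e[1]))

    open_intervals = {"first": {}, "second": {}}
    overlaps = []
    for _, kind, label, key, start, end in events:
        if kind == 0:
            # The interval closes: stop tracking it.
            open_intervals[label].pop(key, None)
            continue
        other = "second" if label == "first" else "first"
        # Every interval still open in the other set overlaps the new one.
        for other_key, (o_start, o_end) in open_intervals[other].items():
            pair = (key, other_key) if label == "first" else (other_key, key)
            overlaps.append(pair + (max(start, o_start), min(end, o_end)))
        open_intervals[label][key] = (start, end)
    return overlaps


# Integers stand in for times here; datetime values should work the same way.
print(sweep_overlaps([("a", 1, 5), ("b", 10, 12)], [("x", 4, 11), ("y", 12, 15)]))
```

With datastore entities, the tuples could presumably be built from each entity's key(), start_time and end_time, though both sets still have to be fetched into memory first.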
    