I’m working on an NDB based Google App Engine application that needs to keep

Asked: June 17, 2026

I’m working on an NDB-based Google App Engine application that needs to keep track of the day/night cycle of a large number (~2000) of fixed locations. Because the latitude and longitude never change, I can precompute the sunrise and sunset times ahead of time using something like PyEphem. As I see it, the possible strategies are:

1. Precompute a year’s worth of sunrises into datetime objects, put them into a list, pickle the list and put it into a PickleProperty
2. As above, but put the list into a JsonProperty
3. Go with DateTimeProperty and set repeated=True

Now, I’d like the very next sunrise/sunset to be indexed, but that value can be popped from the list and placed into its own DateTimeProperty, so that I can periodically use a query to determine which locations have changed to a different part of the cycle. The whole list does not need to be indexed.

Does anyone know the relative effort, in terms of indexing and CPU load, for these three approaches? Does repeated=True have an effect on the indexing?

Thanks,
Dave

1 Answer

Editorial Team · Answer 1 · 2026-06-17T12:08:50+00:00

The answers that suggest “just calculate them when instance starts” or “precompute those structures and output them into hardcoded python structures” appear to be ignoring the times-365 multiplier entailed by storing a year’s worth of sunrises, or the times-2000 multiplier if computing is done when an instance starts. Using pyEphem, 2000 sunrises and sunsets take more than two seconds to compute. Storing a year of sunrises and sunsets for 2000 locations in source code might use upwards of 20 megabytes. Even if the numbers are efficiently pickled as 8-byte values, 2*365*2000*8 = 11,680,000 bytes are needed.
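The storage figure above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and it assumes each time is stored as one packed 8-byte double, as in the estimate quoted.

```python
from array import array

LOCATIONS = 2000       # number of fixed locations (from the question)
DAYS = 365             # one year of daily events
EVENTS_PER_DAY = 2     # sunrise and sunset
# Size of a C double; assumed to be 8 bytes, as on common platforms.
BYTES_PER_VALUE = array('d').itemsize

def raw_storage_bytes(locations=LOCATIONS, days=DAYS,
                      events=EVENTS_PER_DAY, width=BYTES_PER_VALUE):
    """Bytes needed to store every event time as a packed value."""
    return events * days * locations * width

print('{:,} bytes'.format(raw_storage_bytes()))
```

Any per-entity overhead added by pickling or by the datastore comes on top of this raw figure.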

A faster and more compact approach is to set up a least-squares model that expresses the times at one location in terms of those at a few reference locations. This allows a roughly 70-fold reduction in total space used, as described below.

First, if points A and B are at the same latitude and have similar altitude and horizon parameters, then sunrise at A occurs at a constant time offset vs sunrise at B. For example, if A is 15 degrees west of B, sunrise occurs an hour later at A than at B. Second, if points A, B, C are at the same longitude and at low latitudes, the sunrise times at one point can be computed fairly accurately as a linear combination of the other two. At high latitudes or for better accuracy, linear combinations of several time curves can be used. Third, the time of sunrise at point A on 20 March, the day of the spring equinox, can be used as a normalization point: subtracting it removes the constant offset due to longitude, so all calculations can be normalized to the same longitude.
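To make the "linear combination" idea concrete, here is a minimal sketch of a least-squares fit of one curve against two reference curves. It is not the program from this answer (which uses scipy.optimize.leastsq with four curves); it solves the two-curve case in closed form with the normal equations, and the curves are synthetic stand-ins rather than real sunrise data. The function and variable names are hypothetical.

```python
import math

def fit_two_curves(y, c1, c2):
    """Return (a, b) minimizing sum((y - a*c1 - b*c2)**2)."""
    s11 = sum(u * u for u in c1)
    s22 = sum(v * v for v in c2)
    s12 = sum(u * v for u, v in zip(c1, c2))
    s1y = sum(u * t for u, t in zip(c1, y))
    s2y = sum(v * t for v, t in zip(c2, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError('reference curves are linearly dependent')
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b

# Synthetic stand-ins for two reference curves (NOT real sunrise data).
days = range(365)
curve_a = [0.05 * math.sin(2 * math.pi * d / 365) for d in days]
curve_b = [0.05 * math.cos(2 * math.pi * d / 365) for d in days]
# A target built as a known mix, so the fit should recover 0.3 and 0.7.
target = [0.3 * u + 0.7 * v for u, v in zip(curve_a, curve_b)]

a, b = fit_two_curves(target, curve_a, curve_b)
print(a, b)
```

With real data the target is not an exact mix of the reference curves, so the residuals are nonzero; they are what the error table reports.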

The following table shows the accuracy obtained using linear combinations of four time curves. For latitudes up to 46° from the equator, results stay within about 0.6 seconds. From 48° to 60°, results stay within 5 seconds. At 64°, results may be up to about two minutes in error, and at 65°, up to about six minutes. These times are probably good enough for most practical purposes. Note that at 66° the program shown below breaks down because it does not handle an exception that pyEphem throws: “AlwaysUpError: ‘Sun’ is still above the horizon at 2013/6/14 07:20:15” occurs, even though 66° is below the Arctic Circle, 66.5622° N.

It is easy to modify the program so that it uses as many time curves as desired (see various lata = ... statements in program), giving whatever accuracy is desired but at the cost of storing more curves and more coefficients. Of course the model can be varied to use subsets of time curves; for example, 10 curves could be stored and calculations done based on the 4 nearest in latitude to any given target latitude. However, for this demo program such refinements are not in place.
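The "4 nearest of 10 stored curves" refinement mentioned above could be sketched as follows. This is an illustration only: the helper name and the set of ten stored latitudes are hypothetical and are not part of the demo program.

```python
def nearest_latitudes(stored_lats, target_lat, k=4):
    """Return the k stored latitudes closest to target_lat, ascending."""
    closest = sorted(stored_lats, key=lambda lat: abs(lat - target_lat))[:k]
    return sorted(closest)

# Hypothetical set of ten cardinal latitudes, in degrees.
stored = [0, 7, 14, 21, 28, 35, 42, 49, 56, 63]
print(nearest_latitudes(stored, 40.0))
```

The fit for a given location would then use only the curves at the returned latitudes, at the cost of also recording which curves were chosen.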

Lat.  0.0:  Error range: -0.000000 to 0.000000 seconds
Lat.  5.0:  Error range: -0.370571 to 0.424092 seconds
Lat. 10.0:  Error range: -0.486193 to 0.557997 seconds
Lat. 15.0:  Error range: -0.414288 to 0.477041 seconds
Lat. 20.0:  Error range: -0.213614 to 0.247057 seconds
Lat. 25.0:  Error range: -0.065826 to 0.056358 seconds
Lat. 30.0:  Error range: -0.382425 to 0.323623 seconds
Lat. 35.0:  Error range: -0.585914 to 0.488351 seconds
Lat. 40.0:  Error range: -0.490303 to 0.400563 seconds
Lat. 45.0:  Error range: -0.164706 to 0.207415 seconds
Lat. 47.0:  Error range: -0.590103 to 0.756647 seconds
Lat. 48.0:  Error range: -0.852844 to 1.102608 seconds
Lat. 50.0:  Error range: -1.478688 to 1.940351 seconds
Lat. 55.0:  Error range: -3.342506 to 4.696076 seconds
Lat. 60.0:  Error range: -0.000002 to 0.000003 seconds
Lat. 61.0:  Error range: -7.012057 to 4.273954 seconds
Lat. 62.0:  Error range: -21.374033 to 12.347188 seconds
Lat. 63.0:  Error range: -51.872753 to 27.853411 seconds
Lat. 64.0:  Error range: -124.000365 to 59.661029 seconds
Lat. 65.0:  Error range: -351.425224 to 139.656187 seconds

Using the approach outlined above, for each of the 2000 locations, you need to store five floating point numbers: time of sunrise on 20 March, and four multiplier coefficients for four time curves. (The 70-fold reduction mentioned earlier is from storing 5 numbers per location, rather than 365 numbers.) For each time curve, 365 numbers are stored, with entry i being the sunrise time difference vs that on 20 March. Storing four time curves uses 1/500 as much space as storing 2000 of them, so total storage is dominated by the per-location multiplier coefficients rather than by the curves.
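Reading a sunrise time back from the stored numbers might look like the sketch below. The function name, argument layout, and the tiny three-entry curves are assumptions made for illustration; times are in fractions of a day, matching the units the program uses.

```python
def sunrise_time(day, equinox_time, coeffs, curves):
    """Approximate sunrise time for a day index.

    equinox_time: sunrise time on 20 March at this location
    coeffs:       one multiplier per stored time curve
    curves:       curves[k][day] is the offset of curve k relative
                  to its own 20 March value
    """
    offset = sum(c * curve[day] for c, curve in zip(coeffs, curves))
    return equinox_time + offset

# Made-up miniature data: two curves with three entries each.
demo_curves = [[0.0, 0.01, 0.02],
               [0.0, -0.01, 0.03]]
demo_coeffs = [0.5, 0.5]
print(sunrise_time(2, 0.25, demo_coeffs, demo_curves))
```

In the scheme described above there would be four curves of 365 entries shared by all locations, plus the five per-location numbers.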

Before I give the program that uses scipy.optimize.leastsq to solve for coefficients, here are two code snippets that can be used, in the ipython interpreter, to make accuracy tables and to draw plots for visualizing errors.

import sunrise as sr
for lat in range(0, 65, 5):
    sr.lsr(lat, -110, 2013, 4)

The above produces most of the error table shown earlier. The fourth parameter of lsr is called daySkip; the value 4 makes lsr work with every fourth day (i.e. only about 90 days of the year) for faster testing. Using sr.lsr(lat, -110, 2013, 1) produces similar results but takes four times as long.

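sr.plotData(15, 24*3600)

The above tells sunrise.plotData to plot everything: the sunrise data to be approximated (flag 1), the model’s resulting approximation (flag 2), the residuals, scaled to be in seconds (flag 4), and the cardinal curves (flag 8). The first argument, 15, is the sum of these flags. The second argument is the multiplier applied to the residuals, which are stored in days, so 24*3600 converts them to seconds.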

The program is shown below. Note that it has been tested mostly for Northern hemisphere latitudes. If time curves are symmetric enough, the program as-is will handle Southern hemisphere latitudes; if errors are too large, Southern hemisphere latitudes can be added to the cardinal curves, or the model can be changed to use a separate set of curves south of the equator. Note that sunsets aren’t calculated in this program. For sunsets, add next_setting(ephem.Sun()) calls analogous to the previous_rising(ephem.Sun()) calls, and store an additional four time curves.

#!/usr/bin/python
import ephem, numpy, scipy, scipy.optimize

# Make a set of observers (observation points)
def observers(lata, lona):
    def makeIter(x):
        if hasattr(x, '__iter__'):
            return x
        return [x]
    lata, lona = makeIter(lata), makeIter(lona)
    arr = []
    for lat in lata:
        for lon in lona:
            o = ephem.Observer()
            o.lat, o.lon, o.elevation, o.temp = str(lat), str(lon), 1400, 0
            arr.append(o)
    return tuple(arr)

# Make a year of data for an observer, equinox-relative
def riseData(observr, year, skip):
    yy = ephem.Date('{} 00:00'.format(year))
    rr = numpy.arange(0.0, 366.0, skip)
    springEquinox = 78
    observr.date = ephem.Date(yy + springEquinox)
    seDelta = observr.previous_rising(ephem.Sun()) - yy - springEquinox + 1
    for i, day in enumerate(range(0, 366, skip)):
        observr.date = ephem.Date(yy + day)
        t = observr.previous_rising(ephem.Sun()) - yy - day + 1 - seDelta
        rr[i] = t
    return numpy.array(rr)

# Make a set of time curves
def makeRarSet(lata, lona, year, daySkip):
    rar=[]
    for o in observers(lata, lona):
        r = riseData(o, year, daySkip)
        rar.append(r)
    x = numpy.arange(0., 366., daySkip)
    return (x, rar)

# data() is an object that stores curves + results
def data(s):
    return data.ss[s]

# Initialize data.ss
def setData(lata, lona, year, daySkip):
    x, rar = makeRarSet(lata, lona, year, daySkip)
    data.ss = rar

# Compute y values from model, given vector x and given params in p
def yModel(x, p):
    t = numpy.zeros(len(x))
    for i in range(len(p)):
        t += p[i] * data(i)
    return t

# Compute residuals, given params in p and data in x, y vectors.
# x = independent var,  y = dependent = observations
def residuals(p, y, x):
    err = y - yModel(x, p)
    return err

# Compute least squares result
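def lsr(lat, lon, year, daySkip):
    latStep = 13.
    # Candidate sets of cardinal latitudes. Each assignment overrides the
    # one before it, so only the while loop below takes effect; swap in
    # another lata = ... line to use more (or different) time curves.
    lata = numpy.arange(0., 66.4, latStep)
    lata = [ 88 * (1 - 1.2**-i) for i in range(8)]
    l, lata, lstep, ldown = 0, [], 20, 3
    l, lata, lstep, ldown = 0, [], 24, 4
    while l < 65:
        lata.append(l); l += lstep; lstep -= ldown
    # With lstep=24 and ldown=4 this yields lata = [0, 24, 44, 60]
    #print('lata =', lata)
    # Cardinal curves at the chosen latitudes, same longitude as the target
    setData(lata, lon, year, daySkip)
    # Target curve to approximate
    x, ya = makeRarSet(lat, lon, year, daySkip)
    # Extra curve 10 degrees east, appended for plotting; the fit ignores
    # it because pini has only len(lata) parameters
    x, za = makeRarSet(lat, 10+lon, year, daySkip)
    data.ss.append(za[0])
    y = ya[0]
    # Initial guess: weight 0.5 on cardinal latitudes within latStep of lat
    pini = [(0 if abs(lat-l)>latStep else 0.5) for l in lata]
    pars = scipy.optimize.leastsq(residuals, pini, args=(y, x))
    data.x, data.y, data.pv = x, y, yModel(x, pars[0])
    data.par, data.err = pars, residuals(pars[0], y, x)
    #print('pars[0] = ', pars[0])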
    variance = numpy.inner(data.err, data.err)/len(y)
    #print 'variance:', variance
    sec = 1.0/(24*60*60)
    emin, emax = min(data.err), max(data.err)
    print ('Lat. {:4.1f}:  Error range: {:.6f} to {:.6f} seconds'.format(lat, emin/sec, emax/sec))

def plotData(iopt, emul):
    import matplotlib.pyplot as plt
    plt.clf()
    x = data.x
    if iopt == 0:
        iopt = 15
        emul = 1
    if iopt & 1:
        plt.plot(x, data.y)
        plt.plot(x, data.y + 0.001)
        plt.plot(x, data.y - 0.001)
    if iopt & 2:
        plt.plot(x, data.pv)
    if iopt & 4:
        plt.plot(x, emul*data.err)
    if iopt & 8:
        for ya in data.ss:
            plt.plot(x, ya)
    plt.show()
