Let’s say I have two very long series – big and small
import numpy as np
import pandas as pd
index = pd.date_range(start='1952', periods=10**6, freq='s')
big = pd.Series(np.ones(len(index))*97, index)
small = pd.Series(np.ones(len(index))*2, index)
What I would like to achieve is to create a new series that combines big and small, alternating between their values, with borders determining when to switch to the other one (e.g. there is a border every 5 seconds):
borders = pd.date_range(start='1952', periods=len(index)//5, freq='5s')
Is there an efficient matrix-based operation combo that can be used to achieve this? I tried looking at various join, merge etc. operators in the docs, but couldn’t find anything offering similar logic.
I could achieve this using a for-loop, but that takes over a minute even for a series of length 10^5:
alternating = pd.Series()
for i in range(1, 100, 2):
    b0 = borders[i-1]
    b1 = borders[i]
    b2 = borders[i+1]
    sec = pd.offsets.Second(1)
    # Note: Series.append was removed in pandas 2.0; pd.concat replaces it there
    alternating = alternating.append(small[b0:b1-sec]).append(big[b1:b2-sec])
Sample output of alternating.head(24)
1952-01-16 00:00:00 2
1952-01-16 00:00:01 2
1952-01-16 00:00:02 2
1952-01-16 00:00:03 2
1952-01-16 00:00:04 2
1952-01-16 00:00:05 97
1952-01-16 00:00:06 97
1952-01-16 00:00:07 97
1952-01-16 00:00:08 97
1952-01-16 00:00:09 97
1952-01-16 00:00:10 2
1952-01-16 00:00:11 2
1952-01-16 00:00:12 2
1952-01-16 00:00:13 2
1952-01-16 00:00:14 2
1952-01-16 00:00:15 97
1952-01-16 00:00:16 97
1952-01-16 00:00:17 97
1952-01-16 00:00:18 97
1952-01-16 00:00:19 97
1952-01-16 00:00:20 2
1952-01-16 00:00:21 2
1952-01-16 00:00:22 2
1952-01-16 00:00:23 2
If your period is a fraction of a minute, you can try something like this:
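A minimal sketch of one vectorized way to do this (not necessarily the exact snippet originally posted here). It assumes both series share the same index, that the borders are evenly spaced, and that the period divides a minute into an even number of blocks (5 s gives 12), so the pattern stays consistent across minute boundaries. The names `period`, `block` and `use_small` are introduced only for illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range(start='1952', periods=10**6, freq='s')
big = pd.Series(np.ones(len(index)) * 97, index)
small = pd.Series(np.ones(len(index)) * 2, index)

period = 5  # assumed border spacing in seconds; 60 / period must be even

# Block number within each minute: 0 for seconds 0-4, 1 for seconds 5-9, ...
block = index.second // period

# Even blocks take their values from small, odd blocks from big
use_small = (block % 2 == 0)
alternating = pd.Series(np.where(use_small, small.values, big.values), index=index)

print(alternating.head(12))
```

The whole selection happens in a single `np.where` call over the underlying arrays, so no Python-level loop or repeated concatenation is involved.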
alternating then looks exactly as you asked and is calculated within about 150 ms.
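If the borders are not evenly spaced, the seconds-based trick no longer applies. A possible alternative, sketched here under the assumption that the borders are sorted and the first border is at or before the first timestamp, is to count how many borders each timestamp has passed with `np.searchsorted` and switch on the parity of that count. The small index and the uneven `borders` below are made up purely for illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range(start='1952', periods=20, freq='s')
big = pd.Series(np.full(len(index), 97.0), index)
small = pd.Series(np.full(len(index), 2.0), index)

# Hypothetical, unevenly spaced borders (sorted, first one at the start of the index)
borders = pd.DatetimeIndex(['1952-01-01 00:00:00',
                            '1952-01-01 00:00:03',
                            '1952-01-01 00:00:10'])

# For each timestamp, the number of borders at or before it
n_passed = np.searchsorted(borders.values, index.values, side='right')

# Odd count means interval 0, 2, ... (small); even count means interval 1, 3, ... (big)
alternating = pd.Series(np.where(n_passed % 2 == 1, small.values, big.values),
                        index=index)

print(alternating)
```

Here seconds 0-2 come from small, seconds 3-9 from big, and seconds 10-19 from small again.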