I have a csv file with date, time., price, mag, signal. 62035 rows; there

Question

0

Asked: June 16, 20262026-06-16T14:15:35+00:00 2026-06-16T14:15:35+00:00

I have a csv file with date, time., price, mag, signal. 62035 rows; there

0

I have a csv file with date, time., price, mag, signal.
62035 rows; there are 42 times of day associated to each unique date in the file.

For each date, when there is an ‘S’ in the signal column append the corresponding price at the time the ‘S’ occurred. Below is the attempt.

from pandas import *
from numpy import *
from io import *
from os import *
from sys import *

DF1 = read_csv('___.csv')
idf=DF1.set_index(['date','time','price'],inplace=True)
sStore=[]
for i in idf.index[i][0]:
  sStore.append([idf.index[j][2] for j in idf[j][1] if idf['signal']=='S'])
sStore.head()

Traceback (most recent call last)
<ipython-input-7-8769220929e4> in <module>()
  1 sStore=[]
  2 
----> 3 for time in idf.index[i][0]:
  4 
  5     sStore.append([idf.index[j][2] for j in idf[j][1] if idf['signal']=='S'])

 NameError: name 'i' is not defined

I do not understand why the i index is not permitted here. Thanks.

I also think it’s strange that :

idf.index.levels[0] will show the dates “not parsed” as it is in the file but out of order. Despite that parse_date=True as an argument in set_index.

I bring this up since I was thinking of side swiping the problem with something like:

for i in idf.index.levels[0]:

   sStore.append([idf.index[j][2] for j in idf.index.levels[1] if idf['signal']=='S'])

 sStore.head()

My edit 12/30/2012 based on DSM’s comment below:

I would like to use your idea to get the P&L, as I commented below. Where if S!=B, for any given date, we difference using the closing time, 1620.

v=[df["signal"]=="S"]
t=[df["time"]=="1620"]
u=[df["signal"]!="S"]

df["price"][[v and (u and t)]]

That is, “give me the price at 1620; (even when it doesn’t give a “sell signal”, S) so that I can diff. with the “extra B’s”–for the special case where B>S. This ignores the symmetric concern (where S>B) but for now I want to understand this logical issue.

On traceback, this expression gives:

ValueError: boolean index array should have 1 dimension

Note that in order to invoke df[“time’] I do not set_index here. Trying the union operator | gives:

TypeError: unsupported operand type(s) for |: 'list' and 'list'

Looking at Max Fellows’s approach,

@Max Fellows

The point is to close out the positions at the end of the day; so we need to capture to price at the close to “unload” all those B, S which were accumulated; but didn’t net each other out.
If I say:

filterFunc1 = lambda row: row["signal"] == "S" and ([row["signal"] != "S"][row["price"]=="1620"])
filterFunc2 =lambda row: ([row["price"]=="1620"][row["signal"] != "S"])

filterFunc=filterFunc1 and filterFunc2

filteredData = itertools.ifilter(filterFunc, reader)

On traceback:

IndexError: list index out of range

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-16T14:15:36+00:00

Using @Max Fellows’ handy example data, we can have a look at it in pandas. [BTW, you should always try to provide a short, self-contained, correct example (see here for more details), so that the people trying to help you don’t have to spend time coming up with one.]

First, import pandas as pd. Then:

In [23]: df = pd.read_csv("sample.csv", names="date time price mag signal".split())

In [24]: df.set_index(["date", "time"], inplace=True)

which gives me

In [25]: df
Out[25]: 
                 price      mag signal
date       time                       
12/28/2012 1:30     10      foo      S
           2:15     11      bar      N
           3:00     12      baz      S
           4:45     13   fibble      N
           5:30     14  whatsit      S
           6:15     15     bobs      N
           7:00     16  widgets      S
           7:45     17  weevils      N
           8:30     18   badger      S
           9:15     19    moose      S
11/29/2012 1:30     10      foo      N
           2:15     11      bar      N
           3:00     12      baz      S
           4:45     13   fibble      N
           5:30     14  whatsit      N
           6:15     15     bobs      N
           7:00     16  widgets      S
           7:45     17  weevils      N
           8:30     18   badger      N
           9:15     19    moose      N
[etc.]

We can see which rows have a signal of S easily:

In [26]: df["signal"] == "S"
Out[26]: 
date        time
12/28/2012  1:30     True
            2:15    False
            3:00     True
            4:45    False
            5:30     True
            6:15    False
[etc..]

and we can select using this too:

In [27]: df["price"][df["signal"] == "S"]
Out[27]: 
date        time
12/28/2012  1:30    10
            3:00    12
            5:30    14
            7:00    16
            8:30    18
            9:15    19
11/29/2012  3:00    12
            7:00    16
12/29/2012  3:00    12
            7:00    16
8/9/2008    3:00    12
            7:00    16
Name: price

This is a DataFrame with every date, time, and price where there’s an S. And if you simply want a list:

In [28]: list(df["price"][df["signal"] == "S"])
Out[28]: [10.0, 12.0, 14.0, 16.0, 18.0, 19.0, 12.0, 16.0, 12.0, 16.0, 12.0, 16.0]

Update:

v=[df["signal"]=="S"] makes v a Python list containing a Series. That’s not what you want. df["price"][[v and (u and t)]] doesn’t make much sense to me either –: v and u are mutually exclusive, so if you and them together, you’ll get nothing. For these logical vector ops you can use & and | instead of and and or. Using the reference data again:

In [85]: import pandas as pd

In [86]: df = pd.read_csv("sample.csv", names="date time price mag signal".split())

In [87]: v=df["signal"]=="S"

In [88]: t=df["time"]=="4:45"

In [89]: u=df["signal"]!="S"

In [90]: df[t]
Out[90]: 
          date  time  price     mag signal
3   12/28/2012  4:45     13  fibble      N
13  11/29/2012  4:45     13  fibble      N
23  12/29/2012  4:45     13  fibble      N
33    8/9/2008  4:45     13  fibble      N

In [91]: df["price"][t]
Out[91]: 
3     13
13    13
23    13
33    13
Name: price

In [92]: df["price"][v | (u & t)]
Out[92]: 
0     10
2     12
3     13
4     14
6     16
8     18
9     19
12    12
13    13
16    16
22    12
23    13
26    16
32    12
33    13
36    16
Name: price

[Note: this question has now become too long and meandering. I suggest spending some time working through the examples in the pandas documentation at the console to get a feel for it.]

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a csv file with date, time., price, mag, signal. 62035 rows; there

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply