I have a csv file with the columns date, time, price, mag, signal.
It has 62035 rows; there are 42 times of day associated with each unique date in the file.
For each date, when there is an ‘S’ in the signal column, I want to append the corresponding price at the time the ‘S’ occurred. Below is my attempt.
from pandas import *
from numpy import *
from io import *
from os import *
from sys import *
DF1 = read_csv('___.csv')
idf=DF1.set_index(['date','time','price'],inplace=True)
sStore=[]
for i in idf.index[i][0]:
    sStore.append([idf.index[j][2] for j in idf[j][1] if idf['signal']=='S'])
sStore.head()
Traceback (most recent call last)
<ipython-input-7-8769220929e4> in <module>()
1 sStore=[]
2
----> 3 for time in idf.index[i][0]:
4
5 sStore.append([idf.index[j][2] for j in idf[j][1] if idf['signal']=='S'])
NameError: name 'i' is not defined
I do not understand why the i index is not permitted here. Thanks.
I also think it’s strange that idf.index.levels[0] shows the dates “not parsed”, exactly as they are in the file, but out of order, even though I passed parse_date=True as an argument to set_index.
I bring this up because I was thinking of sidestepping the problem with something like:
for i in idf.index.levels[0]:
    sStore.append([idf.index[j][2] for j in idf.index.levels[1] if idf['signal']=='S'])
sStore.head()
My edit 12/30/2012 based on DSM’s comment below:
I would like to use your idea to get the P&L, as I commented below. If the number of S signals differs from the number of B signals on a given date, we difference the leftover positions against the price at the closing time, 1620.
v=[df["signal"]=="S"]
t=[df["time"]=="1620"]
u=[df["signal"]!="S"]
df["price"][[v and (u and t)]]
That is, “give me the price at 1620 (even when it doesn’t give a ‘sell’ signal, S)” so that I can diff it against the “extra B’s”, for the special case where B>S. This ignores the symmetric concern (where S>B), but for now I want to understand this logical issue.
This expression raises:
ValueError: boolean index array should have 1 dimension
Note that in order to use df["time"] I do not call set_index here. Trying the union operator | gives:
TypeError: unsupported operand type(s) for |: 'list' and 'list'
Looking at Max Fellows’s approach,
@Max Fellows
The point is to close out the positions at the end of the day, so we need to capture the price at the close to “unload” all the B and S positions that accumulated but didn’t net each other out.
If I say:
filterFunc1 = lambda row: row["signal"] == "S" and ([row["signal"] != "S"][row["price"]=="1620"])
filterFunc2 =lambda row: ([row["price"]=="1620"][row["signal"] != "S"])
filterFunc=filterFunc1 and filterFunc2
filteredData = itertools.ifilter(filterFunc, reader)
This raises:
IndexError: list index out of range
Using @Max Fellows’ handy example data, we can have a look at it in pandas. [BTW, you should always try to provide a short, self-contained, correct example (see here for more details), so that the people trying to help you don’t have to spend time coming up with one.]
First, import pandas as pd. Then read the example data into a DataFrame, here called df.
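The loading step might look like the sketch below. The rows are made-up stand-ins (not @Max Fellows’ actual sample), read from an in-memory string so the snippet is self-contained; with a real file you would pass its path to read_csv instead. The column names follow the question, and note that parse_dates is an option of read_csv.

```python
import io
import pandas as pd

# Hypothetical sample rows standing in for the real CSV file.
csv_text = """date,time,price,mag,signal
2012-12-27,0930,10.0,1,B
2012-12-27,1000,10.5,2,S
2012-12-27,1620,10.2,1,
2012-12-28,0930,11.0,3,S
2012-12-28,1000,11.4,1,B
2012-12-28,1620,11.1,2,B
"""

# dtype keeps times such as "0930" as strings instead of integers;
# parse_dates converts the date column to datetimes while reading.
df = pd.read_csv(io.StringIO(csv_text), dtype={"time": str}, parse_dates=["date"])
print(df)
```

Keeping time as a string is an assumption made here so that a comparison such as df["time"] == "1620" behaves as the question expects.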
We can see which rows have a signal of S easily by comparing the column, df["signal"] == "S", and we can select using this too, with df[df["signal"] == "S"].
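A minimal sketch of the mask-and-select idea, using a small hypothetical frame (the values are invented; only the column names come from the question). The per-date grouping at the end is an addition aimed at the original "for each date" request, not something taken from the thread.

```python
import pandas as pd

# Hypothetical rows; column names follow the question.
df = pd.DataFrame({
    "date": ["2012-12-27", "2012-12-27", "2012-12-28", "2012-12-28"],
    "time": ["1000", "1620", "0930", "1620"],
    "price": [10.5, 10.2, 11.0, 11.1],
    "signal": ["S", "B", "S", "B"],
})

# Element-wise comparison yields a boolean Series, one entry per row.
is_sell = df["signal"] == "S"

# Indexing with the mask keeps only the rows where it is True.
sells = df[is_sell]
print(sells)

# Prices of the 'S' rows as a plain list, then collected per date.
sell_prices = df.loc[is_sell, "price"].tolist()
prices_by_date = sells.groupby("date")["price"].apply(list).to_dict()
```

No explicit loop over an index is needed: the comparison and the selection both operate on whole columns at once.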
The result of that selection is a DataFrame with every date, time, and price where there’s an S. And if you simply want a list, pass the selected price column to list().
Update:
v=[df["signal"]=="S"] makes v a Python list containing a Series. That’s not what you want. df["price"][[v and (u and t)]] doesn’t make much sense to me either: v and u are mutually exclusive, so if you and them together, you’ll get nothing. For these logical vector ops you can use & and | instead of and and or, applied to the Series themselves rather than to lists, e.g. (df["signal"] == "S") | (df["time"] == "1620").
[Note: this question has now become too long and meandering. I suggest spending some time working through the examples in the pandas documentation at the console to get a feel for it.]
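The element-wise operators from the update can be sketched as follows. The data is hypothetical, time is assumed to be stored as a string, and the two selections are only illustrations of & and |, not a full P&L calculation.

```python
import pandas as pd

# Hypothetical rows; three times of day for each of two dates.
df = pd.DataFrame({
    "date": ["2012-12-27"] * 3 + ["2012-12-28"] * 3,
    "time": ["0930", "1000", "1620"] * 2,
    "price": [10.0, 10.5, 10.2, 11.0, 11.4, 11.1],
    "signal": ["B", "S", "B", "S", "B", "S"],
})

# Plain boolean Series, not wrapped in lists.
is_sell = df["signal"] == "S"
at_close = df["time"] == "1620"

# & is element-wise AND, | is element-wise OR, ~ is element-wise NOT.
# If the comparisons are written inline, wrap each one in parentheses,
# because & and | bind more tightly than ==.
sell_or_close = df.loc[is_sell | at_close, "price"]
close_not_sell = df.loc[at_close & ~is_sell, "price"]

print(sell_or_close.tolist())
print(close_not_sell.tolist())
```

Wrapping a Series in a list, as in v=[...], is what triggers both the "boolean index array should have 1 dimension" error and the TypeError for | on lists; operating on the Series directly avoids both.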