The Archive Base Latest Questions

Editorial Team
Asked: June 16, 2026 (2026-06-16T02:06:29+00:00)

So Python, with the pandas module seems like a great option to matlab and


Python with the pandas module seems like a great alternative to MATLAB and R, which is why I have recently switched to it. I have looked through the available resources and searched the forum, but could not find anything similar to my problem. If you have links to tutorials or other useful material, please post them.

Wes McKinney has a great and elaborate tutorial on pandas.
http://www.youtube.com/watch?v=w26x-z-BdWQ&list=FLJ5xKwlfj7wg8S_A5SgR6Wg&feature=mh_lolz

At 1:10 he shows an example of how to index the rows in a dataframe by dates rather than integers.
I would like to do something similar.

The difference is that I have 3 variables, Y1, Y2, Y3, each with a column of timestamps, X1, X2, X3.

TestFile.txt:  
X1  Y1  X2  Y2  X3  Y3
27/11/2012  11.436  29/11/2012  20.631  4/12/2012   10.209  
28/11/2012  11.468  30/11/2012  20.185  5/12/2012   9.973  
29/11/2012  11.414  3/12/2012   19.962  6/12/2012   9.736  
30/11/2012  11.355  4/12/2012   19.562  7/12/2012   9.509  
3/12/2012   11.309  5/12/2012   18.908  10/12/2012  9.259  
4/12/2012   11.118  6/12/2012   18.288  11/12/2012  8.109  
5/12/2012   10.873  7/12/2012   17.973  
6/12/2012   10.582  10/12/2012  17.788  
7/12/2012   10.264  11/12/2012  17.554  
10/12/2012  9.886  
11/12/2012  9.164  

I want to do four things:

  1. Associate the data in Yi with its date in Xi, for i = 1, 2, 3

  2. Index rows by date

  3. Remove all data older than 4/12/2012, which is the first date of Y3

  4. Be able to access all data by date and column only

Here is a test script that shows how the data is read and how it prints.
You can see that X1 is parsed to the pandas date format, but X2 and X3 are not. I tried to get all three parsed by specifying
index_col=[0,2,4]
and
parse_dates = True

TestFile.py:
import pandas as pd

df = pd.read_csv('TestFile.txt',sep='\t', index_col=[0,2,4], parse_dates = True)

print 'pandas version: ', pd.__version__
print df

Gives output:

pandas version:  0.10.0b1
X1         X2         X3              Y1      Y2      Y3                   
2012-11-27 29/11/2012 4/12/2012   11.436  20.631  10.209
2012-11-28 30/11/2012 5/12/2012   11.468  20.185   9.973
2012-11-29 3/12/2012  6/12/2012   11.414  19.962   9.736
2012-11-30 4/12/2012  7/12/2012   11.355  19.562   9.509
2012-03-12 5/12/2012  10/12/2012  11.309  18.908   9.259
2012-04-12 6/12/2012  11/12/2012  11.118  18.288   8.109
2012-05-12 7/12/2012  None        10.873  17.973     NaN
2012-06-12 10/12/2012 None        10.582  17.788     NaN
2012-07-12 11/12/2012 None        10.264  17.554     NaN
2012-10-12 None       None         9.886     NaN     NaN
2012-11-12 None       None         9.164     NaN     NaN

Wanted output:

                Y1      Y2       Y3                 
2012-04-12  11.118  19.562   10.209
2012-05-12  10.873  18.908    9.973
2012-06-12  10.582  18.288    9.736
2012-07-12  10.264  17.973    9.509
2012-10-12   9.886  17.788    9.259
2012-11-12   9.164  17.554    8.109

If you have any idea of how to do this, your help is much appreciated:)

1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 2:06 am

    I think your confusion comes from a misunderstanding of the index_col argument. When you pass a list of columns to index_col, pandas creates a multi-index, that is, a dataframe indexed by more than one column, like a multi-dimensional table. It does NOT create a single index by concatenating the columns.
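
The multi-index behaviour described above can be checked with a small sketch. It assumes a current pandas release on Python 3 (not the 0.10.0b1 beta from the question) and inlines two rows of TestFile.txt so that it runs on its own.

```python
import io
import pandas as pd

# Two rows of TestFile.txt, inlined so the sketch is self-contained.
sample = (
    "X1\tY1\tX2\tY2\n"
    "27/11/2012\t11.436\t29/11/2012\t20.631\n"
    "28/11/2012\t11.468\t30/11/2012\t20.185\n"
)

# Passing two column positions asks for a two-level index.
df = pd.read_csv(io.StringIO(sample), sep="\t", index_col=[0, 2])

print(type(df.index).__name__)  # MultiIndex
print(df.index.nlevels)         # 2
print(list(df.columns))         # ['Y1', 'Y2']
```

Each row is keyed by an (X1, X2) pair, so the dates are never merged into one shared date axis.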

    One strategy that would work is to create three dataframes with the appropriate pairs of columns from your input file, and then concatenate them.

    X1 Y1 X2 Y2 X3 Y3 -> Dataframe of (X1, Y1) + Dataframe of (X2, Y2) + Dataframe of (X3, Y3)
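
A minimal sketch of this split-and-concatenate strategy follows, assuming a current pandas release on Python 3. The data from the question is inlined, the variable names are illustrative, and the dates are assumed to be day-first (dd/mm/yyyy), which is why an explicit format string is used.

```python
import io
import pandas as pd

# Contents of TestFile.txt from the question, inlined (tab-separated).
sample = (
    "X1\tY1\tX2\tY2\tX3\tY3\n"
    "27/11/2012\t11.436\t29/11/2012\t20.631\t4/12/2012\t10.209\n"
    "28/11/2012\t11.468\t30/11/2012\t20.185\t5/12/2012\t9.973\n"
    "29/11/2012\t11.414\t3/12/2012\t19.962\t6/12/2012\t9.736\n"
    "30/11/2012\t11.355\t4/12/2012\t19.562\t7/12/2012\t9.509\n"
    "3/12/2012\t11.309\t5/12/2012\t18.908\t10/12/2012\t9.259\n"
    "4/12/2012\t11.118\t6/12/2012\t18.288\t11/12/2012\t8.109\n"
    "5/12/2012\t10.873\t7/12/2012\t17.973\n"
    "6/12/2012\t10.582\t10/12/2012\t17.788\n"
    "7/12/2012\t10.264\t11/12/2012\t17.554\n"
    "10/12/2012\t9.886\n"
    "11/12/2012\t9.164\n"
)

raw = pd.read_csv(io.StringIO(sample), sep="\t")

pairs = []
for x_col, y_col in [("X1", "Y1"), ("X2", "Y2"), ("X3", "Y3")]:
    # Keep only the rows where this (date, value) pair is present.
    pair = raw[[x_col, y_col]].dropna()
    # Parse day-first dates explicitly so 3/12/2012 becomes 3 December.
    dates = pd.DatetimeIndex(pd.to_datetime(pair[x_col], format="%d/%m/%Y"))
    pairs.append(pd.Series(pair[y_col].to_numpy(), index=dates, name=y_col))

# Align the three series on the union of their dates.
combined = pd.concat(pairs, axis=1).sort_index()
combined.index.name = None
print(combined)
```

Dates that exist for only some of the series show up as NaN in the other columns, which makes the later trimming step straightforward.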

    If you are using the latest development version of pandas, or are willing to, this is simplified by the new usecols argument of read_csv(), which lets you read only the columns you name. Alternatively, you can read in all the data, extract the three dataframes you need, and then concatenate them.
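
As a sketch of reading one pair of columns at a time, the helper below uses the usecols argument of read_csv() in a current pandas release. The helper name read_pair is hypothetical, the sample holds only the last six rows of the file, and short rows are padded with empty fields to keep the example simple.

```python
import io
import pandas as pd

# Last six rows of TestFile.txt; short rows are padded with empty fields.
sample = (
    "X1\tY1\tX2\tY2\tX3\tY3\n"
    "4/12/2012\t11.118\t6/12/2012\t18.288\t11/12/2012\t8.109\n"
    "5/12/2012\t10.873\t7/12/2012\t17.973\t\t\n"
    "6/12/2012\t10.582\t10/12/2012\t17.788\t\t\n"
    "7/12/2012\t10.264\t11/12/2012\t17.554\t\t\n"
    "10/12/2012\t9.886\t\t\t\t\n"
    "11/12/2012\t9.164\t\t\t\t\n"
)


def read_pair(text, x_col, y_col):
    """Read one (date, value) column pair as a date-indexed Series.

    Hypothetical helper for illustration; dates are assumed to be dd/mm/yyyy.
    """
    frame = pd.read_csv(io.StringIO(text), sep="\t", usecols=[x_col, y_col])
    frame = frame.dropna()  # drop rows where this pair is empty
    dates = pd.DatetimeIndex(pd.to_datetime(frame[x_col], format="%d/%m/%Y"))
    return pd.Series(frame[y_col].to_numpy(), index=dates, name=y_col)


series = [read_pair(sample, f"X{i}", f"Y{i}") for i in (1, 2, 3)]
combined = pd.concat(series, axis=1).sort_index()
print(combined)
```

Reading the file once and slicing it gives the same result. usecols mainly helps when the file is wide and only a few columns are needed.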

    Finally, you can use df.truncate with the before and after arguments to keep only the date range you need. More simply, you could use dropna() to omit dates with missing values.
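
Both trimming options can be sketched on a small date-indexed frame. The frame below is built by hand from four dates in the question's data (an illustrative subset, not the full file), assuming a current pandas release.

```python
import numpy as np
import pandas as pd

# Four dates from the question's data; Y3 only starts on 4 December 2012.
dates = pd.to_datetime(
    ["30/11/2012", "3/12/2012", "4/12/2012", "5/12/2012"], format="%d/%m/%Y"
)
combined = pd.DataFrame(
    {
        "Y1": [11.355, 11.309, 11.118, 10.873],
        "Y2": [20.185, 19.962, 19.562, 18.908],
        "Y3": [np.nan, np.nan, 10.209, 9.973],
    },
    index=dates,
)

# Option 1: cut everything before the first Y3 date (index must be sorted).
trimmed = combined.truncate(before=pd.Timestamp("2012-12-04"))

# Option 2: drop every date on which any column is missing.
complete = combined.dropna()

# Access a single value by date and column.
value = trimmed.loc[pd.Timestamp("2012-12-05"), "Y2"]
print(trimmed)
print(value)
```

On this data the two options give the same rows. They differ when a column has gaps in the middle of the range, where dropna() removes those dates as well.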

    Hope this helps. Do let us know what version of pandas you are using.
