The Archive Base

Editorial Team
Asked: June 11, 2026 (2026-06-11T17:51:35+00:00)

I have a large .txt with data in bad formats. I would like to

I have a large .txt file with badly formatted data. I would like to remove some rows and convert the rest of the data to floats. Rows containing 'X' or 'XX' should be removed; the remaining values should be converted to float, so that a number like 4;00.1 becomes 4.001. The file looks like this sample:

0,1,10/09/2012,3:01,4;09.1,5,6,7,8,9,10,11
1,-0.581586,11/09/2012,-1:93,0;20.3,739705,,0.892921,5,,6,7
2,XX,10/09/2012,3:04,4;76.0,0.183095,-0.057214,-0.504856,NaN,0.183095,12
3,-0.256051,10/09/2012,9:65,1;54.9,483293,0.504967,0.074442,-1.716287,7,0.504967,0.504967
4,-0.728092,11/09/2012,0:78,1;53.4,232247,4.556,0.328062,1.382914,NaN,4.556,4
5,4,11/09/2012,NaN,NaN,6.0008,NaN,NaN,NaN,6.000800,6.000000,6.000800
6,X,11/09/2012,X,X,5,X,8,2,1,17.000000,33.000000
7,,11/09/2012,,,,,,6.000000,5.000000,2.000000,2.000000
8,4,11/09/2012,7:98,3;04.5,5,6,3,7.000000,3.000000,3.000000,2
9,6,11/09/2012,2:21,4;67.2,5,2,2,7,3,8.000000,4.000000

I read it into a DataFrame and select rows:

import pandas as pd

fileName = '~/data.txt'
colName = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
df = pd.read_csv(fileName, names=colName)
# Show the rows whose 'b' value is one of the unwanted markers
print(df[df['b'].isin(['X', 'XX', None, 'NaN'])].to_string())

The last line outputs only:

>>> print(df[df['b'].isin(['X', 'XX', None, 'NaN'])].to_string())
    b           c     d       e         f          g         h   i         j   k   l
a                                                                                   
2  XX  10/09/2012  3:04  4;76.0  0.183095  -0.057214 -0.504856 NaN  0.183095  12 NaN
6   X  11/09/2012     X       X  5.000000          X  8.000000   2  1.000000  17  33

This does not pick up row 7, and I would like to check the whole DataFrame, not only one column (the original file is very large).

At the moment I use the conversion below, but I need to remove the unwanted rows first so that I can apply it to the whole DataFrame.

# Drop the stray '.', then use ';' as the decimal point: '4;09.1' -> 4.091
convert1 = lambda x: x.replace('.', '')
convert2 = lambda x: float(x.replace(';', '.'))
newNumber = convert2(convert1(df['e'][0]))

After selecting the rows, I would like to remove them from df. I tried df.pop(), but it works only on columns, not on rows. I also tried naming the rows, but had no luck. For this particular .txt, I should end up with a new df built from rows [0,3,8,9], with column 'c' in a date format, 'd' in a time format, and the rest as floats. I have been trying to figure this out for quite a while and do not know where to go next. Is this possible in pandas (it probably should be), or do I need to switch to an ndarray or something else? Thanks for your advice.

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 5:51 pm (2026-06-11T17:51:37+00:00)

    The problem with your original filter is that it checks for the string 'NaN' rather than numpy.nan, which is what empty fields are parsed as by default.
    If you want to filter across all the columns so that you only get rows where no element is 'X', 'XX', or a missing value, do something like this:

    In [45]: names = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
    
    In [46]: df = pd.read_csv(StringIO(data), header=None, names=names)
    
    In [47]: mask = df.applymap(lambda x: x in ['X', 'XX', None, np.nan])
    
    In [48]: df[~mask.any(axis=1)]
    Out[48]: 
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 5 entries, 0 to 9
    Data columns:
    a    5  non-null values
    b    5  non-null values
    c    5  non-null values
    d    5  non-null values
    e    5  non-null values
    f    5  non-null values
    g    5  non-null values
    h    5  non-null values
    i    5  non-null values
    j    4  non-null values
    k    5  non-null values
    l    5  non-null values
    dtypes: float64(6), int64(1), object(5)
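
Note that the output above still lists five rows, with one missing value in column j, while the question expects the four rows [0,3,8,9]. A minimal sketch of an alternative, assuming a reasonably recent pandas: build the mask from `DataFrame.isin` and `DataFrame.isna`, which flag missing values regardless of a column's dtype, then convert the surviving rows. The names `bad`, `clean`, and `numeric_cols` are illustrative only, and the day/month/year date format is an assumption the sample cannot confirm. Column d is left as text here because sample values such as 9:65 are not valid clock times.

```python
import io

import pandas as pd

# The sample rows from the question, inlined so the sketch is self-contained.
data = """0,1,10/09/2012,3:01,4;09.1,5,6,7,8,9,10,11
1,-0.581586,11/09/2012,-1:93,0;20.3,739705,,0.892921,5,,6,7
2,XX,10/09/2012,3:04,4;76.0,0.183095,-0.057214,-0.504856,NaN,0.183095,12
3,-0.256051,10/09/2012,9:65,1;54.9,483293,0.504967,0.074442,-1.716287,7,0.504967,0.504967
4,-0.728092,11/09/2012,0:78,1;53.4,232247,4.556,0.328062,1.382914,NaN,4.556,4
5,4,11/09/2012,NaN,NaN,6.0008,NaN,NaN,NaN,6.000800,6.000000,6.000800
6,X,11/09/2012,X,X,5,X,8,2,1,17.000000,33.000000
7,,11/09/2012,,,,,,6.000000,5.000000,2.000000,2.000000
8,4,11/09/2012,7:98,3;04.5,5,6,3,7.000000,3.000000,3.000000,2
9,6,11/09/2012,2:21,4;67.2,5,2,2,7,3,8.000000,4.000000
"""
names = list("abcdefghijkl")
df = pd.read_csv(io.StringIO(data), header=None, names=names)

# True wherever a cell is 'X', 'XX', or missing
# (empty fields and the text 'NaN' both parse as missing values).
bad = df.isin(["X", "XX"]) | df.isna()

# Keep only rows with no bad cell; copy so later assignments leave df untouched.
clean = df[~bad.any(axis=1)].copy()

# '4;09.1' -> '4;091' -> '4.091' -> 4.091, mirroring the question's two lambdas.
clean["e"] = (
    clean["e"]
    .str.replace(".", "", regex=False)
    .str.replace(";", ".", regex=False)
    .astype(float)
)

# Assumption: dates are day/month/year.
clean["c"] = pd.to_datetime(clean["c"], format="%d/%m/%Y")

# Every column except the date, time, and 'e' columns holds plain numeric text.
numeric_cols = [col for col in names if col not in ("c", "d", "e")]
clean = clean.astype({col: float for col in numeric_cols})

print(clean.index.tolist())  # [0, 3, 8, 9]
```

For a very large file, the same mask works unchanged; whether to treat missing values as grounds for dropping a row is a choice the question implies but does not state outright.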
    