Column y below should be ['Reg', 'Reg', 'Swp', 'Swp'] In [1]: pd.read_csv('/tmp/test3.csv') Out[1]: x,y

Asked: June 17, 2026

Column y below should be ['Reg', 'Reg', 'Swp', 'Swp']:

In [1]: pd.read_csv('/tmp/test3.csv')
Out[1]:
     x    y
0
1  NaN  NaN
2    I  Swp
3    I  Swp

In [2]: ! cat /tmp/test3.csv
x,y
 ^@^@^@,Reg
 ^@^@^@,Reg
I,Swp
I,Swp

In [3]: f = open('/tmp/test3.csv', 'rb'); print(repr(f.read()))  
'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'


1 Answer

Editorial Team · Answer 1 · 2026-06-17T16:26:10+00:00

Yes, I can reproduce the problem, but I don't know how to fix it with pd.read_csv. Here is a workaround using np.genfromtxt:

In [46]: import numpy as np
In [47]: arr = np.genfromtxt('test3.csv', delimiter = ',', 
                             dtype = None, names = True)

In [48]: df = pd.DataFrame(arr)

In [49]: df
Out[49]: 
   x    y
0     Reg
1     Reg
2  I  Swp
3  I  Swp
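
A different route, which is not part of the workaround above and is offered only as a sketch: the stray characters are NUL bytes (\x00), so they can be removed from the raw file contents before any parser sees them. The helper name strip_nul_bytes is hypothetical, the sample bytes are copied from the repr output in the question, and ASCII encoding is an assumption.

```python
import csv
import io


def strip_nul_bytes(raw: bytes, encoding: str = "ascii") -> str:
    """Return the decoded text of `raw` with every NUL byte removed.

    Hypothetical helper for illustration; assumes the NUL bytes are
    debris and carry no meaning.
    """
    return raw.replace(b"\x00", b"").decode(encoding)


# Sample bytes copied from the repr() output shown in the question.
raw = b"x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n"

text = strip_nul_bytes(raw)

# Parse the cleaned text with the standard-library csv module.
rows = list(csv.DictReader(io.StringIO(text)))
y_values = [row["y"] for row in rows]
print(y_values)
```

Because pd.read_csv accepts file-like objects, io.StringIO(text) could be passed to it in place of the file path.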

Note that in the genfromtxt call, names = True causes the first valid line of the csv to be interpreted as column names, so it does not affect the dtype of the values on the subsequent lines. Thus, if the csv file contains numerical data such as

In [22]: with open('/tmp/test.csv','r') as f:
   ....:     print(repr(f.read()))
   ....:     
'x,y,z\n \x00\x00\x00,Reg,1\n \x00\x00\x00,Reg,2\nI,Swp,3\nI,Swp,4\n'

then genfromtxt will assign a numerical dtype to the third column (<i4 in this case):

In [19]: arr = np.genfromtxt('/tmp/test.csv', delimiter = ',', dtype = None, names = True)

In [20]: arr
Out[20]: 
array([('', 'Reg', 1), ('', 'Reg', 2), ('I', 'Swp', 3), ('I', 'Swp', 4)], 
      dtype=[('x', '|S3'), ('y', '|S3'), ('z', '<i4')])

However, if the numerical data is intermingled with bytes such as '\x00' then genfromtxt will be unable to recognize this column as numerical and will therefore resort to assigning a string dtype. Nevertheless, you can force the dtype of the columns by manually assigning the dtype parameter. For example,

In [11]: arr = np.genfromtxt('/tmp/test.csv', delimiter = ',', dtype = [('x', '|i4'), ('y', '|S3')], names = True)

sets the first column x to have dtype |i4 (4-byte integers) and the second column y to have dtype |S3 (3-byte string). See the NumPy documentation on data type objects (dtypes) for more information on the available dtypes.
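
As a small illustration of forcing dtypes, here is a sketch that assumes NumPy is installed and uses made-up, NUL-free sample data rather than the file from the question. The field names x and y simply mirror the column names used above.

```python
import io

import numpy as np

# Made-up sample data for illustration; not the file from the question.
sample = io.StringIO("x,y\n1,Reg\n2,Swp\n")

arr = np.genfromtxt(
    sample,
    delimiter=",",
    names=True,                        # first line supplies the field names
    dtype=[("x", "i4"), ("y", "S3")],  # force a 4-byte int and a 3-byte string
)

print(arr.dtype)
print(arr["x"])
```

With an explicit dtype list, genfromtxt does not try to infer the column types from the data, which is the point of this approach when inference would otherwise fall back to strings.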
