A coworker left some data files I want to analyze with Numpy.
Each file is a MATLAB file, say data.m, and has the following format (but with many more columns and rows):
values = [-24.92 -23.66 -22.55 ;
-24.77 -23.56 -22.45 ;
-24.54 -23.64 -22.56 ;
];
which is the typical explicit matrix creation syntax used by matlab.
My question is: what would be the most practical way to create a numpy array from these files?
I could come up with a “brute force” or “quick and dirty” solution, but if there is a more straightforward one, such as a standard function from numpy or from another module, I would much rather use it.
EDIT: I noticed that my files may contain NaN values, so I most probably will adapt the answers given to use numpy.genfromtxt instead of numpy.loadtxt. I plan to include my final code as soon as I have it.
Thanks for any help!
EDIT: I ended up with the following code, where I get everything between [] using regex, and create a numpy array using genfromtxt in order to handle NaN. A shorter solution could be to use fromstring method, which does not need StringIO, but this cannot handle NaN, and my data have NaN :oP
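#!/usr/bin/env python
# coding: utf-8
import re
from io import StringIO  # Python 3; the original used the Python 2 StringIO module
import numpy

# Grab everything between the first [ and the last ] (DOTALL lets . span newlines)
with open('data.m') as f:
    s = re.search(r'\[(.*)\]', f.read(), re.DOTALL).group(1)
# Wrap the text in a file-like object for genfromtxt, which handles NaN entries
buf = StringIO(s)
a = numpy.genfromtxt(buf, missing_values='NaN', filling_values=numpy.nan)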
Here are a couple of options, although neither is built in.
The solution you probably do not find acceptable
This solution probably falls into your “quick and dirty” category, but it leads into the next solution.
Remove the values = [ prefix and the last line (];), then globally replace every ; with nothing to get:
-24.92 -23.66 -22.55
-24.77 -23.56 -22.45
-24.54 -23.64 -22.56
Then you can load the file directly with numpy’s loadtxt.
A solution you might find acceptable
In this solution, we create a method to coerce the input data into a form that numpy’s loadtxt likes (the same form as above, actually). Once we have that, its output can be passed to loadtxt.
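The answer's original code did not survive, so what follows is only a minimal sketch of the second option, not the answerer's actual implementation. The helper name matlab_matrix_to_text and the inline sample string are assumptions made for illustration; in practice the text would come from reading data.m.

```python
import io

import numpy as np


def matlab_matrix_to_text(source):
    """Hypothetical helper: return the text between the first '[' and the
    last ']' with the ';' row terminators replaced by spaces."""
    start = source.index('[') + 1
    end = source.rindex(']')
    return source[start:end].replace(';', ' ')


# Stand-in for the contents of data.m (same layout as in the question).
sample = """values = [-24.92 -23.66 -22.55 ;
-24.77 -23.56 -22.45 ;
-24.54 -23.64 -22.56 ;
];
"""

cleaned = matlab_matrix_to_text(sample)

# loadtxt splits each line on whitespace; io.StringIO makes the
# cleaned string look like an open file.
a = np.loadtxt(io.StringIO(cleaned))
print(a.shape)
```

If the files contain NaN entries, as mentioned in the question's edit, the same cleaned text can probably be handed to numpy.genfromtxt in place of loadtxt.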