I have a program which needs to turn many large one-dimensional numpy arrays of floats into delimited strings. I am finding this operation quite slow relative to the mathematical operations in my program and am wondering if there is a way to speed it up. For example, consider the following loop, which takes a numpy array of 100,000 random numbers and joins its elements into a comma-delimited string, 100 times over:
import numpy as np
x = np.random.randn(100000)
for i in range(100):
    ",".join(map(str, x))
This loop takes about 20 seconds to complete (in total, not per cycle). In contrast, 100 cycles of something like elementwise multiplication (x*x) would take less than 1/10 of a second to complete. Clearly the string join operation creates a large performance bottleneck; in my actual application it will dominate total runtime. This makes me wonder: is there a faster way than ",".join(map(str, x))? Since map() is where almost all the processing time occurs, this comes down to the question of whether there is a faster way to convert a very large number of numbers to strings.
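One way to check where the time goes is to time the conversion and the join separately. The following is only a rough sketch using the standard timeit module; the variable names convert_time and join_time and the repeat count of 5 are my own choices, and the numbers will vary by machine and numpy version.

```python
import timeit

import numpy as np

x = np.random.randn(100000)

# Time only the float-to-string conversion (list() forces map to run on Python 3).
convert_time = timeit.timeit(lambda: list(map(str, x)), number=5)

# Time only the join, using strings that were converted once up front.
strings = list(map(str, x))
join_time = timeit.timeit(lambda: ",".join(strings), number=5)

print("convert: %.3f s, join: %.3f s" % (convert_time, join_time))
```

If the first number is much larger than the second, the conversion step rather than the join itself is the part worth optimizing.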
Very good writeup on the performance of various string concatenation techniques in Python: http://www.skymind.com/~ocrow/python_string/
I’m a little surprised that some of the latter approaches perform as well as they do, but it looks like you can certainly find something there that will work better for you than what you’re doing now.
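Separately from the linked writeup, one thing that may be worth trying is converting the array to plain Python floats with tolist() before stringifying, so that the per-element work is done on built-in floats rather than numpy scalars. This is a sketch under that assumption, not a measured result; join_floats is a hypothetical helper name, and it uses repr, so the exact text of each number may differ from what str gives on a numpy scalar. Time it on your own data before relying on it.

```python
import numpy as np


def join_floats(arr, sep=","):
    """Join a 1-D float array into a delimited string (hypothetical helper)."""
    # tolist() converts the whole array to built-in Python floats in one call;
    # repr() on a Python float gives a string that parses back to the same value.
    return sep.join(map(repr, arr.tolist()))


x = np.random.randn(100000)
s = join_floats(x)
```

Parsing the result back with float() on each comma-separated field should reproduce the original values, which is a quick way to confirm that nothing was lost in the conversion.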
Fastest method mentioned on the site