I am learning Python and trying to figure out an efficient way to tokenize a string of comma-separated numbers into a list. Well-formed cases work as I expect, but malformed cases do not.
If I have this:
A = '1,2,3,4'
B = [int(x) for x in A.split(',')]
B results in [1, 2, 3, 4]
which is what I expect, but if the string is something more like
A = '1,,2,3,4,'
and I use the same list comprehension for B as above, I get an exception. I think I understand why (some of the 'x' string values are not integers), but I suspect there is a way to parse this quite elegantly, so that tokenizing the string A works more like strtok(A, ",\n\t") would when called iteratively in C.
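To make the failure concrete, here is a minimal sketch of where the exception comes from. The name `tokens` is only illustrative; the point is that split(',') produces empty strings for the doubled and trailing commas, and int('') raises ValueError.

```python
A = '1,,2,3,4,'

# split(',') keeps an empty string wherever two delimiters are adjacent
# and after a trailing delimiter.
tokens = A.split(',')
print(tokens)

try:
    B = [int(x) for x in tokens]
except ValueError as exc:
    # int('') is not a valid integer literal, so the comprehension aborts here.
    print('ValueError:', exc)
```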
To be clear about what I am asking: I am looking for an elegant/efficient/typical way in Python to have all of the following example strings:
A='1,,2,3,\n,4,\n'
A='1,2,3,4'
A=',1,2,3,4,\t\n'
A='\n\t,1,2,3,,4\n'
return with the same list of:
B=[1,2,3,4]
via some sort of compact expression.
How about this:
B = [int(x) for x in A.split(',') if x.strip()]
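x.strip() returns a copy of the string with leading and trailing whitespace removed, which is empty if the string is all whitespace. An empty string is false in a boolean context, so the if clause of the list comprehension filters it out. Tokens such as '4\n' still convert, because int() itself ignores surrounding whitespace.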
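As a quick check, the filtering approach described above can be run against the four sample strings from the question. This is a sketch, not a definitive implementation: the helper name `parse_ints` and the list `samples` are hypothetical names introduced here for illustration.

```python
def parse_ints(text):
    """Hypothetical helper: split on commas, skip blank tokens, convert the rest."""
    # x.strip() is '' (falsy) for empty or whitespace-only tokens, so they are skipped.
    # int() tolerates leading/trailing whitespace, so '4\n' still parses as 4.
    return [int(x) for x in text.split(',') if x.strip()]


# The four example strings from the question.
samples = [
    '1,,2,3,\n,4,\n',
    '1,2,3,4',
    ',1,2,3,4,\t\n',
    '\n\t,1,2,3,,4\n',
]

for s in samples:
    print(parse_ints(s))  # prints [1, 2, 3, 4] for each sample
```

Unlike strtok with ",\n\t", this treats only the comma as a delimiter: two numbers separated by nothing but a tab or newline (for example '1\t2') would land in the same token and make int() raise ValueError.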