I have a column of data that contains strings, and I want to create


Asked: June 11, 2026


I have a column of data that contains strings, and I want to create a new column that takes only the first two characters from the corresponding data string.

It seems logical to use the apply function for this, but it doesn’t work as expected. It does not even seem to be consistent with other uses of apply. See below.

In [205]: dfrm_test = pandas.DataFrame({"A":np.repeat("the", 10)})

In [206]: dfrm_test
Out[206]:
     A
0  the
1  the
2  the
3  the
4  the
5  the
6  the
7  the
8  the
9  the

In [207]: dfrm_test["A"].apply(lambda x: x+" cat")
Out[207]:
0    the cat
1    the cat
2    the cat
3    the cat
4    the cat
5    the cat
6    the cat
7    the cat
8    the cat
9    the cat
Name: A

In [208]: dfrm_test["A"].apply(lambda x: x[0:2])
Out[208]:
0    the
1    the
Name: A

Based on this, it appears that apply does nothing but perform the NumPy equivalent of whatever is called inside. That is, apply seems to execute the same thing as arr + " cat" in the first example. And if NumPy happens to broadcast that, then it will work. If not, then it won’t.
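The difference described above can be reproduced with plain NumPy, without pandas. This is a minimal sketch, assuming the column's values sit in an object-dtype array (the name `arr` is illustrative). It only shows how the two expressions behave on a whole array; it does not confirm what pandas 0.7.3 does internally.

```python
import numpy as np

# Illustrative stand-in for the column's underlying values (object dtype assumed).
arr = np.array(["the"] * 10, dtype=object)

# "+" is applied to each element, so every entry gets the suffix.
appended = arr + " cat"

# Slicing acts on the array as a whole: it selects the first two elements,
# not the first two characters of each string.
sliced = arr[0:2]

print(appended[:3])
print(sliced)
```

If the lambda were handed the whole array rather than one value at a time, `x[0:2]` would return two rows, which matches the shape of Out[208].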

But this seems to break from what apply promises in the docs. The documentation for pandas.Series.apply says:

Invoke function on values of Series. Can be ufunc or Python function expecting only single values

It says explicitly that it can accept Python functions expecting only single values. And the function that’s not working (lambda x: x[0:2]) definitely satisfies that. It doesn’t say that the single argument must be an array. And given that things like numpy.sqrt are commonly used for single inputs (so not exclusively arrays), it seems natural to expect Pandas to work with any such function.

Is there some way of using apply that I am missing here?

Note: I did write my own helper function:

def ix2(arr):
    return np.asarray([x[0:2] for x in arr])

and I verified that this version does work with Pandas apply. But this is beside the point. It would be easier to write something that operated externally on top of a Series object than to have to constantly write wrappers that use list comprehensions to effectively loop over the contents of the Series. Isn’t this specifically what apply is supposed to abstract away from the user?
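For comparison, here is a sketch of a helper that operates on the Series from the outside instead of going through apply. The name `first_two_chars` and the new column `"B"` are hypothetical; the loop is an ordinary list comprehension, so it does not depend on how a particular pandas version dispatches inside apply.

```python
import pandas as pd

def first_two_chars(series):
    """Return a new Series holding the first two characters of each string."""
    # Explicit Python-level loop; the original index and name are preserved.
    return pd.Series([x[0:2] for x in series],
                     index=series.index, name=series.name)

dfrm_test = pd.DataFrame({"A": ["the"] * 10})
# "B" is a hypothetical name for the new column.
dfrm_test["B"] = first_two_chars(dfrm_test["A"])
print(dfrm_test.head(3))
```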

I am using Pandas version 0.7.3, and it is on a workplace shared network, so there’s no way to upgrade to the recent release.

Added:

I was able to confirm that this behavior changes from version 0.7.3 to version 0.8.1. In 0.8.1 it works as expected with no NumPy ufunc wrapper.

My guess is that in the code, someone was trying to use numpy.vectorize or numpy.frompyfunc within a try-except statement. Perhaps it did not work correctly with the particular lambda function I am using, and so in the except part of the code, it defaulted to just relying on generic NumPy broadcasting.

It would be great to get some confirmation on this from a Pandas developer, if possible. But in the meantime, the ufunc workaround should suffice.

1 Answer

Editorial Team · Answer 1 · 2026-06-11T06:46:29+00:00

One workaround I can think of would be converting the Python function to a numpy.ufunc with numpy.frompyfunc:

numpy.frompyfunc((lambda x: x[0:2]), 1, 1)

and then using it in apply:

In [50]: dfrm_test
Out[50]:
     A
0  the
1  the
2  the
3  the
4  the
5  the
6  the
7  the
8  the
9  the

In [51]: dfrm_test["A"].apply(np.frompyfunc((lambda x: x[0:2]), 1, 1))
Out[51]:
0    th
1    th
2    th
3    th
4    th
5    th
6    th
7    th
8    th
9    th
Name: A

In [52]: pandas.version.version
Out[52]: '0.7.3'

In [53]: dfrm_test["A"].apply(lambda x: x[0:2])
Out[53]:
0    the
1    the
Name: A
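The effect of the wrapper can also be checked on its own, outside pandas. This is a minimal sketch, assuming an object-dtype array of strings (the names `first_two` and `values` are illustrative): numpy.frompyfunc turns the single-value function into a ufunc, so NumPy calls it once per element instead of passing it the whole array.

```python
import numpy as np

# Wrap the single-value function: 1 input, 1 output.
first_two = np.frompyfunc(lambda x: x[0:2], 1, 1)

# Illustrative data; object dtype assumed.
values = np.array(["the"] * 10, dtype=object)

# The ufunc is applied element by element and returns an object array.
result = first_two(values)
print(result)
```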
