I’m having a hard time extracting elements between a / and a blank space

Question


Asked: June 1, 2026


I’m having a hard time extracting elements between a / and a blank space. I can do this when I have two delimiting characters, like < and > for instance, but the space is throwing me. I’d like the most efficient way to do this in base R, as this will be lapplied to thousands of vectors.

I’d like to turn this:

x <- "This/DT is/VBZ a/DT short/JJ sentence/NN consisting/VBG of/IN some/DT nouns,/JJ verbs,/NNS and/CC adjectives./VBG"

into this:

 [1] "DT"  "VBZ" "DT"  "JJ"  "NN"  "VBG" "IN"  "DT"  "JJ"  "NNS" "CC"  "VBG"

EDIT:

Thank you all for the answers. I’m going for speed, so Andres’s code wins out. Dwin’s code wins for the shortest amount of code. Dirk, yours was the second fastest. The stringr solution was the slowest (I figured it would be) and wasn’t in base, but it is pretty understandable (which really is the intent of the stringr package, I think, as this seems to be Hadley’s philosophy with most things).

I appreciate your assistance. Thanks again.

I thought I’d include the benchmarking since this will be lapplied over several thousand vectors:

    test replications elapsed relative user.self sys.self
1 ANDRES        10000    1.06 1.000000      1.05        0
3   DIRK        10000    1.29 1.216981      1.20        0
2   DWIN        10000    1.56 1.471698      1.43        0
4 FLODEL        10000    8.46 7.981132      7.70        0
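
For anyone who wants to try a similar timing without extra packages, here is a rough base R sketch. It is not the code that produced the table above. It only times the split-then-sub approach over a list, `extract_tags` is a hypothetical helper name, and the list of 1000 copies of `x` is a made-up stand-in for the real data.

```r
# Hypothetical helper (name not from the thread): split on blanks,
# then keep what follows the slash in each piece.
extract_tags <- function(s) {
  parts <- unlist(strsplit(s, " ", fixed = TRUE))
  sub("^.*/([^ ]+).*$", "\\1", parts)
}

x <- "This/DT is/VBZ a/DT short/JJ sentence/NN consisting/VBG of/IN some/DT nouns,/JJ verbs,/NNS and/CC adjectives./VBG"

# Assumption: 1000 copies of x stand in for "several thousand vectors".
sentences <- rep(list(x), 1000)

# system.time() is base R; the assignment inside it keeps the result.
timing <- system.time(tags <- lapply(sentences, extract_tags))
print(timing[["elapsed"]])
```

Elapsed times will vary by machine, so only the relative ordering between approaches is worth comparing.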

1 Answer

Editorial Team · Answer 1 · 2026-06-01T06:14:03+00:00


Similar but a bit more succinct:

#1- Separate the elements by the blank space

    y=unlist(strsplit(x,' '))

#2- extract just what you want from each element:

    sub('^.*/([^ ]+).*$','\\1',y)

Here ^ and $ anchor the beginning and end of the string, and .* matches any run of characters (including none).
[^ ]+ matches one or more non-blank characters.
\\1 refers to the first captured group, i.e. the text matched inside the parentheses.
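
Putting the two steps together on the sentence from the question, as a minimal runnable sketch (the variable names `tags` and `tags_short` are only for illustration):

```r
x <- "This/DT is/VBZ a/DT short/JJ sentence/NN consisting/VBG of/IN some/DT nouns,/JJ verbs,/NNS and/CC adjectives./VBG"

# Step 1: split the string at each blank space.
y <- unlist(strsplit(x, ' '))

# Step 2: keep only the captured group after the slash.
tags <- sub('^.*/([^ ]+).*$', '\\1', y)
print(tags)
#  [1] "DT"  "VBZ" "DT"  "JJ"  "NN"  "VBG" "IN"  "DT"  "JJ"  "NNS" "CC"  "VBG"

# After the split no piece contains a blank, so simply deleting
# everything up to the slash gives the same result for this input.
tags_short <- sub('.*/', '', y)
print(identical(tags, tags_short))
# [1] TRUE
```

The shorter pattern relies on every piece containing a slash; a piece without one would be returned unchanged by either call.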
