I want to tokenize text in my database with RegEx and store the resulting tokens in a table. First I want to split the words by spaces, and then each token by punctuation.
I’m doing this in my application, but executing it in the database might speed it up.
Is it possible to do this?
There are a number of functions for tasks like this.
To retrieve the 2nd word of a text:
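A minimal sketch, assuming the database is PostgreSQL (the question does not name one). There, `split_part(string, delimiter, n)` returns the n-th field of a delimited string; the sample text is only for illustration:

```sql
-- Assumption: PostgreSQL. Returns the 2nd space-delimited field.
SELECT split_part('split this up', ' ', 2);
-- Result: this
```

Note that `split_part` splits on the literal delimiter, so two consecutive spaces would yield an empty field.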
Split the whole text and return one word per row:
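Again assuming PostgreSQL, a sketch with the set-returning function `regexp_split_to_table(string, pattern)`, which splits on a regular expression and emits one row per piece. The `E'...'` escape-string syntax is used so that the backslash is interpreted the same way regardless of the `standard_conforming_strings` setting:

```sql
-- Assumption: PostgreSQL. Split on one or more whitespace characters,
-- returning one word per row.
SELECT regexp_split_to_table('split this up', E'\\s+') AS word;
-- Result (3 rows): split, this, up
```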
(Actually, the last example splits on any stretch of whitespace, not just single spaces.)
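To store the tokens in a table, as the question asks, the same function can feed an `INSERT`. This is only a sketch under several assumptions: PostgreSQL 9.3 or later (for `LATERAL`), and two hypothetical tables, `docs(id, body)` holding the source text and `tokens(doc_id, token)` receiving the result. It also collapses the two steps into one by splitting on whitespace and punctuation together, which gives the same tokens only if the punctuation itself can be discarded:

```sql
-- Hypothetical schema, for illustration only:
--   docs(id integer, body text), tokens(doc_id integer, token text)
INSERT INTO tokens (doc_id, token)
SELECT d.id, t.token
FROM   docs d
CROSS  JOIN LATERAL
       -- split on any run of whitespace and/or punctuation characters
       regexp_split_to_table(d.body, E'[\\s[:punct:]]+') AS t(token)
WHERE  t.token <> '';  -- drop empty pieces from leading/trailing separators
```

If punctuation marks need to be kept as tokens of their own, a single split like this is not enough, and a second pass over each whitespace-delimited word would be needed.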