When using split(), what regular expression would let me keep all word characters and also preserve contractions like don’t and won’t? That is, it should keep any apostrophe that has word characters on both sides, but remove leading or trailing apostrophes, as in ’tis or dogs’.
I have:
String [] words = line.split("[^\\w'+]+[\\w+('*?)\\w+]");
but it keeps the leading and trailing punctuation.
An input of 'Tis the season, for the children's happiness'.
should produce the output: Tis the season for the children's happiness
Any advice?
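I would split on either an apostrophe or hyphen followed by non-word characters:
['-]\\W+
or on any other non-word character:
[^\\w'-]\\W*
Here I added - in addition to the apostrophe, so hyphens inside words are kept as well.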
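As a minimal sketch of how those two alternatives behave, assuming they are joined with | into a single split() pattern (the class and method names are only illustrative), note that the leading apostrophe of 'Tis survives because it is followed by a word character:

```java
import java.util.Arrays;

class SplitWords {
    // Assumed combination of the two alternatives described above:
    //   ['-]\W+      an apostrophe or hyphen followed by non-word characters
    //   [^\w'-]\W*   any other non-word character, plus any non-word characters after it
    static final String SEPARATOR = "['-]\\W+|[^\\w'-]\\W*";

    static String[] splitWords(String line) {
        return line.split(SEPARATOR);
    }

    public static void main(String[] args) {
        String line = "'Tis the season, for the children's happiness'.";
        System.out.println(Arrays.toString(splitWords(line)));
        // prints ['Tis, the, season, for, the, children's, happiness]
    }
}
```

The trailing '. is consumed by the first alternative, and split() drops the empty string that would otherwise follow it.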
Adding the beginning and end of the line as further split points also removes a leading or trailing apostrophe, but the match at the beginning yields an empty first string.
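One way to write that variant, assuming the begin and end cases are expressed as ^['-]+ and ['-]+$ (an assumption about the pattern, as are the class and method names), is to filter out the empty first element after splitting:

```java
import java.util.Arrays;

class SplitWordsAnchored {
    // Hypothetical pattern: the same two alternatives as before, plus
    //   ^['-]+   apostrophes or hyphens at the very beginning
    //   ['-]+$   apostrophes or hyphens at the very end
    static final String SEPARATOR = "^['-]+|['-]+$|['-]\\W+|[^\\w'-]\\W*";

    static String[] splitWords(String line) {
        // A non-empty match at index 0 makes split() return "" as the first
        // element, so empty strings are filtered out afterwards.
        return Arrays.stream(line.split(SEPARATOR))
                .filter(word -> !word.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String line = "'Tis the season, for the children's happiness'.";
        System.out.println(Arrays.toString(splitWords(line)));
        // prints [Tis, the, season, for, the, children's, happiness]
    }
}
```

Only the straight apostrophe ' is handled here; the typographic ’ would need to be added to the character classes.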