I have a regex which splits my string into an array.
Everything works fine, except that I would like to keep part of the delimiter.
Here is my regex:
(&#?[a-zA-Z0-9]+;)[\s]
In JavaScript, I am doing:
var test = paragraph.split(/(&#?[a-zA-Z0-9]+;)[\s]/g);
My paragraph is as follows:
Current addresses: † Biopharmaceutical Research and Development<br />
‡ Clovis Oncology<br />
§ Pisces Molecular <br />
|| School of Biological Sciences
¶ Department of Chemistry<br />
The problem is that I am getting 10 elements in my array and not 5 as I should. In fact, the delimiter also comes back as its own element. My goal is to keep the delimiter attached to the split element, not to create a new element for it.
Thank you very much for your help.
EDIT:
I would like to get this as a result:
1. † Biopharmaceutical Research and Development<br />
2. ‡ Clovis Oncology<br />
3. § Pisces Molecular <br />
|| School of Biological Sciences
4. ¶ Department of Chemistry<br />
Try to use match instead of split, with the pattern /&#?[a-zA-Z0-9]+;\s[^&]*/g.
Updated: added a required white-space \s to the match.
Explanation:
&#? matches & followed by an optional # (the question mark matches the preceding item zero or one times).
[a-zA-Z0-9] is a character class covering all upper- and lower-case letters and the digits. If you also accept an underscore, you could replace it with \w.
+ means the preceding pattern must match one or more times, so it matches one or more of the characters a-z, A-Z and 0-9.
; matches the character ; literally.
\s matches the white-space class, which includes space, tab and other white-space characters.
[^&]* is again a character class, but because ^ is its first character the class is negated: instead of matching & it matches everything except &. The star matches the class zero or more times.
g at the end, after the last /, means global. It makes match continue after the first match and return an array of all matches.
So: match & and an optional #, followed by one or more letters or digits, followed by ;, followed by one white-space character, followed by zero or more characters that are not &.
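Putting the pieces of the explanation together, here is a minimal sketch of the match call. The sample string is an assumption: the question shows the rendered symbols, so the entity names used here (&dagger;, &Dagger;, &sect;, &para;) are only illustrative stand-ins for whatever the real paragraph contains.

```javascript
// Hypothetical sample input; the entity names are assumed for illustration.
var paragraph =
  "Current addresses: &dagger; Biopharmaceutical Research and Development<br />\n" +
  "&Dagger; Clovis Oncology<br />\n" +
  "&sect; Pisces Molecular <br />\n" +
  "|| School of Biological Sciences\n" +
  "&para; Department of Chemistry<br />";

// Each match starts at an entity followed by white-space and runs
// up to (but not including) the next "&", so the entity stays with its text.
var parts = paragraph.match(/&#?[a-zA-Z0-9]+;\s[^&]*/g);

console.log(parts.length); // 4
console.log(parts);
```

With this input, the "|| School of Biological Sciences" line stays in the same element as the § entry, because [^&]* also consumes newlines. Text before the first entity ("Current addresses: ") is not part of any match. Note that match with the g flag returns null rather than an empty array when nothing matches, so real code may want to guard against that.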