Maybe I’m just misunderstanding JavaScript’s regular expression functionality, but here goes. I have an array of expressions I want to remove, and I go about it this way:
var removeThese = ['inc\\.', 'inc', 'ltd\\.', 'ltd', '\\(c\\)'];
for (var i = 0; i < removeThese.length; i++) {
    var find = removeThese[i];
    // Wrap each term in word boundaries; g = all matches, i = case-insensitive
    var regex = new RegExp('\\b' + find + '\\b', 'gi');
    titletext = titletext.replace(regex, '');
}
So with the above I expect any standalone (whole-word) occurrence of inc., inc, ltd., ltd or (c) to be matched. Calling console.log(regex) on each pass prints:
/\binc\.\b/gi
/\binc\b/gi
/\bltd\.\b/gi
/\bltd\b/gi
/\b\(c\)\b/gi
Looks pretty good, right? But it completely misses any occurrences of (c), and when it replaces inc. it leaves the ‘.’, so
This is a title (c) inc.
Becomes
This is a title (c) .
What am I missing here?
Note: I would use a regular expression like ‘(inc\.)|(inc)|(ltd\.)…’, but some items in that array need special conversion (for example, 169 is converted to the © symbol before being searched for).
( and ) are not considered word characters, so there is no word boundary between whitespace and a (. That means that your \b won’t match there. You could change it to something like:
This removes the word if it is either at the start of the string or preceded by some spaces, and either at the end of the string or followed by some spaces. It also removes the spaces before the word, so
word (c) word2 will become word_word2 instead of word__word2 (spaces marked by underscores for clarity).
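A minimal sketch of a pattern that behaves as described, not necessarily the exact expression the answer had in mind. It assumes the leading whitespace is captured and consumed with (^|\s+), while the trailing whitespace is only checked with a lookahead, (?=\s|$), so it stays in the string. The function name stripTerms is hypothetical; the term list is the one from the question.

```javascript
// Terms from the question, already escaped for use inside a RegExp.
var removeThese = ['inc\\.', 'inc', 'ltd\\.', 'ltd', '\\(c\\)'];

// Hypothetical helper: removes each term when it stands alone between
// whitespace and/or the ends of the string.
function stripTerms(titletext) {
    for (var i = 0; i < removeThese.length; i++) {
        // (^|\s+)   start of string, or the whitespace before the term (consumed)
        // (?=\s|$)  whitespace or end of string must follow (checked, not consumed)
        var regex = new RegExp('(^|\\s+)' + removeThese[i] + '(?=\\s|$)', 'gi');
        titletext = titletext.replace(regex, '');
    }
    return titletext;
}

console.log(stripTerms('word (c) word2'));           // "word word2"
console.log(stripTerms('This is a title (c) inc.')); // "This is a title"

// For comparison, the \b versions fail on these inputs because neither
// "(" nor a trailing "." sits next to a word boundary here.
console.log(/\b\(c\)\b/i.test('word (c) word2'));    // false
console.log(/\binc\.\b/i.test('a title inc.'));      // false
```

Because the whitespace after the term is kept, a term at the very start of the string leaves a leading space behind; trimming the result afterwards is one way to handle that if it matters.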