Consider the following array which holds all US stock tickers, ordered by length:
$tickers = array('AAPL', 'AA', 'BRK.A', 'BRK.B', 'BAE', 'BA'); // etc...
I want to check a string for all possible matches. Tickers are written with or without a “$” concatenated to the front:
$string = 'Check out $AAPL and BRK.A, BA and BAE.B - all going up!'; // single quotes, so PHP does not interpolate $AAPL
All tickers are to be labeled like: {TICKER:XX}. The expected output would be:
Check out {TICKER:AAPL} and {TICKER:BRK.A}, {TICKER:BA} and BAE.B - all going up!
So candidates should be checked against the $tickers array and matched when they are followed by either a space or a comma. Until now, I have been using the following:
preg_replace('/\$([a-zA-Z.]+)/', ' {TICKER:$1} ', $string);
so I didn’t have to check against the $tickers array. It was assumed that all tickers started with “$”, but this only appears to be the convention in about 80% of the cases. Hence, the need for an updated filter.
My question is: is there a simple way to adjust the regex to meet the new requirement, or do I need to write a new function, as I was originally planning:
function match_tickers($string, array $tickers) {
    foreach ($tickers as $ticker) {
        // preg_replace with $
        // preg_replace without $
    }
    return $string;
}
Or can this be done in one go?
Just make the leading dollar sign optional, using ? (zero or one match). Then you can check for legal trailing characters using the same technique. A better way to go about it would be to explode() your input string, check/replace each substring against the ticker collection, and then reconstruct the input string.
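To make both suggestions concrete, here is a rough sketch rather than a definitive implementation. The function names label_tickers() and label_tickers_by_token() are made up for illustration, and the code assumes PHP 7.4+ (arrow functions), case-sensitive tickers, and that only whitespace, a comma, or the end of the string counts as a legal trailing character. The first function builds one pattern from the array with an optional leading $; the second splits on spaces and looks each word up in the ticker list.

```php
<?php
// Hypothetical helper: one regex built from the ticker list.
function label_tickers(string $text, array $tickers): string
{
    // Try longer tickers first so 'BRK.A' is attempted before shorter ones.
    usort($tickers, fn($a, $b) => strlen($b) <=> strlen($a));

    // Escape regex metacharacters, e.g. the dot in 'BRK.A'.
    $alternatives = implode('|', array_map(
        fn($t) => preg_quote($t, '/'),
        $tickers
    ));

    // (?<![\w$.])  not preceded by a word character, '$' or '.'
    // \$?          optional leading dollar sign
    // (?=[\s,]|$)  followed by whitespace, a comma, or end of string
    $pattern = '/(?<![\w$.])\$?(' . $alternatives . ')(?=[\s,]|$)/';

    return preg_replace($pattern, '{TICKER:$1}', $text);
}

// Hypothetical helper: split on spaces, check each word, then rebuild.
function label_tickers_by_token(string $text, array $tickers): string
{
    $known = array_flip($tickers); // ticker => index, for fast isset() lookups
    $words = explode(' ', $text);

    foreach ($words as $i => $word) {
        $core = rtrim($word, ',');               // word without trailing commas
        $trailing = substr($word, strlen($core)); // the commas that were removed
        $candidate = ltrim($core, '$');           // drop any leading dollar sign

        if (isset($known[$candidate])) {
            $words[$i] = '{TICKER:' . $candidate . '}' . $trailing;
        }
    }

    return implode(' ', $words);
}

$tickers = ['AAPL', 'AA', 'BRK.A', 'BRK.B', 'BAE', 'BA'];
$string  = 'Check out $AAPL and BRK.A, BA and BAE.B - all going up!';

echo label_tickers($string, $tickers), "\n";
// Check out {TICKER:AAPL} and {TICKER:BRK.A}, {TICKER:BA} and BAE.B - all going up!
```

Both versions leave BAE.B alone because it is not in the list as a whole and BAE is followed by a dot. With a very large ticker list the alternation in the first version gets long, so the lookup-based second version may scale better; note that it only splits on single spaces, so tabs or newlines would need extra handling.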