I know that most JavaScript email obfuscation solutions stop bots dead in their tracks, but sometimes it's hard to use or insert JavaScript in certain places.
To that end, I was wondering whether anyone knows if bots are smart enough to decode hexadecimal and decimal HTML entities into valid email addresses.
For example, let's say I have a function that randomly converts each character of the string into one of three forms. Is this enough?
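function hide_email($email)
{
    $s = '';
    // Encode each character at random as a decimal entity, a hex entity, or itself.
    foreach (str_split($email) as $l)
    {
        switch (rand(1, 3))
        {
            case 1: $s .= '&#' . ord($l) . ';'; break;           // decimal entity
            case 2: $s .= '&#x' . dechex(ord($l)) . ';'; break;  // hex entity
            case 3: $s .= $l; break;                            // literal character
        }
    }
    return $s;
}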
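which makes first.last@email.com into something like this (the exact output varies from call to call):
&#102;i&#x72;s&#116;&#x2e;l&#97;&#x73;t&#64;&#x65;ma&#105;&#x6c;.&#99;o&#x6d;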
I would assume that the bot creators have already added a regex pattern for something like this…
I would not think this particularly safe. Were I writing code to interpret HTML, decoding entities to their corresponding characters would be among the first bits of code to go in.
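To illustrate how little work the decoding step is, here is a minimal sketch of what a harvester might do. The function name harvest_emails and the address pattern are my own assumptions for illustration, not taken from any real bot; html_entity_decode() and preg_match_all() are standard PHP functions.

```php
<?php
// Hypothetical harvester step: decode entities, then look for address-like strings.
function harvest_emails(string $html): array
{
    // html_entity_decode() handles named, decimal (&#64;) and hex (&#x40;) references.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Deliberately simple pattern; real address syntax is broader than this.
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $text, $matches);

    return $matches[0];
}

// A sample of mixed decimal, hex, and literal characters.
$obfuscated = '&#102;i&#x72;s&#116;&#x2e;l&#97;&#x73;t&#64;&#x65;ma&#105;&#x6c;.&#99;o&#x6d;';
print_r(harvest_emails($obfuscated));
```

One library call undoes the whole encoding, which is why I would expect most harvesters to handle it.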
As a further defense, I would suggest judicious use of tags (such as the <span> tag), perhaps even nested. That takes more effort to decode and still does not require JavaScript.
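A rough sketch of the tag idea, assuming a plain ASCII address. The name hide_email_with_spans is hypothetical, and the nesting scheme is just one of many possibilities.

```php
<?php
// Hypothetical helper: entity-encode each character and wrap it in one or two <span> tags.
function hide_email_with_spans(string $email): string
{
    $out = '';
    foreach (str_split($email) as $char) {
        // Encode every character as a decimal entity (assumes a plain ASCII address).
        $piece = '&#' . ord($char) . ';';

        // Randomly nest one extra <span> so the markup is less regular.
        if (rand(0, 1) === 1) {
            $piece = '<span>' . $piece . '</span>';
        }
        $out .= '<span>' . $piece . '</span>';
    }
    return $out;
}

$markup = hide_email_with_spans('first.last@email.com');

// What a browser would display: tags dropped, entities decoded.
echo html_entity_decode(strip_tags($markup)), "\n";
```

The last line shows that the visible text is unchanged. It also shows that a bot willing to strip tags before decoding can still recover the address, so this raises the effort required rather than guaranteeing safety.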