I don’t know if anyone can help me, but I’ll ask anyway. I’m writing a JavaScript function that works like PHP’s token_get_all. It should tokenize a given piece of PHP code, but I’m having trouble with whitespace.
When I run token_get_all in PHP, I see that only some whitespace is returned as tokens; the rest is ignored.
Can someone explain how this function handles whitespace? Have you ever found any documentation about it?
UPDATE
<?php
if ($var == 0)
{
?>
- Between php and if: ignored
- Between if and (: tokenized
- Between $var and ==: tokenized
- Between == and 0: tokenized
- Between ) and {: tokenized
- Between { and ?>: tokenized
I’ve found the solution. In general, whitespace is ignored after the PHP open tags <?php and <?, but not after <?=.
UPDATE
It took two hours, but I’ve understood the behaviour :).
<?php and <? also absorb the space or newline character that follows them (the newline may or may not be preceded by \r). Any whitespace after that goes into a separate token, with consecutive whitespace characters grouped together. Let me explain with your examples:
One space after the open tag. Tokens: "<?php ", "echo", …
Several spaces after the open tag. Tokens: "<?php ", " (remaining whitespace)", "echo", …
Another example with newlines:
One newline after the open tag. Tokens: "<?php\n", "echo", …
Several newlines after the open tag. Tokens: "<?php\n", "\n\n (remaining newlines)", "echo", …
I’ve been testing it all day, so I’m sure it behaves like this.
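For the JavaScript port, the open-tag rule described above could be sketched roughly like this. It is only an illustration of that rule, not PHP’s actual lexer: the function name lexOpenTagAndWhitespace and the { type, text } token shape are made up for the example, treating a tab like a space is an assumption, and it does not handle edge cases such as <?php followed directly by other characters.

```javascript
// Hypothetical helper: reads an open tag at the start of `src`, plus the
// whitespace that follows it, using the rule described in this answer.
function lexOpenTagAndWhitespace(src) {
  const tokens = [];

  // Matches "<?=", "<?php" or a bare "<?" at the start of the input.
  const tag = /^<\?(=|php)?/i.exec(src);
  if (!tag) {
    return { tokens, rest: src };
  }

  const isEchoTag = tag[1] === '=';
  let text = tag[0];
  let pos = text.length;

  if (!isEchoTag) {
    // "<?php" and "<?" absorb exactly one whitespace character.
    // "\r\n" counts as a single newline. Tab is an assumption here.
    const first = /^(?:\r\n|[ \t\n\r])/.exec(src.slice(pos));
    if (first) {
      text += first[0];
      pos += first[0].length;
    }
  }
  tokens.push({ type: isEchoTag ? 'T_OPEN_TAG_WITH_ECHO' : 'T_OPEN_TAG', text });

  // Whatever whitespace remains is grouped into one whitespace token.
  const run = /^[ \t\r\n]+/.exec(src.slice(pos));
  if (run) {
    tokens.push({ type: 'T_WHITESPACE', text: run[0] });
    pos += run[0].length;
  }

  return { tokens, rest: src.slice(pos) };
}

// Example: three newlines after the open tag.
const result = lexOpenTagAndWhitespace('<?php\n\n\necho "x";');
console.log(JSON.stringify(result.tokens.map((t) => t.text)));
```

The rest of the tokenizer would continue from result.rest; whitespace found later in the code can be grouped with the same /^[ \t\r\n]+/ match.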