I am looking at some sort of existing filter which can sanitize the user input to avoid XSS. Probably I can use htmlspecialchars for that. But at the same time I want to be able to parse all links (should match a.com, http://www.a.com and http://www.a.com and if it is http://www.aaaaaaaaaaaaaaaaaaaaaaaaaa.com then it should display it as aaa..a.com), e-mails and smileys.
I am wondering what is the best way to go about it. I am currently using a php function with some regex, but many times the regex simply fails (because of link recognition is incorrect etc.). I want something very similar to the parser used during Google Chat (even a.com works).
Thank you for your time.
For smileys you might want to look at http://www.php.net/manual/en/book.bbcode.php (requires php 5.2.0 or better unless you can install it from PECL)