I’m writing a browser add-on (Chrome and Firefox) which, in a nutshell, does “some calculations” based upon the content of each page the user visits (i.e. we’re not talking about the user’s own pages), presenting the user with a score. In the case of particularly high scores, the page title, page URL, etc. are submitted automatically to a central service so that a kind of league table can be shown.
All of this works perfectly, but I want – where possible – to strip out all traffic from the pages of intranets that our users happen to visit. (We don’t store or transmit any page content, but there are privacy concerns nonetheless, and we don’t want anything to do with internal / corporate documents.)
In theory I can work out (to a reasonable degree of accuracy) whether an IP is likely to be from an intranet (see “Distinguish the between intranet and official IP addresses”), but as the DOM doesn’t provide access to the document’s host IP, is it practical to try to determine the IP on the fly and then apply those IP rules, given the possibility that lookup services might be down/slow?
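For reference, the IP rules in question amount to a range check along these lines. This is a minimal sketch, assuming IPv4 only and treating the RFC 1918 private ranges, loopback, and link-local addresses as "intranet"; `isPrivateIPv4` is an illustrative name, not an existing API:

```javascript
// Returns true when a dotted-quad IPv4 string falls in a range that is not
// routable on the public internet. Returns false for public addresses and
// for anything that is not a valid IPv4 address.
function isPrivateIPv4(ip) {
  var m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!m) return false;
  var o = m.slice(1).map(Number);
  if (o.some(function (n) { return n > 255; })) return false;
  return o[0] === 10 ||                            // 10.0.0.0/8
    (o[0] === 172 && o[1] >= 16 && o[1] <= 31) ||  // 172.16.0.0/12
    (o[0] === 192 && o[1] === 168) ||              // 192.168.0.0/16
    o[0] === 127 ||                                // 127.0.0.0/8 (loopback)
    (o[0] === 169 && o[1] === 254);                // 169.254.0.0/16 (link-local)
}
```

The check itself is cheap; the hard part is obtaining the IP address to feed into it.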
Would a simpler alternative – like pattern-matching for the TLD of the document’s hostname – be nearly as good?
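To make the hostname idea concrete, here is a rough sketch of such a pattern match. The suffix list and the name `looksLikeIntranetHost` are assumptions chosen for illustration; a real deployment would need its own list:

```javascript
// Hypothetical list of suffixes commonly used for internal host names.
var INTERNAL_SUFFIXES = [".local", ".localdomain", ".internal", ".intranet",
                         ".corp", ".lan", ".home"];

// Heuristic only: returns true when the host name looks internal.
function looksLikeIntranetHost(hostname) {
  // Normalise case and drop a trailing dot ("printer.local." -> "printer.local").
  var host = String(hostname).toLowerCase().replace(/\.$/, "");
  // Single-label names ("intranet", "localhost") and empty host names
  // (e.g. file: URLs) are treated as internal.
  if (host.indexOf(".") === -1) return true;
  return INTERNAL_SUFFIXES.some(function (suffix) {
    return host.slice(-suffix.length) === suffix;
  });
}
```

For example, `looksLikeIntranetHost(location.hostname)` could gate the submission. A public-looking name can still resolve to an internal address, so this only catches the obvious cases, and host names that are literal IP addresses would need a range check instead.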
Any suggestions?
Update:
I was about to answer this myself with “I’ll just do the IP check on the server, when the page stats are submitted, and only complete the submission if the IP is not within the internal range – it’s much easier.” … unfortunately I can’t do this: because my back-end is Google AppEngine [Java] and the InetAddress class is restricted, I can’t do arbitrary IP lookups.
You should use nsIDNSService to resolve the host name in Firefox. Along these lines:
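A minimal sketch of such a lookup, assuming a legacy chrome-privileged context where the global `Components` object is available. `resolveHost`, its callback shape, and the optional `components` parameter are illustrative names rather than part of the Firefox API, and the exact `asyncResolve` signature may differ between Firefox versions:

```javascript
// Resolve `hostname` through Firefox's DNS service and pass the first
// resolved address (or null on failure) to `onDone`.
// `components` is optional and defaults to the global `Components` object;
// passing it explicitly is only a convenience for trying the function
// with a stub outside Firefox.
function resolveHost(hostname, onDone, components) {
  var C = components || Components;
  var dnsService = C.classes["@mozilla.org/network/dns-service;1"]
                    .getService(C.interfaces.nsIDNSService);
  var threadManager = C.classes["@mozilla.org/thread-manager;1"]
                       .getService(C.interfaces.nsIThreadManager);
  var listener = {
    // nsIDNSListener callback, invoked once the lookup has finished.
    onLookupComplete: function (request, record, status) {
      var ok = C.isSuccessCode(status) && record && record.hasMore();
      onDone(ok ? record.getNextAddrAsString() : null);
    }
  };
  // Flags 0 = default behaviour; deliver the callback on the current thread.
  dnsService.asyncResolve(hostname, 0, listener, threadManager.currentThread);
}
```

The address handed to the callback can then be checked against the private IP ranges before anything is submitted.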
This is usually a very fast operation because the host name is already cached. Note: if you use the Add-on SDK then you will have to use chrome authority and replace access to Components properties by the respective aliases.
As to Chrome, I doubt that you can solve this problem properly there. As Korvin Szanto notes in his comment, any host name could point to a local address. And Chrome won’t let you get the IP address that it talks to.