A little background: I’m fairly new to JavaScript, and to PhantomJS, so I don’t know if this is a JavaScript or a PhantomJS bug (feature?).
The following completes successfully (sorry for the missing phantom.exit(), you’ll just have to ctrl+c once you are done):
var page = require('webpage').create();
var comment = "Hello World";
page.viewportSize = { width: 800, height: 600 };
page.open("http://www.google.com", function (status) {
    if (status !== 'success') {
        console.log('Unable to load the address!');
        phantom.exit();
    } else {
        page.includeJs('http://code.jquery.com/jquery-latest.min.js', function() {
            console.log("1: ", comment);
        }, comment);
        var foo = page.evaluate(function() {
            return arguments[0];
        }, comment);
        console.log("2: ", foo);
    }
});
This works:
page.includeJs('http://code.jquery.com/jquery-latest.min.js', function() {
    console.log("1: ", comment);
}, comment);
Output: 1: Hello World
But not:
page.includeJs('http://code.jquery.com/jquery-latest.min.js', function(c) {
    console.log("1: ", c);
}, comment);
Output: 1: http://code.jquery.com/jquery-latest.min.js
And not:
page.includeJs('http://code.jquery.com/jquery-latest.min.js', function() {
    console.log("1: ", arguments[0]);
}, comment);
Output: 1: http://code.jquery.com/jquery-latest.min.js
Looking at the 2nd piece, this works:
var foo = page.evaluate(function() {
    return arguments[0];
}, comment);
console.log("2: ", foo);
Output: 2: Hello World
And this:
var foo = page.evaluate(function(c) {
    return c;
}, comment);
console.log("2: ", foo);
Output: 2: Hello World
But not this:
var foo = page.evaluate(function() {
    return comment;
}, comment);
console.log("2: ", foo);
Output:
ReferenceError: Can't find variable: comment
phantomjs://webpage.evaluate():2
phantomjs://webpage.evaluate():3
phantomjs://webpage.evaluate():3
2: null
The good news is, I know what works and what doesn’t, but how about a little consistency?
Why the difference between includeJs and evaluate?
Which is the proper way to pass arguments to an anonymous function?
The tricky thing to understand with PhantomJS is that there are two execution contexts: the Phantom context, which is local to your machine and has access to the phantom object and to modules loaded with require, and the remote context, which exists within the window of the headless browser and only has access to things loaded in the webpages you open via page.open.
Most of the script you write is executed in the Phantom context. The main exception is anything within page.evaluate(function() { ... }). The ... here is executed in the remote context, which is sandboxed, without access to the variables and objects in your local context. You can move data between the two contexts by returning a value from the function passed to page.evaluate(), or by passing arguments in to that function.
The values thus passed are essentially serialized in each direction: you can’t pass a complex object with methods, only a data object like a string or an array (I don’t know the exact implementation, but the rule of thumb seems to be that anything you can serialize with JSON can be passed in either direction). Inside the page.evaluate() function you do not have access to outside variables, as you would with standard JavaScript, only to variables you explicitly pass in as arguments.
So, your question: Why the difference between includeJs and evaluate?
.includeJs(url, callback) takes a callback function that executes within the Phantom context, apparently receiving the url as its first argument. In addition to its arguments, it has access (like any normal JavaScript function) to all variables in its enclosing scope, including comment in your example. It does not take an additional argument list after the callback function. When you reference comment within the callback, you’re referencing an outside variable, not a function argument.
.evaluate(function, args*) takes a function to execute and zero or more arguments to pass to it (in some serialized form). You need to name the arguments in the function signature, e.g. function(a,b,c), or use the arguments object to access them; they won’t automagically have the same names as the variables you pass in.
So the correct way to pass arguments in is different for the functions in these different methods. For includeJs, the callback will be called with a new set of arguments (including, at least, the URL), so any variables you want to access need to be in the callback’s enclosing scope (i.e. you have access to them within the function’s closure). For evaluate, there is only one practical way to pass in arguments, which is to include them in the arguments passed to evaluate itself (there are other ways, too, but they’re tricky and not worth discussing now that this feature is available in PhantomJS itself).
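If it helps to see the two rules side by side, here is a rough sketch in plain JavaScript that runs outside PhantomJS. The helpers fakeEvaluate and fakeIncludeJs are hypothetical stand-ins I made up for illustration. They are not PhantomJS's real code, and the assumption that evaluate behaves like "rebuild the function from its source text and move values as JSON" is only the rule of thumb from above, not a statement about the actual implementation.

```javascript
// Hypothetical stand-ins that mimic the argument-passing rules described
// above. They are NOT PhantomJS's implementation, only a model of it.

// Models page.evaluate(fn, arg1, ...): the function is rebuilt from its
// source text (so it loses its closure) and values cross as JSON.
function fakeEvaluate(fn) {
    var argsJson = JSON.stringify(Array.prototype.slice.call(arguments, 1));
    // new Function() bodies only see the global scope, like a sandbox.
    var remote = new Function('argsJson',
        'var fn = (' + fn.toString() + ');' +
        'return JSON.stringify(fn.apply(null, JSON.parse(argsJson)));');
    try {
        var resultJson = remote(argsJson);
        return resultJson === undefined ? null : JSON.parse(resultJson);
    } catch (e) {
        // Assumption: report the "remote" error and hand back null.
        console.log(e.name + ': ' + e.message);
        return null;
    }
}

// Models page.includeJs(url, callback): the callback runs locally, keeps its
// closure, and gets the URL as its first argument. Extra arguments after the
// callback are ignored. (The real method is asynchronous; this one is not.)
function fakeIncludeJs(url, callback) {
    callback(url);
}

function demo() {
    var comment = 'Hello World';   // local, so only reachable via closure
    var results = {};

    fakeIncludeJs('http://code.jquery.com/jquery-latest.min.js', function (c) {
        results.closure = comment;  // enclosing-scope variable
        results.firstArg = c;       // the URL, not comment
    }, comment);

    // Named parameter and the arguments object both work.
    results.named = fakeEvaluate(function (c) { return c; }, comment);
    results.viaArguments = fakeEvaluate(function () { return arguments[0]; }, comment);

    // Relying on the closure fails: a ReferenceError is logged, result is null.
    results.viaClosure = fakeEvaluate(function () { return comment; }, comment);

    // Methods do not survive the JSON round trip; plain data does.
    results.withMethod = fakeEvaluate(function (o) { return o; },
        { text: comment, shout: function () {} });

    return results;
}

var results = demo();
console.log(results);
```

In a real script the same pattern would be page.evaluate(function (c) { return c; }, comment) for the remote context, and a plain closure over comment inside the includeJs callback for the Phantom context.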