I need to request a web page that relies on JavaScript. At the bottom of the page I have the following:
<noscript>
<p>Javascript is not supported or enabled.</p>
</noscript>
When I request the page with HttpWebRequest like so, it is clear that the JavaScript on the page did not execute:
Dim req As System.Net.HttpWebRequest = DirectCast(System.Net.WebRequest.Create(New Uri(url)), System.Net.HttpWebRequest)
' Add the current authentication cookie to the request
Dim cookie As HttpCookie = HttpContext.Current.Request.Cookies(FormsAuthentication.FormsCookieName)
Dim authenticationCookie As New System.Net.Cookie(FormsAuthentication.FormsCookieName, cookie.Value, cookie.Path, HttpContext.Current.Request.Url.Authority)
req.CookieContainer = New System.Net.CookieContainer()
req.CookieContainer.Add(authenticationCookie)
req.MediaType = "PRINT"
req.Method = "GET"
req.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
Dim res As System.Net.WebResponse = req.GetResponse()
What can I do? The response is not useful to me if the JavaScript did not run, because I want to convert the output into a PDF. I guess I need a way to execute the JavaScript that is included in the response, but do so outside of the browser.
Thanks.
What output do you want to convert? You can only scrape the static HTML, not the JavaScript-modified DOM.
Remember that HttpWebRequest does not interpret JavaScript.
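Since the request only returns the markup the server sent, one possible workaround is to let a real browser engine load the page and print it. The following is a minimal sketch, not a tested solution: it assumes a Chrome or Chromium build that supports the `--headless` and `--print-to-pdf` switches is installed on the server, and `HeadlessPdfSketch`, `BuildArguments`, `RenderToPdf` and the `chromePath` parameter are hypothetical names made up for illustration.

```vb
Imports System.Diagnostics

Module HeadlessPdfSketch

    ' Hypothetical helper: builds the command line for a headless Chrome run.
    ' Both paths are quoted so that spaces in them survive argument parsing.
    Function BuildArguments(url As String, pdfPath As String) As String
        Return String.Format("--headless --disable-gpu --print-to-pdf=""{0}"" ""{1}""", pdfPath, url)
    End Function

    ' Hypothetical helper: starts the browser, waits for it to finish and
    ' returns its exit code. chromePath is an assumption; point it at
    ' whatever Chrome/Chromium executable exists on your machine.
    Function RenderToPdf(chromePath As String, url As String, pdfPath As String) As Integer
        Dim psi As New ProcessStartInfo(chromePath, BuildArguments(url, pdfPath))
        psi.UseShellExecute = False
        psi.CreateNoWindow = True

        Using proc As Process = Process.Start(psi)
            proc.WaitForExit()
            Return proc.ExitCode
        End Using
    End Function

End Module
```

Note that the browser process is a separate client, so it will not send the forms authentication cookie that the question's code adds to the HttpWebRequest. The page would have to be reachable for that process some other way, which this sketch does not address.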