I found that a # character in a URL makes wget behave differently from what I expected. Essentially, the part of the URL up to the # is kept and everything from the # onward is discarded. I guess that’s because # marks an in-page navigation link? But certain sites obviously seem to use it like “?” (the start of the URL parameters). Is there any way to work around this with wget? I tried curl but had no luck.
Not sure if this will help you, but I presume the site uses the hash (#) for Ajax. If that is the case, using wget is pointless because it cannot execute JavaScript, so any content that is normally generated with JavaScript will be missing.
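For context on why the text after # disappears: that part of a URL is the fragment, which the client keeps to itself and does not send to the server, so wget and curl are behaving as designed. Here is a minimal sketch using Python's standard `urllib.parse` to show the split. The URL and its parameter names are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URL for illustration; the host and parameter names are made up.
url = "http://example.com/search#q=wget&page=2"

parts = urlsplit(url)

# What goes into the HTTP request: the path plus the query string (if any).
# The fragment is never part of it.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)   # /search
print(parts.fragment)   # q=wget&page=2
```

The server only ever sees `/search` here. Whatever the site does with `q=wget&page=2` happens in JavaScript running in the browser, which is why a plain downloader cannot reproduce it.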
If you want to download the contents of a webpage, with the JavaScript executed, then you need what is called a ‘headless browser’. Check these out:
HtmlUnit
PhantomJS
Zombie.js