I am writing a bash script and using wget to retrieve some PDF files from a website. For example:
wget www.barb.co.uk/news/item-subscriber/id/213/index.html
But wget saves the file as index.html. If I enter that URL in a browser, it correctly downloads the file under its real name: “BARB Bulletin 25 – December 10.pdf”.
How can I get wget to do the same? Or is there another way I can find the real name of the file (from within a bash script)?
You can use the --content-disposition option to make wget look more closely at the headers of the HTTP response, so that the saved file takes the name the server suggests in the Content-Disposition header. This helps in most cases. Example:
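As a rough sketch of how this could look in a bash script, using the URL from the question: the wget call is shown in the usage comments, and the two helper functions (filename_from_headers and remote_filename are names made up for this illustration) try to read the advertised name from the response headers without downloading the file. This assumes curl is installed, that the server answers HEAD requests with the same headers as a normal download, and that the header carries a plain filename= parameter; the encoded filename*= form is not handled.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read raw HTTP response headers on stdin and print the
# filename advertised in the Content-Disposition header, if there is one.
# Assumption: a plain filename= parameter, quoted or unquoted.
filename_from_headers() {
    tr -d '\r' \
        | grep -i '^content-disposition:' \
        | sed -n 's/.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*/\1/p' \
        | tail -n 1
}

# Hypothetical helper: fetch only the headers (following redirects) and
# print the name the server suggests for the download.
remote_filename() {
    curl -sIL "$1" | filename_from_headers
}

# Usage (needs network access):
#   wget --content-disposition "http://www.barb.co.uk/news/item-subscriber/id/213/index.html"
#   name=$(remote_filename "http://www.barb.co.uk/news/item-subscriber/id/213/index.html")
#   echo "Server suggests: $name"
```

If remote_filename prints nothing, the server probably did not send a Content-Disposition header for that request, and in that case --content-disposition is unlikely to change the name wget picks either.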