I’ve written some code to import content from my Blogger blog. Once I’ve downloaded all of the HTML content, I go through the image tags and download the corresponding images. In a significant number of cases, System.Drawing.Bitmap.FromStream is throwing an ArgumentException. The URL I’m downloading from looks good and it serves up an image as expected (here’s the URL for one of the problem images: http://4.bp.blogspot.com/_tSWCyhtOc38/SgIPcctWRZI/AAAAAAAAAGg/2LLnVPxsogI/s1600-h/IMG_3590.jpg).
    private static System.Drawing.Image DownloadImage(string source)
    {
        System.Drawing.Image image = null;

        // used to fetch content
        var client = new HttpClient();

        // used to store image data
        var memoryStream = new MemoryStream();

        try
        {
            // fetch the image
            var imageStream = client.GetStreamAsync(source).Result;

            // instantiate a System.Drawing.Image from the data
            image = System.Drawing.Bitmap.FromStream(imageStream, false, false);

            // save the image data to a memory stream
            image.Save(memoryStream, image.RawFormat);
        }
        catch (IOException exception)
        {
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (ArgumentException exception)
        {
            // sometimes, an image will link to a web page, resulting in this exception
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (AggregateException exception)
        {
            // sometimes, an image src will throw a 404
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        finally
        {
            // clean up our disposable resources
            client.Dispose();
            memoryStream.Dispose();
        }

        return image;
    }
Any idea why an ArgumentException is getting thrown here?
EDIT: It occurred to me that it could be a proxy issue, so I added the following to my web.config:
    <system.net>
      <defaultProxy enabled="true" useDefaultCredentials="true">
        <proxy usesystemdefault="True" />
      </defaultProxy>
    </system.net>
Adding that section hasn’t made any difference, however.
EDIT: This code is called from the context of an EF database initializer. Here’s a stack trace:
    Web.dll!Web.Models.Initializer.DownloadImage(string source) Line 234 C#
    Web.dll!Web.Models.Initializer.DownloadImagesForPost.AnonymousMethod__5(HtmlAgilityPack.HtmlNode tag) Line 126 + 0x8 bytes C#
    [External Code]
    Web.dll!Web.Models.Initializer.DownloadImagesForPost(Web.Models.Post post) Line 119 + 0x34 bytes C#
    Web.dll!Web.Models.Initializer.Seed(Web.Models.FarmersMarketContext context) Line 320 + 0xb bytes C#
    [External Code]
    App_Web_l2h4tcej.dll!ASP._Page_Views_Home_Index_cshtml.Execute() Line 28 + 0x15 bytes C#
    [External Code]
OK, I found the issue. It turns out that, in some cases, Blogger references an HTML page that renders an image rather than referencing the image itself. So, the response in that case isn’t a valid image. I’ve added code to check the response headers before attempting to save the image data and that’s fixed the problem. For the benefit of anyone else who hits this issue, here’s the updated code:
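The updated code itself did not survive in this copy of the post, so the following is only a minimal sketch of the header check described above, not the original fix. It assumes that treating any `Content-Type` whose media type starts with `image/` as an image is good enough; the class name `ImageDownloader` and the helper `IsImageMediaType` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Drawing;
using System.IO;
using System.Net.Http;

public static class ImageDownloader
{
    // Hypothetical helper: true when the media type from the
    // Content-Type header denotes an image (e.g. "image/jpeg").
    public static bool IsImageMediaType(string mediaType)
    {
        return mediaType != null
            && mediaType.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }

    public static Image DownloadImage(string source)
    {
        using (var client = new HttpClient())
        {
            try
            {
                using (var response = client.GetAsync(source).Result)
                {
                    // a 404 or other error status never carries a usable image
                    if (!response.IsSuccessStatusCode)
                    {
                        Debug.WriteLine("{0} {1}", response.StatusCode, source);
                        return null;
                    }

                    // check the response headers before decoding anything
                    var contentType = response.Content.Headers.ContentType;
                    var mediaType = contentType == null ? null : contentType.MediaType;
                    if (!IsImageMediaType(mediaType))
                    {
                        // e.g. text/html: a page that renders the image
                        Debug.WriteLine("Not an image ({0}): {1}", mediaType, source);
                        return null;
                    }

                    // buffer the body; the MemoryStream is deliberately not
                    // disposed because GDI+ needs the stream to stay open
                    // for the lifetime of the Image
                    var bytes = response.Content.ReadAsByteArrayAsync().Result;
                    return Image.FromStream(new MemoryStream(bytes));
                }
            }
            catch (AggregateException exception)
            {
                // .Result wraps network failures in an AggregateException
                Debug.WriteLine("{0} {1}", exception.GetBaseException().Message, source);
            }
            catch (ArgumentException exception)
            {
                // the header claimed an image but the bytes did not decode
                Debug.WriteLine("{0} {1}", exception.Message, source);
            }

            return null;
        }
    }
}
```

Callers would then skip any `null` result, for example `var image = ImageDownloader.DownloadImage(src); if (image != null) { /* store it */ }`. Checking the header first means an HTML response is rejected before `FromStream` ever sees it, so the `ArgumentException` handler is left to cover only responses that are labelled as images but fail to decode.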