I found that I can generate an XDocument object from HTML by using SgmlReader.SL.
https://bitbucket.org/neuecc/sgmlreader.sl/
The code looks like this.
public XDocument Html(TextReader reader)
{
    XDocument xml;
    using (var sgmlReader = new SgmlReader { DocType = "HTML", CaseFolding = CaseFolding.ToLower, InputStream = reader })
    {
        xml = XDocument.Load(sgmlReader);
    }
    return xml;
}
We can also get the src attributes of the img tags from the XDocument object.
var ns = xml.Root.Name.Namespace;
var imgQuery = xml.Root.Descendants(ns + "img")
    .Select(e => new
    {
        Link = e.Attribute("src").Value
    });
And we can download an image and convert its stream data to a BASE64 string.
public static string base64String;

WebClient wc = new WebClient();
// Subscribe before starting the request so the completion event cannot be missed.
wc.OpenReadCompleted += new OpenReadCompletedEventHandler(wc_OpenReadCompleted);
wc.OpenReadAsync(new Uri(url)); // image url from src attribute

void wc_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
    using (MemoryStream ms = new MemoryStream())
    {
        // Allocate the buffer once and reuse it for every read.
        byte[] buf = new byte[32768];
        int read;
        while ((read = e.Result.Read(buf, 0, buf.Length)) > 0)
        {
            ms.Write(buf, 0, read);
        }
        byte[] imageBytes = ms.ToArray();
        base64String = Convert.ToBase64String(imageBytes);
    }
}
So, what I’d like to do is the steps below, in one method chain like LINQ or Reactive Extensions.
- Get the src attributes of the img tags from the XDocument object.
- Get the image data from the URLs.
- Generate a BASE64 string from the image data.
- Replace the src attributes with the BASE64 strings.
The simplest source and output are here.
- Before
<html> <head> </head> <body> <img src='http://image.com/image.jpg' /> <img src='http://image.com/image2.png' /> </body> </html>
- After
<html> <head> </head> <body> <img src='data:image/jpg;base64,iVBORw...' /> <img src='data:image/png;base64,iSDoske...' /> </body> </html>
Does anyone know the solution for this?
I’d like to ask experts.
Both LINQ and Rx are designed to promote transformations that result in new objects, not ones that modify existing objects, but this is still doable. You have already done the first step, breaking the task into parts. The next step is to make composable functions that implement those steps.
1) You mostly have this one already, but we should probably keep the elements around to update later.
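For step 1, a minimal sketch of a query that keeps the elements themselves, so their src attributes can be updated later. The class and method names (ImgQuery, ImagesWithSrc) are made up for illustration, and img tags without a src attribute are skipped, which is an assumption about what you want.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ImgQuery
{
    // Hypothetical helper: returns the <img> elements (not just the src strings)
    // so that the caller can rewrite each element's src attribute afterwards.
    public static IEnumerable<XElement> ImagesWithSrc(XDocument xml)
    {
        XNamespace ns = xml.Root.Name.Namespace;
        return xml.Root.Descendants(ns + "img")
                  .Where(e => e.Attribute("src") != null); // skip <img> without src
    }
}
```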
2) This seems to be where you have hit a wall from the composability point of view. To start, let’s make a FromEventAsyncPattern observable generator. There are already ones for the Begin/End async pattern and standard events, so this will come out somewhere in between.
Now we can use this method to turn the downloads into observables. Based on your usage, I think you could also use DownloadDataAsync on the WebClient instead.
EDIT: As per your comment, you appear to be using Silverlight, where WebClient is not IDisposable and does not have the method I was using. To deal with that, try something like:
You will need to find an implementation of ReadAsync to read the stream. You should be able to find one pretty easily, and the post was long enough already so I left it out.
3 & 4) Now we are ready to put it all together and update the elements. Since step 3 is so simple, I’ll just merge it in with step 4.
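As one possible way to tie steps 2 to 4 together, here is a minimal sketch. It assumes Rx 2.x or later (the System.Reactive.Linq namespace) and absolute image URLs that end in an image file extension. The names ImageInliner, DownloadBytes, ToDataUri and InlineImages are made up for illustration. Instead of a ReadAsync helper it reads the stream with a plain loop inside the completion handler, which is a simplification.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Xml.Linq;

public static class ImageInliner
{
    // Hypothetical helper: wraps WebClient's event-based OpenReadAsync
    // in an observable that yields the downloaded bytes once, then completes.
    public static IObservable<byte[]> DownloadBytes(Uri uri)
    {
        return Observable.Create<byte[]>(observer =>
        {
            var wc = new WebClient();
            OpenReadCompletedEventHandler handler = null;
            handler = (sender, e) =>
            {
                wc.OpenReadCompleted -= handler;
                if (e.Cancelled) { observer.OnCompleted(); return; }
                if (e.Error != null) { observer.OnError(e.Error); return; }
                byte[] bytes;
                try
                {
                    using (var stream = e.Result)
                    using (var ms = new MemoryStream())
                    {
                        var buf = new byte[32768];
                        int read;
                        while ((read = stream.Read(buf, 0, buf.Length)) > 0)
                            ms.Write(buf, 0, read);
                        bytes = ms.ToArray();
                    }
                }
                catch (Exception ex) { observer.OnError(ex); return; }
                observer.OnNext(bytes);
                observer.OnCompleted();
            };
            wc.OpenReadCompleted += handler; // subscribe before starting the request
            wc.OpenReadAsync(uri);
            return Disposable.Create(() => wc.CancelAsync());
        });
    }

    // Hypothetical helper: guesses the MIME subtype from the file extension.
    public static string ToDataUri(Uri uri, byte[] bytes)
    {
        string ext = Path.GetExtension(uri.AbsolutePath).TrimStart('.').ToLowerInvariant();
        if (ext == "jpg") ext = "jpeg"; // the registered MIME type is image/jpeg
        return "data:image/" + ext + ";base64," + Convert.ToBase64String(bytes);
    }

    // Downloads every image and replaces its src with a data URI.
    // Yields each <img> element after it has been updated.
    public static IObservable<XElement> InlineImages(XDocument xml)
    {
        XNamespace ns = xml.Root.Name.Namespace;
        return xml.Root.Descendants(ns + "img")
            .Where(img => img.Attribute("src") != null)
            .ToList() // snapshot the elements before starting downloads
            .ToObservable()
            .SelectMany(
                img => DownloadBytes(new Uri(img.Attribute("src").Value)),
                (img, bytes) => new { img, bytes })
            // The update runs after SelectMany has merged the downloads,
            // so elements are modified one at a time.
            .Select(x =>
            {
                var uri = new Uri(x.img.Attribute("src").Value);
                x.img.SetAttributeValue("src", ToDataUri(uri, x.bytes));
                return x.img;
            });
    }
}
```

Nothing is downloaded until you subscribe. The document is fully updated once the sequence completes, so something like InlineImages(xml).Subscribe(_ => { }, ex => { /* handle error */ }, () => { /* use xml here */ }) would be the place to read the result. Note that a single failed download ends the whole sequence with an error in this sketch.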