I'm trying to populate a grid with some data extracted from LinkedIn. I'm just trying to get it working for my own learning, but if I remove the line
MessageBox.Show("asdfasdfasdf");
the list "messages" only has 1 item. If I include the line above, it does what's expected and I get 15 messages.
Can someone explain?
public void extract_messages_received(object sender, RoutedEventArgs e)
{
    triggered = false;
    System.Windows.Forms.WebBrowser browser = new System.Windows.Forms.WebBrowser();
    browser.Navigate(new Uri(@"http://www.linkedin.com/inbox/messages/received"));
    browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
}
private void LoadMessages(string url)
{
    txtOutput.Text = @"http://www.linkedin.com" + url.Substring(6, url.Length - 6);
    if (!urls.Contains(url))
    {
        urls.Add(url);
        WebBrowser browser = new WebBrowser();
        browser.Navigate(new Uri(txtOutput.Text));
        loaded_message = false;
        browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(ReadMessages);
    }
}
private void ReadMessages(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (loaded_message == false)
    {
        string url = ((WebBrowser)sender).Url.ToString();
        int loc1 = url.IndexOf("itemID") + 7;
        int loc2 = url.IndexOf("&", loc1);
        IEnumerable<string> name = null;
        IEnumerable<string> odate = null;
        IEnumerable<string> photo = null;
        IEnumerable<string> subject = null;
        IEnumerable<string> headline = null;
        string body = "";
        string id = url.Substring(loc1, loc2 - loc1);
        //System.Windows.MessageBox.Show("READ");
        foreach (HtmlElement element in ((WebBrowser)sender).Document.GetElementsByTagName("div"))
        {
            if (element.GetAttribute("classname").Equals("inbox-item-body"))
            {
                body = element.InnerText;
            }
            if (element.GetAttribute("classname").Equals("inbox-item-header"))
            {
                var doc = new HtmlAgilityPack.HtmlDocument();
                doc.LoadHtml(element.InnerHtml);
                name = from foo in doc.DocumentNode.SelectNodes("//a[@class='fn']") select foo.InnerText;
                odate = from foo in doc.DocumentNode.SelectNodes("//p[@class='date']") select foo.InnerText;
                photo = from foo in doc.DocumentNode.SelectNodes("//img[@class='photo']") select foo.Attributes["src"].Value;
                subject = from foo in doc.DocumentNode.SelectNodes("//h3") select foo.InnerText;
                headline = from foo in doc.DocumentNode.SelectNodes("//span[@class='headline']") select foo.InnerText;
            }
        }
        // ****
        MessageBox.Show("asdfasdfasdf");
        // ****
        messages.Add(new Messages()
        {
            ID = id,
            Subject = subject.First().ToString(),
            Headline = headline.First().ToString(),
            Sender = name.First().ToString(),
            Photo = photo.First().ToString(),
            SendDate = odate.First().ToString(),
            Body = body
        });
        // dataMessages.ItemsSource = messages;
    }
    loaded_message = true;
}
void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (!triggered)
    {
        triggered = true;
        System.Windows.Forms.WebBrowser web = sender as System.Windows.Forms.WebBrowser;
        foreach (HtmlElement element in web.Document.GetElementsByTagName("ol"))
        {
            if (element.GetAttribute("classname").Contains("inbox-list "))
            {
                WebBrowser browser = new WebBrowser();
                browser.Navigate("about:blank");
                browser.Document.Write(element.InnerHtml);
                HtmlElementCollection hrefTags = null;
                hrefTags = browser.Document.GetElementsByTagName("a");
                foreach (HtmlElement a in hrefTags)
                {
                    if (a.OuterHtml.Contains("displayMBox"))
                    {
                        LoadMessages(a.GetAttribute("href"));
                    }
                }
            }
        }
    }
}
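This is a timing issue.
MessageBox.Show runs its own message loop while the box is open, so other DocumentCompleted events keep being dispatched in the meantime. When you have the message box in there, loaded_message doesn't get set to true until after you close the message box, so the other events are processed up to their own message box as well, and none of them sets loaded_message to true until you close the first message box. Without the message box, the first handler runs to completion and sets loaded_message to true before any other DocumentCompleted event is handled, so every later handler skips its work and you end up with 1 item.
If you close the message box quickly enough, you will probably see some number between 1 and 15.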
Let's take a simpler example:
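A minimal sketch of such an example, assuming a plain WinForms form with one button. The class name DemoForm, the count of five browsers, and the URL are assumptions made for illustration; the shared flag shown plays the role of loaded_message in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class DemoForm : Form
{
    // Single shared flag, analogous to loaded_message in the question.
    private bool shown = false;

    // Keep references so the browsers stay alive while they load.
    private readonly List<WebBrowser> browsers = new List<WebBrowser>();

    public DemoForm()
    {
        var button = new Button { Text = "Load", Dock = DockStyle.Top };
        button.Click += Button_Click;
        Controls.Add(button);
    }

    private void Button_Click(object sender, EventArgs e)
    {
        // Start five independent page loads (count chosen for illustration).
        for (int i = 0; i < 5; i++)
        {
            var browser = new WebBrowser();
            browser.DocumentCompleted += Browser_DocumentCompleted;
            browsers.Add(browser);
            browser.Navigate("http://www.linkedin.com/"); // assumed URL
        }
    }

    private void Browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        Console.WriteLine(shown);
        if (!shown)
        {
            // The modal box pumps messages, so other DocumentCompleted
            // handlers can run before the line below sets the flag.
            MessageBox.Show("Document completed");
        }
        shown = true;
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new DemoForm());
    }
}
```

The flag is deliberately set after MessageBox.Show, mirroring where loaded_message is set in ReadMessages. A page that contains frames can raise DocumentCompleted more than once, so the exact counts may differ from run to run.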
Now, if you watch the console, you will see a few false show up before the first message box is shown. When I close the message box, I then see 4 more message boxes because those were already queued up and waiting to show before shown got set to true. If I comment out the message box, then I get only one message box shown and one false in the console.
Now, the question becomes: why did you add, and why do you need to check, the loaded_message boolean variable?
My guess is that you want to load each message only once. If that is the case, keep track of each URL in a dictionary and maintain a bool for each URL:
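One way this could look, as a sketch under the same assumptions as before (a WinForms form with one button). The class name PerUrlDemoForm, the dictionary name loaded, and the example.com URLs are hypothetical; in the question's code the same dictionary check would go at the top of ReadMessages.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class PerUrlDemoForm : Form
{
    // One flag per URL instead of a single shared flag.
    private readonly Dictionary<string, bool> loaded = new Dictionary<string, bool>();

    // Keep references so the browsers stay alive while they load.
    private readonly List<WebBrowser> browsers = new List<WebBrowser>();

    // Kept only so its value can be printed; it no longer gates the work.
    private bool shown = false;

    public PerUrlDemoForm()
    {
        var button = new Button { Text = "Load", Dock = DockStyle.Top };
        button.Click += Button_Click;
        Controls.Add(button);
    }

    private void Button_Click(object sender, EventArgs e)
    {
        for (int i = 0; i < 5; i++)
        {
            var browser = new WebBrowser();
            browser.DocumentCompleted += Browser_DocumentCompleted;
            browsers.Add(browser);
            // Hypothetical URLs: distinct query strings give distinct keys.
            browser.Navigate("http://example.com/?page=" + i);
        }
    }

    private void Browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        string url = e.Url.ToString();
        Console.WriteLine(shown);

        bool done;
        if (loaded.TryGetValue(url, out done) && done)
        {
            return; // this URL has already been handled
        }

        // Record the state before MessageBox.Show, which pumps messages and
        // lets other DocumentCompleted handlers run while the box is open.
        loaded[url] = true;
        shown = true;
        MessageBox.Show("Loaded " + url);
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new PerUrlDemoForm());
    }
}
```

Because each URL has its own entry, a handler for one page no longer blocks the handlers for the others, and marking the entry before any call that pumps messages keeps a re-entrant event for the same URL from being processed twice.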
I left shown in there to demonstrate that this new approach now works for each pass in the document completed event. Your output window should have a false followed by 4 true.