I am scraping an online forum using Jsoup. How should I go about scraping the main post without the text quoted from other commenters?
What I managed to scrape: carey wrote: Yup, CC usually got discounts, especially for petrol and makan… The black DBS debit card when used at petrol kiosk can get discount? I always pay cash because no cc.
What I want: The black DBS debit card when used at petrol kiosk can get discount? I always pay cash because no cc.
Here is the HTML:
<div id="post_message_63989045">
<div class="quote">
<span class="byline"> <a href="/eat-drink-man-woman-16/life-without-credit-cards-3601620-post63982949.html#post63982949" rel="nofollow"><img class="inlineimg" src="http://www.hardwarezone.com.sg/img/forums/hwz/buttons/viewpost.gif" border="0" alt="View Post" /></a> <strong>carey</strong> wrote: </span>
<blockquote cite="showthread.php?p=63982949#post63982949">
Yup, CC usually got discounts, especially for petrol and makan...
<br />
<br /> So those without a CC are being penalized
<img src="http://www.hardwarezone.com.sg/img/forums/hwz/smilies/eek.gif" border="0" alt="" title="EEK!" class="inlineimg" />
</blockquote>
</div>The black DBS debit card when used at petrol kiosk can get discount ?
<br />
<br /> I always pay cash because no cc .
<img src="http://www.hardwarezone.com.sg/img/forums/hwz/smilies/frown.gif" border="0" alt="" title="Frown" class="inlineimg" />
</div>
comments.ownText()
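ownText() gets only the text owned by the element itself; it does not include the text of its child elements. Since the quote sits inside a child <div class="quote">, it is left out of the result.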
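To make this concrete, here is a minimal sketch, assuming the post container is looked up by its id and that Jsoup is on the classpath. The class name MainPostScraper, the helper names, and the shortened sample HTML are made up for illustration. It also shows an alternative that removes the quote block first and then calls text(), which might be preferable if the post body can contain inline tags such as <b>, since ownText() skips text inside any child element.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MainPostScraper {

    // Shortened version of the HTML from the question (illustrative only).
    static final String SAMPLE_HTML =
        "<div id='post_message_63989045'>"
      + "<div class='quote'>"
      + "<span class='byline'><strong>carey</strong> wrote: </span>"
      + "<blockquote>Yup, CC usually got discounts, especially for petrol and makan...</blockquote>"
      + "</div>The black DBS debit card when used at petrol kiosk can get discount ?"
      + "<br /><br /> I always pay cash because no cc ."
      + "</div>";

    // Option 1: text owned directly by the post element, children skipped.
    static String ownTextOnly(String html, String postId) {
        Document doc = Jsoup.parse(html);
        Element post = doc.getElementById(postId);
        return post == null ? "" : post.ownText();
    }

    // Option 2: drop the quote blocks from the parsed tree, then take all remaining text.
    static String textWithoutQuotes(String html, String postId) {
        Document doc = Jsoup.parse(html);
        Element post = doc.getElementById(postId);
        if (post == null) {
            return "";
        }
        post.select("div.quote").remove(); // only modifies the in-memory document
        return post.text();
    }

    public static void main(String[] args) {
        System.out.println(ownTextOnly(SAMPLE_HTML, "post_message_63989045"));
        System.out.println(textWithoutQuotes(SAMPLE_HTML, "post_message_63989045"));
    }
}
```

Both helpers should return the post body without carey's quote for this sample. When looping over many posts, the same calls can be applied to each element matched by a selector such as div[id^=post_message_], assuming every post on the page follows that id pattern.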