I’d like to convert an unordered list, which is stored as a string, into a JSON array.
I need this because I’m screen scraping a website (with permission), so all I’ve got is the page source stored as a string (yes, it’s horrible) until they finish their API (and yes, they’ve agreed not to change any of their HTML in the meantime). 🙂
HTML:
<ul class="column">
<li><a href="/view.php?m=48902&g=313433">Item 1</a></li>
<li><a href="/view.php?m=09844&g=313433">Item 2</a></li>
<li><a href="/view.php?m=23473&g=313433">Item 3</a></li>
</ul>
JSON:
{"items":[
{
"id": 1,
"url": "/view.php?m=48902&g=313433",
"name": "Item 1",
"m": "48902",
"g": 313433
},
{
"id": 2,
"url": "/view.php?m=09844&g=313433",
"name": "Item 2",
"m": "09844",
"g": 313433
},
{
"id": 3,
"url": "/view.php?m=23473&g=313433",
"name": "Item 3",
"m": "23473",
"g": 313433
}
]}
Proposed approach:
Since you will be parsing HTML extensively, I recommend that you download HtmlAgilityPack and use it to parse your HTML. There is some sample code on its website. It also supports LINQ, so querying the parsed HTML should be relatively easy.
As for converting to JSON, my advice is to create a class with the structure you want; for example:
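A minimal sketch of such a class, assuming the property names should mirror the keys in the desired JSON. The lowercase names and the choice of types are assumptions, not something the question specifies; `m` is kept as a string so that a value like 09844 keeps its leading zero.

```csharp
// Hypothetical item class; one instance per <li> in the scraped list.
public class MyItem
{
    public int id { get; set; }       // 1-based position in the list
    public string url { get; set; }   // the href of the <a> element
    public string name { get; set; }  // the link text, e.g. "Item 1"
    public string m { get; set; }     // "m" query parameter, kept as text to preserve leading zeros
    public int g { get; set; }        // "g" query parameter
}
```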
Now that you have the structure ready as a class, you can build a List<MyItem> with all the elements you parsed from your HTML.
The final step to convert to JSON is a matter of doing:
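The following is one possible sketch of both steps, not the only way to do it. It assumes the HtmlAgilityPack NuGet package for parsing and `System.Text.Json` (available in .NET Core 3.0 and later) for serialization; the serializer choice, the XPath expression, and the `MyItem` property names are assumptions made for illustration. The class is repeated so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Web;            // HttpUtility.ParseQueryString
using HtmlAgilityPack;

// Hypothetical item class matching the desired JSON keys.
public class MyItem
{
    public int id { get; set; }
    public string url { get; set; }
    public string name { get; set; }
    public string m { get; set; }   // string, so "09844" keeps its leading zero
    public int g { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // In practice this string would be the scraped page source.
        string html = @"<ul class=""column"">
<li><a href=""/view.php?m=48902&g=313433"">Item 1</a></li>
<li><a href=""/view.php?m=09844&g=313433"">Item 2</a></li>
<li><a href=""/view.php?m=23473&g=313433"">Item 3</a></li>
</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var items = new List<MyItem>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//ul[@class='column']/li/a");
        if (links != null)
        {
            foreach (HtmlNode a in links)
            {
                // Attribute and text values are used as-is; HTML entity decoding is not handled here.
                string href = a.GetAttributeValue("href", "");

                // Split off the query string and read the m and g parameters.
                int q = href.IndexOf('?');
                var query = HttpUtility.ParseQueryString(q >= 0 ? href.Substring(q + 1) : "");
                int.TryParse(query["g"], out int g);   // g stays 0 if missing or not numeric

                items.Add(new MyItem
                {
                    id = items.Count + 1,
                    url = href,
                    name = a.InnerText.Trim(),
                    m = query["m"],
                    g = g
                });
            }
        }

        // Wrap the list in an anonymous object to get the top-level "items" key.
        string json = JsonSerializer.Serialize(new { items });
        Console.WriteLine(json);
    }
}
```

Note that the default `System.Text.Json` encoder writes `&` as `\u0026` inside strings. That is still valid JSON and decodes back to `&` in any conforming parser.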