I need to parse an .aspx file (from disk, not the one rendered in the browser), make a list of all the server-side ASP.NET controls present on the page, and then create an XML file from it. What would be the best way to do this? Also, are there any libraries available for this?
For example, if my aspx file contains
<asp:label ID="lbl1" runat="server" Text="Hi"></asp:label>
my xml file would be
<controls>
<ID>lbl1</ID>
<runat>server</runat>
<Text>Hi</Text>
</controls>
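XML parsers won’t understand the ASP.NET directives and code blocks: <%@, <%=, etc.
You’re probably best off using regular expressions to do this, likely in 3 stages: match the opening tags, extract the tag and type from each one, then extract its attributes.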
So, starting from the top, we can use the following regex:
This will match any tag that doesn’t start with <% or </, and does so lazily (we don’t want a greedy expression, as it would run past the end of the tag and we wouldn’t read the content correctly). The following could be matched:
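As a rough illustration of this first stage, here is a minimal Python sketch. The pattern `OPEN_TAG` and the sample markup are assumptions made for the example, not the original expression, and the same idea carries over to .NET's `Regex` class. Because the match stops at the first `>`, an attribute value that itself contains `>` would be cut short.

```python
import re

# Assumed pattern: "<" not followed by "%" or "/", then a lazy run up to the first ">".
OPEN_TAG = re.compile(r"<(?![%/]).*?>", re.DOTALL)

# Hypothetical page content used only for this example.
sample = (
    '<%@ Page Language="C#" %>\n'
    '<div class="box">\n'
    '<asp:label ID="lbl1" runat="server" Text="Hi"></asp:label>\n'
    '</div>\n'
)

# The directive and both closing tags are skipped; only opening tags remain.
tags = OPEN_TAG.findall(sample)
for tag in tags:
    print(tag)
# <div class="box">
# <asp:label ID="lbl1" runat="server" Text="Hi">
```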
For each of those captured tags, we then want to extract the tag and type:
Named capture groups make this easier, since they let us pull out the tag and type by name. This will only match server tags, so standard HTML tags will be dropped at this point.
Will yield:
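A sketch of this second stage in Python, assuming a server tag is any tag written as `prefix:type` (the pattern `SERVER_TAG` and the helper `split_tag` are hypothetical names for illustration). Python spells named groups `(?P<name>...)`, whereas .NET uses `(?<name>...)`.

```python
import re

# Assumed pattern: "<", a prefix such as "asp", a colon, then the control type.
SERVER_TAG = re.compile(r"<(?P<tag>[A-Za-z]\w*):(?P<type>[A-Za-z]\w*)")

def split_tag(open_tag):
    """Return (tag, type) for a prefixed tag, or None for a plain HTML tag."""
    match = SERVER_TAG.match(open_tag)
    if match is None:
        return None
    return match.group("tag"), match.group("type")

print(split_tag('<asp:label ID="lbl1" runat="server" Text="Hi">'))  # ('asp', 'label')
print(split_tag('<div class="box">'))                               # None
```

Note that this keys on the prefix alone, so an HTML element marked `runat="server"` without a prefix would be dropped as well.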
With that same tag, we can then match any attributes:
Which yields:
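The attribute stage might be sketched like this in Python. The pattern `ATTRIBUTE` is an assumption for the example and only handles double-quoted values; single-quoted or unquoted attributes would need extra alternatives.

```python
import re

# Assumed pattern: an attribute name, "=", then a double-quoted value.
ATTRIBUTE = re.compile(r'(?P<name>[A-Za-z_][\w.-]*)\s*=\s*"(?P<value>[^"]*)"')

def attributes(open_tag):
    """Return the (name, value) pairs found in one opening tag, in source order."""
    return [(m.group("name"), m.group("value")) for m in ATTRIBUTE.finditer(open_tag)]

pairs = attributes('<asp:label ID="lbl1" runat="server" Text="Hi">')
print(pairs)  # [('ID', 'lbl1'), ('runat', 'server'), ('Text', 'Hi')]
```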
So, putting that all together, we can write a quick function that creates an XmlDocument for us:
The resultant document could look like this:
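A minimal end-to-end sketch of the three stages, written in Python with `xml.etree.ElementTree` standing in for .NET's `XmlDocument`. The function name `controls_to_xml`, the three patterns, and the output layout (one `<control>` element per server tag, with one child element per attribute) are all assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed patterns for the three stages (opening tag, tag/type, attributes).
OPEN_TAG = re.compile(r"<(?![%/]).*?>", re.DOTALL)
SERVER_TAG = re.compile(r"<(?P<tag>[A-Za-z]\w*):(?P<type>[A-Za-z]\w*)")
ATTRIBUTE = re.compile(r'(?P<name>[A-Za-z_][\w.-]*)\s*=\s*"(?P<value>[^"]*)"')

def controls_to_xml(aspx_source):
    """Build an XML string listing the server controls found in aspx_source."""
    root = ET.Element("controls")
    for open_tag in OPEN_TAG.findall(aspx_source):
        server = SERVER_TAG.match(open_tag)
        if server is None:
            continue  # plain HTML tag, not a prefixed server control
        control = ET.SubElement(
            root, "control",
            {"tag": server.group("tag"), "type": server.group("type")},
        )
        # One child element per attribute, in source order.
        for attr in ATTRIBUTE.finditer(open_tag):
            ET.SubElement(control, attr.group("name")).text = attr.group("value")
    return ET.tostring(root, encoding="unicode")

result = controls_to_xml('<asp:label ID="lbl1" runat="server" Text="Hi"></asp:label>')
print(result)
# <controls><control tag="asp" type="label"><ID>lbl1</ID><runat>server</runat><Text>Hi</Text></control></controls>
```

To run it against a file on disk, read the .aspx text first (for example with `open(path, encoding="utf-8").read()`) and pass the string in.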
Hope that helps!