I have content something like this:
<div class="c2">
<div class="c3">
<p>...</p>
</div>
</div>
What I want is to match the inner HTML of div.c2. Its contents may vary a lot. The only problem I am facing is this: how can I make the regex stop at the right closing div?
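You can’t. This problem is unsolvable with classic regular expressions, and with most existing regex implementations: matching a closing tag to its opening tag means counting nesting depth, and a classic regex cannot count to an arbitrary depth.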
However, some regex engines have special support for balanced-pair matching, e.g., balancing groups in .NET. Even in this case, though, your regex will be able to parse only a subset of syntactically correct texts (e.g., what if a </div> is embedded in a comment?). You need an HTML parser to get reliable results.
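To illustrate the parser route, here is a minimal sketch using Python's standard-library html.parser. The question does not say which language is in use, so Python is an assumption, and the names InnerHTMLExtractor and inner_html are made up for this example. It tracks div nesting depth to find the right closing tag and rebuilds the inner HTML from parser events, so the output may differ from the source in small ways (for instance, end tags are re-emitted in normalized form).

```python
from html.parser import HTMLParser


class InnerHTMLExtractor(HTMLParser):
    """Collects the inner HTML of the first <div> that has the given class.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, target_class):
        # Keep entities such as &amp; unconverted so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.target_class = target_class
        self.depth = 0      # 0 = outside the target div, 1 = directly inside it
        self.parts = []     # pieces of the inner HTML, in document order
        self.done = False   # set once the target div has been closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())
            if tag == "div":
                self.depth += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> do not change the nesting depth.
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True  # this is the matching closing div
                return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_comment(self, data):
        # A "</div>" inside a comment arrives here, not in handle_endtag.
        if self.depth > 0:
            self.parts.append(f"<!--{data}-->")

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def inner_html(html, target_class):
    parser = InnerHTMLExtractor(target_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = '<div class="c2">\n<div class="c3">\n<p>...</p>\n</div>\n</div>'
print(inner_html(sample, "c2"))
```

For real-world pages, a more forgiving parser library is probably a better fit than this hand-rolled tracker, since html.parser does not repair badly nested markup.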