I have a string which contains the text of an article. It is sprinkled with BBCodes (between square brackets). I need to grab the first, say, 200 characters of the article without cutting it off in the middle of a BBCode, so I need an index where it is safe to cut. This will give me the article summary.
- The summary must be at least 200 characters but can be longer to ‘escape’ out of a BBCode. (This length will actually be a parameter to a function.)
- It must not give me a point inside a stand-alone BBCode (see the pipe), like so: [lis|t].
- It must not give me a point between a start and end BBCode, like so: [url="http://www.google.com"]Go To Goo|gle[/url].
- In the above example, it must not give me a point inside the start BBCode, inside the end BBCode, or anywhere in between them.
It should give me the “safe” index, which is at or after 200 and does not cut off any BBCodes.
Hope this makes sense. I have been struggling with this for a while. My regex skills are only moderate. Thanks for any help!
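One way to read the requirements above is as a single left-to-right scan that tracks which paired tags are open, and returns the first index at or past the minimum that lies outside every tag and every open pair. The following is only a minimal sketch under assumptions: the names `TAG_RE`, `safe_cut_index`, `min_len` and `standalone` are made up for illustration, tags are assumed to look like `[name]`, `[name=value]` or `[/name]`, and the set of stand-alone tags (here just `list`, taken from the example) has to be supplied by the caller.

```python
import re

# Assumed tag shape: [name], [name=value] or [/name]. Real BBCode dialects vary.
TAG_RE = re.compile(r"\[(/?)([A-Za-z*]+)(?:=[^\]]*)?\]")

def safe_cut_index(text, min_len=200, standalone=frozenset({"list"})):
    """Return the smallest index >= min_len that is outside every tag and
    outside every open/close pair. 'standalone' is a hypothetical set of
    tag names that never take a closing tag."""
    open_tags = []  # stack of paired tags that are currently open
    pos = 0         # end of the last tag seen so far
    for m in TAG_RE.finditer(text):
        # Text between the previous tag and this one is safe to cut
        # anywhere, as long as no pair is open.
        if not open_tags and m.start() >= min_len:
            return max(min_len, pos)
        closing, name = m.group(1), m.group(2).lower()
        if closing:
            if name in open_tags:
                # Pop up to and including the matching opener.
                while open_tags.pop() != name:
                    pass
        elif name not in standalone:
            open_tags.append(name)
        pos = m.end()
    if not open_tags:
        # No more tags: cut at min_len, or at the end of short texts.
        return min(len(text), max(min_len, pos))
    # A pair was never closed: the only safe cut is the end of the text.
    return len(text)
```

For example, with 195 plain characters followed by `[url="http://www.google.com"]Go To Google[/url]`, the returned index lands just after `[/url]`, so `text[:safe_cut_index(text)]` ends with the complete link. Unmatched closing tags are simply ignored in this sketch.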
First off, I would suggest considering what you will do with a post that is entirely wrapped in BBCodes, as is often the case with a font tag. In other words, a solution to the problem as stated will easily lead to ‘summaries’ containing the entire article. It may be more valuable to identify which tags are still open at the cut point and append the necessary BBCodes to close them. Of course, in the case of a link, it will require additional work to ensure you don’t break it.
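The closing-tag idea could look roughly like this. It is a sketch under the same kind of assumptions as any quick BBCode scan: `TAG_RE`, `summarize_with_closers`, `limit` and `standalone` are hypothetical names, tags are assumed to be `[name]`, `[name=value]` or `[/name]`, and the caller supplies which tags are stand-alone. It cuts at the limit (or just after a tag that straddles the limit) and then appends closers for whatever is still open.

```python
import re

# Assumed tag shape: [name], [name=value] or [/name]. Real BBCode dialects vary.
TAG_RE = re.compile(r"\[(/?)([A-Za-z*]+)(?:=[^\]]*)?\]")

def summarize_with_closers(text, limit=200, standalone=frozenset({"list"})):
    """Cut near 'limit' and append closing tags for any pairs left open.
    'standalone' is a hypothetical set of tags that never take a closer."""
    cut = min(limit, len(text))
    open_tags = []  # stack of paired tags open at the cut point
    for m in TAG_RE.finditer(text):
        if m.start() >= cut:
            break
        # A tag that straddles the cut is kept whole.
        cut = max(cut, m.end())
        closing, name = m.group(1), m.group(2).lower()
        if closing:
            if name in open_tags:
                # Pop up to and including the matching opener.
                while open_tags.pop() != name:
                    pass
        elif name not in standalone:
            open_tags.append(name)
    # Close the innermost tag first.
    closers = "".join(f"[/{name}]" for name in reversed(open_tags))
    return text[:cut] + closers
```

With a post wrapped entirely in `[b]...[/b]`, this returns the first 200 characters followed by `[/b]` instead of the whole article. Links get no special treatment here: the opening `[url=...]` tag stays intact, but the visible link text may be truncated, which is the extra work mentioned above.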