I need to transform some text files into HTML code. I’m stuck in transforming

Asked: June 7, 2026

I need to transform some text files into HTML code. I’m stuck on transforming a list into an HTML unordered list. Example source:

some text in the document
* item 1
* item 2
* item 3
some other text

The output should be:

some text in the document
<ul>
    <li>item 1</li>
    <li>item 2</li>
    <li>item 3</li>
</ul>
some other text

Currently, I have this:

import re

r = re.compile(r'\*(.*)\n')
r.sub(r'<li>\1</li>\n', the_text_document)  # raw string so \1 is a backreference; \n keeps each item on its own line

which creates an HTML list without <ul> tags.
How can I identify the first and last items and surround them with <ul> tags?

1 Answer

Editorial Team · Answer 1 · 2026-06-07T04:52:25+00:00

After playing with some ideas, I’ve decided to go with a second regex.
So basically, after running the first regex (from my original post, that creates the <li> tags), I run:

r = re.compile(r'(<li>.*?</li>\n(?!\s*<li>))', re.DOTALL)
r.sub('<ul>\\1</ul>', string_with_li_tags)

This matches from the first <li> tag up to the last </li>\n that is not followed by another <li> tag, which is effectively the entire list, and wraps that match in <ul> tags.
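To see the two passes working together, here is a minimal sketch. The function name text_to_html_list is hypothetical. The first pattern is tightened a little compared with the question's (anchored at the start of a line, leading spaces trimmed, newline kept), and the second replacement adds newlines around the <ul> tags so the result resembles the requested output. Those tweaks are assumptions on top of the regex above, and the sketch assumes every list item ends with a newline.

```python
import re

def text_to_html_list(text):
    # Pass 1: turn each "* item" line into an <li> line, keeping the newline
    # so the second pattern can look for "</li>\n".
    with_li = re.sub(r'^\*[ \t]+(.*)\n', r'<li>\1</li>\n', text, flags=re.MULTILINE)
    # Pass 2: wrap each run of <li> lines in <ul> tags. The lazy .*? plus the
    # negative lookahead stops at the last </li>\n of each list.
    return re.sub(r'(<li>.*?</li>\n(?!\s*<li>))', r'<ul>\n\1</ul>\n',
                  with_li, flags=re.DOTALL)

source = (
    "some text in the document\n"
    "* item 1\n"
    "* item 2\n"
    "* item 3\n"
    "some other text\n"
)
print(text_to_html_list(source))
```

The <li> lines come out unindented here; indenting them would need an extra step, since the second pattern starts matching at the <li> tag itself.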

EDIT:
I modified the regex a bit so it isn’t greedy. This way it can handle multiple lists in the same document. The only requirement is that there are no spaces between list items, as @Aprillion mentioned below.

EDIT 2:
I modified the negative lookahead to handle whitespace between list items as well, so all of these cases are covered.
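An alternative that avoids the second regex is to match each run of consecutive "* " lines in one go and build the list in a replacement function. This is a different technique from the two-regex approach above, not the method this answer uses. It is a sketch that assumes list items sit on adjacent lines (no blank lines between them) and that each ends with a newline. The names LIST_BLOCK, wrap_block and convert_lists are made up for illustration.

```python
import re

# One or more consecutive lines that start with "* " (or "*" plus a tab).
LIST_BLOCK = re.compile(r'(?:^\*[ \t]+.*\n)+', re.MULTILINE)

def wrap_block(match):
    # Drop the leading "*" from each line and trim surrounding whitespace.
    items = [line[1:].strip() for line in match.group(0).splitlines()]
    lis = ''.join(f'    <li>{item}</li>\n' for item in items)
    return f'<ul>\n{lis}</ul>\n'

def convert_lists(text):
    # Each run of list lines is replaced by one complete <ul> block.
    return LIST_BLOCK.sub(wrap_block, text)

print(convert_lists("intro\n* a\n* b\nmiddle\n* c\nend\n"))
```

Because each match is a whole list, separate lists in the same document get their own <ul> block, and the items can be indented as in the requested output. Item text is inserted as-is, so text containing characters such as < or & would need escaping first.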
