UPDATE AT THE BOTTOM Maybe somebody could help with this… been struggling with it

Editorial Team

Asked: May 31, 2026

UPDATE AT THE BOTTOM

Maybe somebody can help with this… I've been struggling with it for days and I'm blocked :/

For a content-cleaner solution I'm working on, I'm trying to convert some plain-text numbered lists, like:

1 Foo
1.1 Foo 1
1.2 Foo 2
2 Bar
2.1 Bar 1
2.2 Bar 2
2.2.1 Bar 2.1
2.2.2 Bar 2.2
2.3 Bar 3
3 Another root item

… into correctly nested HTML lists …

<ul>
    <li>Foo
        <ul>
            <li>Foo 1</li>
            <li>Foo 2</li>
        </ul>
    </li>
    <li>Bar
        <ul>
            <li>Bar 1</li>
            <li>Bar 2
                <ul>
                    <li>Bar 2.1</li>
                    <li>Bar 2.2</li>
                </ul>
            </li>
            <li>Bar 3</li>
        </ul>
    </li>
    <li>Another root item</li>
</ul>

Some things that may help:

No need for the result to be correctly indented, just wrapped in the correct HTML tags
No need to locate the list inside other text; you can assume I already have only the list
No need for great performance: regexp, iteration… whatever works is fine
No need for a language-specific solution: PHP, Python, JavaScript, pseudocode… anything is fine
You can assume " " (a space) is the only separator after the "1.2.3" list number
You can assume the lines are already in the correct order; no need to sort them

UPDATE TL;DR (not homework, but real-world usage)

Sorry for making this look like "homework not done", my fault. English is not my first language and I tried to be concise, maybe too concise.
What I'm trying to do is make it easier for my workmates to turn text from unknown sources into correct HTML.

So far I have managed the following (you can see the full screenshot at http://twitpic.com/907aw5/, since I can't attach images: this is my first question and I have no reputation):

I take the original text and run strip_tags on it to delete any incorrect HTML it may contain
I insert it into a textarea
I integrated a JavaScript editor (CodeMirror, http://codemirror.net) configured for HTML
I added an editing toolbar with the tags we use most, as my workmates don't know a word of HTML
As part of the cleaning options, I set up two hotkeys that turn the selected text into a ul / ol (splitting on the \n characters)
When the user saves, I run HTMLTidy on it to make it as clean as possible (indenting, deleting proprietary tags, etc.)
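
The hotkey step in the list above (turning the selected text into a flat ul / ol by splitting on \n) might look roughly like the sketch below. The function name `linesToList` and its parameters are hypothetical, chosen only for illustration, and the sketch assumes the selected text is already safe to place inside `<li>` tags.

```javascript
// Hypothetical helper: wraps each non-empty line of the selection in <li>
// and surrounds the result with the requested list tag.
// Assumption: tag is either "ul" or "ol".
function linesToList(selectedText, tag) {
  const items = selectedText
    .split("\n")                          // one candidate item per line
    .map((line) => line.trim())           // ignore surrounding whitespace
    .filter((line) => line !== "")        // skip blank lines
    .map((line) => "<li>" + line + "</li>");
  if (items.length === 0) return "";      // nothing selected, nothing to wrap
  return "<" + tag + ">" + items.join("") + "</" + tag + ">";
}
```

With CodeMirror, the result could presumably be written back with something like `editor.replaceSelection(linesToList(editor.getSelection(), "ul"))`, where `editor` stands for whatever editor instance the page holds.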

Finally, as you can see in the screenshot above, I have a lot of texts organized with "1.2.3" numbering, and it would help a lot to be able to turn this kind of text into a nested list.

UPDATE (the specific needs)

Now the explanation of why I listed so many assumptions:

No need for the result to be correctly indented, just wrapped in the correct HTML tags (because afterwards, when the user hits the Save button, I run HTMLTidy on it, so it gets indented)
No need to locate the list inside other text; you can assume I already have only the list (because I run the code over the text the user selected in the editor, so I can assume they selected the correct list)
No need for great performance: regexp, iteration… whatever works is fine (as this is human-driven, point and click, I don't mind whether it takes 0.0001 seconds per use or 0.1)
No need for a language-specific solution: PHP, Python, JavaScript, pseudocode… anything is fine (I intend to use it in JavaScript/jQuery, but what I need is just the logic, as I'm blocked… I can translate it if the solution is in another language)
You can assume " " (a space) is the only separator after the "1.2.3" list number (as that covers 99% of my texts)
You can assume the lines are already in the correct order; no need to sort them (as you can see in the screenshot, the text is entered by humans, and I assume they entered it in the correct order)
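
Under the assumptions listed above (a single space after the number, lines already in order, indentation left to HTMLTidy), the logic could be sketched in JavaScript roughly as follows. This is only one possible approach, tracking the current nesting depth line by line; the name `numberedTextToHtml` is hypothetical, and the sketch additionally assumes that lines without a numeric prefix can be skipped and that the item text is already valid inside an `<li>`.

```javascript
// Hypothetical name. Converts "1.2.3 Text" lines into nested <ul> markup.
function numberedTextToHtml(text) {
  let depth = 0;   // number of <ul> elements currently open
  let html = "";
  for (const rawLine of text.split("\n")) {
    // Numeric prefix such as "2.2.1", one space, then the item text.
    const m = /^(\d+(?:\.\d+)*) (.*)$/.exec(rawLine.trim());
    if (!m) continue; // assumption: lines without a number prefix are skipped
    const level = m[1].split(".").length; // "2.2.1" -> 3
    const item = m[2];
    if (level > depth) {
      // Deeper than the previous line: open one list and item per new level.
      html += "<ul><li>".repeat(level - depth) + item;
    } else {
      // Same level or shallower: close the finished sublists,
      // then close the previous item and open a sibling.
      html += "</li></ul>".repeat(depth - level) + "</li><li>" + item;
    }
    depth = level;
  }
  // Close whatever is still open after the last line.
  return html + "</li></ul>".repeat(depth);
}

// Example usage with a shortened version of the list from the question.
console.log(numberedTextToHtml("1 Foo\n1.1 Foo 1\n2 Bar"));
```

The output is a single unindented string, which matches the stated plan of leaving indentation to HTMLTidy on save.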

Sorry again for not being clear enough. This is my first question on Stack Overflow, and I didn't realize it would look like homework; my fault.


1 Answer

Editorial Team · Answer 1 · 2026-05-31T19:31:42+00:00

Just for funsies, I went ahead and wrote a solution to your problem using PHP:

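// Callback for preg_replace_callback: $m[0] is the whole line,
// $m[1] is its leading "1.2.3" number prefix.
function helper_func($m)
{
    static $r = 0; // nesting depth reached by the previous line
    $o = '';
    // Depth of this line = number of digit groups in its prefix.
    $l = preg_match_all("#\d+#", $m[1], $n);
    // Shallower than before: close one item and its list per level.
    while ($l < $r) {
        $r--;
        $o .= '</li></ul>';
    }
    // Same depth (possibly after closing): start a sibling item.
    if ($l == $r) {
        return $l == 0 ? $o . $m[0] : $o . '</li><li>' . $m[0];
    }
    // Deeper than before: open one list and item per level.
    $o = $m[0];
    while ($l > $r) {
        $r++;
        $o = '<ul><li>' . $o;
    }
    return $o;
}
echo preg_replace_callback("#^([0-9.]*).*$#m", "helper_func", $input);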

However, in deference to this being homework, I included a deliberate error: for it to come out correctly, you need to make a single small change to $input before passing it in… Have fun 🙂
