I’m extracting some data from a text document organized like this:
- "day 1"
  - "Person 1"
    - "Bill 1"
  - "Person 2"
    - "Bill 2"
I can read this into a list of tuples that looks like this:
[(0,["day 1"]),(1,["Person 1"]),(2,["Bill 1"]),(1,["Person 2"]),(2,["Bill 2"])]
Where the first item of each tuple indicates the heading level, and the second item the information associated with each heading.
My question is, how can I get a list of items that looks like this:
[["day 1","Person 1","Bill 1"],["day 1","Person 2","Bill 2"]]
I.e. one list per deepest nested item, containing all the information from the headings above it.
The closest I’ve gotten is this:
    f [] = []
    f (x:xs) = row : f rest
      where
        leaves = takeWhile (\i -> fst i > fst x) xs
        rest   = dropWhile (\i -> fst i > fst x) xs
        row    = concat $ map (\i -> (snd x):[snd i]) leaves
Which gives me this:
[[["day 1"],["Person 1"],["day 1"],["Bill 1"],["day 1"],["Person 2"],["day 1"],["Bill 2"]]]
I’d like the solution to work for any number of levels.
P.s. I’m new to Haskell. I have a sense that I could/should use a tree to store the data, but I can’t wrap my head around it. I also could not think of a better title.
I seem to have solved it.
I didn’t test it much though.
The idea is to notice the recursive pattern. This function takes the first element (N, S) of the list and then gathers all following entries at deeper levels, up to the next element at level N or shallower, into a list ‘children’. If there are no children, we are at the deepest level and S forms an output row by itself. If there are some, the function recurses on them and S is prepended to each of the resulting rows. The remaining entries are then processed the same way.
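The description above can be sketched roughly as follows. This is a minimal sketch rather than the definitive version: the names `Entry` and `flatten` are made up for illustration, and it assumes the input is well formed (each entry's level is at most one deeper than the entry before it).

```haskell
-- Hypothetical names for illustration: Entry, flatten.
type Entry = (Int, [String])

flatten :: [Entry] -> [[String]]
flatten [] = []
flatten ((n, s) : xs)
  -- No deeper entries follow: s is a leaf and forms a row by itself.
  | null children = s : flatten rest
  -- Otherwise recurse into the children and prepend s to each of their rows.
  | otherwise     = map (s ++) (flatten children) ++ flatten rest
  where
    -- children: the run of entries nested deeper than level n;
    -- rest: everything from the next entry at level n or shallower.
    (children, rest) = span ((> n) . fst) xs
```

Because the second component of each tuple is already a list, `s ++` concatenates the headings, so the rows come out as flat lists of strings like the ones asked for.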
As for why your algorithm doesn’t work, the problem is mostly in row. Notice that you are not descending recursively.
Trees can be used too.
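The algorithm is essentially the same, just split in two. The first half (gathering the children of each entry) goes into a first function that builds the tree, and the second half (prepending each heading to the rows of its children) goes into a second function that walks the tree.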
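One way the tree version might look is sketched below. The names `Tree`, `toForest`, `paths` and `flattenViaTree` are assumptions made for this example. The tree type is defined locally to keep the sketch self-contained; `Data.Tree` from the containers package provides a similar rose tree.

```haskell
-- A rose tree: a label plus any number of subtrees (hypothetical local type).
data Tree a = Node a [Tree a]

-- First half: attach to each entry the deeper-level entries that follow it.
toForest :: [(Int, a)] -> [Tree a]
toForest [] = []
toForest ((n, s) : xs) = Node s (toForest children) : toForest rest
  where
    (children, rest) = span ((> n) . fst) xs

-- Second half: one row per leaf, collecting the labels on the way down.
paths :: Tree [a] -> [[a]]
paths (Node s []) = [s]
paths (Node s ts) = map (s ++) (concatMap paths ts)

-- Both halves combined: build the forest, then list every root-to-leaf path.
flattenViaTree :: [(Int, [a])] -> [[a]]
flattenViaTree = concatMap paths . toForest
```

Keeping the intermediate forest around can be handy if the data is needed for anything other than flattening, which is the main reason to prefer this variant over the direct one.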