The Archive Base
Editorial Team
Asked: May 12, 2026 (2026-05-12T00:37:37+00:00)

See updated input and output data at Edit-1.

What I am trying to accomplish is turning

+ 1
 + 1.1
  + 1.1.1
   - 1.1.1.1
   - 1.1.1.2
 + 1.2
  - 1.2.1
  - 1.2.2
 - 1.3
+ 2
- 3

into a Python data structure such as

[{'1': [{'1.1': {'1.1.1': ['1.1.1.1', '1.1.1.2']}, '1.2': ['1.2.1', '1.2.2']}, '1.3'], '2': {}}, ['3',]]

I’ve looked at many different wiki markup languages, Markdown, reStructuredText, etc., but they are all too complicated for me to see how they work, since they have to cover a large number of tags and syntax. (I would only need the “list” parts of most of these, converted to Python instead of HTML, of course.)

I’ve also taken a look at tokenizers, lexers and parsers, but again they are much more complicated than I need and than I can understand.

I have no idea where to begin and would appreciate any help on this subject. Thanks.

Edit-1: Yes, the character at the beginning of the line matters. From the required output before and now, it can be seen that the * denotes a root node with children, the + has children, and the - has no children (root or otherwise) and is just extra information pertaining to that node. The * is not important and can be interchanged with + (I can get root status other ways).

Therefore the new requirement is to use only * to denote a node with or without children, while - cannot have children. I’ve also changed it so the key isn’t the text after the *, since that will no doubt change later to an actual title.

For example

* 1
 * 1.1
 * 1.2
  - Note for 1.2
* 2
* 3
- Note for root

would give

[{'title': '1', 'children': [{'title': '1.1', 'children': []}, {'title': '1.2', 'children': []}]}, {'title': '2', 'children': [], 'notes': ['Note for 1.2', ]}, {'title': '3', 'children': []}, 'Note for root']

Or if you have another idea for representing the outline in Python, then bring it forward.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 12:37 am (2026-05-12T00:37:37+00:00)

    Edit: thanks to the clarification and change in the spec, I’ve edited my code, still using an explicit Node class as an intermediate step for clarity. The logic is to turn the list of lines into a list of nodes, then turn that list of nodes into a tree (by using their indent attribute appropriately), then print that tree in a readable form, and finally build the desired Python structure and print it. The tree printout is just a debugging aid to check that the tree is well constructed; it can be commented out in the final version of the script, which will of course take the lines from a file rather than having them hardcoded. Here’s the code; as we’ll see afterwards, the result is almost as the OP specifies, with one exception:

    import sys


    class Node:
        """One '*' line of the outline, plus the '-' notes that follow it."""

        def __init__(self, title, indent):
            self.title = title
            self.indent = indent
            self.children = []
            self.notes = []
            self.parent = None

        def __repr__(self):
            return 'Node(%s, %s, %r, %s)' % (
                self.indent, self.parent, self.title, self.notes)

        def aspython(self):
            # The 'notes' entry is only present when the node has notes.
            result = dict(title=self.title, children=topython(self.children))
            if self.notes:
                result['notes'] = self.notes
            return result


    def print_tree(node):
        # Debug helper: one leading space per indent level.
        print(' ' * node.indent, node.title)
        for subnode in node.children:
            print_tree(subnode)
        for note in node.notes:
            print(' ' * node.indent, 'Note:', note)


    def topython(nodelist):
        return [node.aspython() for node in nodelist]


    def lines_to_tree(lines):
        # Pass 1: turn the lines into a flat list of nodes.
        nodes = []
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            indent = len(line) - len(line.lstrip())
            parts = line.strip().split(None, 1)
            marker = parts[0]
            body = parts[1] if len(parts) > 1 else ''
            if marker == '*':
                nodes.append(Node(body, indent))
            elif marker == '-':
                # A note always goes to the most recent node, whatever its indent.
                if nodes:
                    nodes[-1].notes.append(body)
                else:
                    print("Note %r precedes any node" % body, file=sys.stderr)
            else:
                print("Invalid marker %r" % marker, file=sys.stderr)

        # Pass 2: link the nodes into a tree under an artificial root.
        tree = Node('', -1)
        curr = tree
        for node in nodes:
            # Climb until curr is less indented than node, i.e. its parent.
            while node.indent <= curr.indent:
                curr = curr.parent
            node.parent = curr
            curr.children.append(node)
            curr = node

        return tree


    data = """\
    * 1
     * 1.1
     * 1.2
      - Note for 1.2
    * 2
    * 3
    - Note for root
    """.splitlines()


    def main():
        tree = lines_to_tree(data)
        print_tree(tree)
        print()
        alist = topython(tree.children)
        print(alist)


    if __name__ == '__main__':
        main()
    

    When run, this emits:

     1
      1.1
      1.2
      Note: Note for 1.2
     2
     3
     Note: Note for root
    
    [{'children': [{'children': [], 'title': '1.1'}, {'notes': ['Note for 1.2'], 'children': [], 'title': '1.2'}], 'title': '1'}, {'children': [], 'title': '2'}, {'notes': ['Note for root'], 'children': [], 'title': '3'}]
    

    Apart from the ordering of keys (which is immaterial; dicts only preserve insertion order from Python 3.7 on, so the order printed depends on the interpreter), this is almost as requested. The exception is that here all notes appear as dict entries with a key of notes and a value that’s a list of strings (the notes entry is omitted if the list would be empty, roughly as done in the example in the question).
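To check a result against the expected structure without caring about key order, one option is to rely on plain dict equality, or to serialize with sorted keys when a stable printout is wanted. This is only a small illustration and not part of the original answer; the names `expected` and `actual` are made up for the example.

```python
import json

# Two dicts with the same entries, written in a different key order.
expected = {'title': '1.2', 'children': [], 'notes': ['Note for 1.2']}
actual = {'notes': ['Note for 1.2'], 'children': [], 'title': '1.2'}

# Dict equality compares keys and values only; key order plays no role.
same = expected == actual

# For a stable, diff-friendly printout, sort the keys when serializing.
stable = json.dumps(actual, sort_keys=True)

print(same)    # True
print(stable)  # {"children": [], "notes": ["Note for 1.2"], "title": "1.2"}
```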

    In the current version of the question, how to represent the notes is slightly unclear: one note appears as a stand-alone string, others as entries whose value is a string (instead of a list of strings, as I’m using). It’s not clear what is supposed to determine that a note must appear as a stand-alone string in one case and as a dict entry in all others, so the scheme I’m using is more regular. And if a note is a single string rather than a list, would it be an error for more than one note to appear for a node? In that regard, the scheme I’m using is also more general: it lets a node have any number of notes from 0 up, instead of just 0 or 1 as apparently implied in the question.
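If the stand-alone string for a root-level note is wanted after all, one possible rule is to attach each note by indentation rather than to the most recent node. The sketch below assumes, and this is my reading rather than anything stated in the question, that a note belongs to the nearest preceding node that is strictly less indented than the note, and that a note with no such node goes into the top-level list as a bare string. The function name `outline_to_python` is hypothetical, and the sketch builds the dicts directly with a stack instead of going through a Node class.

```python
def outline_to_python(lines):
    """Build the nested structure, attaching each note by its indentation."""
    top = []    # top-level list: node dicts and root-level note strings
    stack = []  # currently open nodes as (indent, node_dict), outermost first
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip())
        parts = text.split(None, 1)
        marker = parts[0]
        body = parts[1] if len(parts) > 1 else ''
        # Close every open node that is not strictly less indented than
        # this line; whatever remains on top of the stack is its owner.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if marker == '*':
            node = {'title': body, 'children': []}
            siblings = stack[-1][1]['children'] if stack else top
            siblings.append(node)
            stack.append((indent, node))
        elif marker == '-':
            if stack:
                # Assumption: the note belongs to the enclosing node.
                stack[-1][1].setdefault('notes', []).append(body)
            else:
                # Assumption: no enclosing node means a root-level note.
                top.append(body)
        else:
            raise ValueError('Invalid marker %r' % marker)
    return top


sample = """\
* 1
 * 1.1
 * 1.2
  - Note for 1.2
* 2
* 3
- Note for root
""".splitlines()

result = outline_to_python(sample)
print(result)
```

With the sample input, "Note for 1.2" lands on the 1.2 node and "Note for root" becomes the last item of the top-level list. Under this rule a note written at the same indentation as a node belongs to that node's parent, not to the node itself, so the input has to indent notes one level deeper than their owner.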

    Having written so much code (the pre-edit answer was about as long, and helped clarify and change the specs) to provide what I hope is 99% of the desired solution, I hope this satisfies the original poster, since the last few tweaks to code and/or specs to make them match each other should be easy for him to do!
