I would like to generate a graphical sitemap for my website. There are two stages, as far as I can tell:
- crawl the website and analyse the link relationships to extract the tree structure
- generate a visually pleasing render of the tree
Does anyone have advice or experience with achieving this, or know of existing work I can build on (ideally in Python)?
I came across some nice CSS for rendering the tree, but it only works for 3 levels.
Thanks
Here is a Python web crawler, which should make a good starting point. Your general strategy is this:
The reason you need to do all this is, as leonm noted, that websites are graphs, not trees, and laying out a graph is a harder problem than a simple piece of JavaScript and CSS can handle. Graph layout is exactly what Graphviz is built for.
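To make that concrete, here is a rough sketch of the crawl-then-Graphviz idea using only the Python standard library. It is not the crawler linked above; the names `LinkParser`, `fetch_html`, `crawl`, `to_dot` and the `max_pages` cap are my own placeholders, and it assumes pages are UTF-8 HTML on a single host. It collects the links as graph edges and writes them out as DOT text for Graphviz to lay out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page (assumption: UTF-8; bad bytes are replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=200):
    """Breadth-first crawl of one host; returns a set of (source, target) edges."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    edges = set()
    while queue:
        page = queue.popleft()
        try:
            html = fetch(page)
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(page, href)).url
            if urlparse(target).netloc != host or target == page:
                continue  # ignore external links and self-links
            edges.add((page, target))
            # max_pages caps how many distinct pages get queued for fetching.
            if target not in seen and len(seen) < max_pages:
                seen.add(target)
                queue.append(target)
    return edges


def to_dot(edges):
    """Render the edges as a Graphviz DOT digraph (as text)."""
    def quote(url):
        return '"' + url.replace('"', '\\"') + '"'

    lines = ["digraph sitemap {", "  rankdir=LR;"]
    for source, target in sorted(edges):
        lines.append(f"  {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)
```

Write the result of `to_dot(crawl("http://your-site/"))` to a file such as `sitemap.dot`, then render it with something like `dot -Tsvg sitemap.dot -o sitemap.svg`. A real crawler would also want to respect robots.txt, throttle requests, and skip non-HTML responses, none of which this sketch does.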