I am just trying to write a small web page that parses some text using a regular expression and returns the resulting matches in a table. This is the first time I’ve used Python for web development, and I have to say, it looks messy.
My question is why do I only get output for the last match in my data set? I figure it has to be because the nested loops aren’t formatted correctly.
Here’s the data I provide:
groups is just an id corresponding to each regex group, followed by its name, which provides the header for the table.
pattern is something like:

    (\d+)\s(\S+)\s(\S+)$

and data:

    12345 SOME USER
    09876 SOMEONE ELSE
    54678 ANOTHER USER
My simple page:

    <%
    import re

    pattern = form['pattern']
    p = re.compile(pattern)
    data = form['data']
    matches = p.finditer(data)

    lines = form['groups'].split('\n')
    groupids ={}
    for line in lines:
        key, val = line.split(' ')
        groupids[int(key.strip())] = val.strip()
    %>

    <html>
    <table style='border-width:1px;border-style:solid;width:60%;'>
        <tr>
    <%
    for k,v in groupids.iteritems():%>
            <th style='width:30px;text-align:center'><%= v %></th>
    <%
    # end
    %>
        </tr>
    <%
    for match in matches:
        #begin
    %><tr>
    <%
    for i in range(1, len(match.groups())+1):
        #begin
    %>
            <td style='border-style:solid;border-width:1px;border-spacing:0px;text-align:center;'><%= match.group(i) %></td>
    <%
        #end
    # end
    %>
        </tr>
    </table>
    </html>
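A side note on the header row: the page builds it by iterating a dict, and Python 2 does not guarantee the order of dict items, so the headers are not certain to line up with group 1, 2, 3. Also, line.split(' ') assumes exactly one space per line. Below is a minimal sketch of a more defensive parse, written for Python 3 rather than the Python 2 used in the page. The helper name parse_groups is made up for illustration, and the \r\n line endings are an assumption about what a form might submit.

```python
def parse_groups(text):
    """Map regex group numbers to column names, one 'id name' pair per line."""
    group_names = {}
    for line in text.splitlines():       # splitlines() also copes with \r\n
        if not line.strip():
            continue                     # ignore blank lines
        key, name = line.split(None, 1)  # split on the first run of whitespace only
        group_names[int(key)] = name.strip()
    return group_names


# Assumed sample input, using the same names as the test further down.
groups = "1 PIN\r\n2 FNAME\r\n3 LNAME\r\n"
group_names = parse_groups(groups)

# Sort by group number so the headers line up with match.group(1), group(2), ...
headers = [group_names[i] for i in sorted(group_names)]
print(headers)
```

Sorting the keys makes the column order explicit instead of relying on how the dict happens to iterate.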
Edit
Below is the test I ran
Code:

    import re

    pattern = '(\d\d\d\d\d)\s(\S+)\s(\S+)'
    p = re.compile(pattern)
    data = '''12345 TESTS USERS
    34567 TESTS USERS
    56789 TESTS USERS'''
    groups = '''1 PIN
    2 FNAME
    3 LNAME'''

    matches = p.finditer(data)

    lines = groups.split('\n')
    print lines
    groupids ={}
    for line in lines:
        key, val = line.split(' ')
        groupids[int(key.strip())] = val.strip()

    for k,v in groupids.iteritems():
        print '%s\t' % v,
    print ''

    for match in matches:
        for i in range(1, len(match.groups())+1):
            print '%s\t' % match.group(i),
        print ''
Output:

    PIN     FNAME   LNAME
    12345   TESTS   USERS
    34567   TESTS   USERS
    56789   TESTS   USERS
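One difference between this test and the page may be worth checking separately: the test pattern has no trailing $, while the pattern given for the page does. The sketch below (Python 3, and assuming the data arrives as one record per line separated by \n) shows how $ behaves with and without re.MULTILINE. It is only an illustration of the regex behaviour, not a claim about what the page actually receives from the form.

```python
import re

# Assumed input: the sample data from the question, one record per line.
data = "12345 SOME USER\n09876 SOMEONE ELSE\n54678 ANOTHER USER"
pattern = r"(\d+)\s(\S+)\s(\S+)$"

# Without re.MULTILINE, `$` only matches at the very end of the string
# (or just before a final newline), so only the last record can match.
end_only = [m.groups() for m in re.finditer(pattern, data)]

# With re.MULTILINE, `$` also matches at the end of every line.
per_line = [m.groups() for m in re.finditer(pattern, data, re.MULTILINE)]

print(end_only)
print(per_line)
```

Under these assumptions the first list holds only the 54678 record and the second holds all three.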
Yeah, you haven’t got a nested loop there. Instead you’ve got a loop over matches that outputs “<tr>\n”, then a second loop over range(...) that only runs after the first has finished. The second is not inside the first because it isn’t indented to say so.

From the doc, I think what you need to be saying is:
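Setting PSP's syntax aside, the loop structure the page needs can be sketched in plain Python 3. This is an illustration of the nesting only, not PSP template code: render_rows is a made-up helper name, the styling is left out, and the html.escape call is an addition that the original page does not have.

```python
import re
from html import escape


def render_rows(matches):
    """Return one <tr> per match, with one <td> per regex group."""
    rows = []
    for match in matches:
        cells = []
        # Indented under the outer loop, so it runs once for every match
        # instead of once after the outer loop has finished.
        for i in range(1, len(match.groups()) + 1):
            cells.append("<td>%s</td>" % escape(match.group(i)))
        rows.append("<tr>%s</tr>" % "".join(cells))
    return "\n".join(rows)


# Same pattern and data as the test in the question.
data = "12345 TESTS USERS\n34567 TESTS USERS\n56789 TESTS USERS"
matches = re.finditer(r"(\d\d\d\d\d)\s(\S+)\s(\S+)", data)
table_rows = render_rows(matches)
print(table_rows)
```

If the inner for were moved out to the same indentation level as the outer one, it would run only once, using whatever match was left over from the last iteration, which is the symptom described in the question.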
But I can only agree with your “messy” comment: if PSP is requiring that you torture the indenting of your HTML to fit the structure of your Python like this, it is really Doing It Wrong and you should look for another, less awful templating syntax. There are many, many templating languages for Python that have a more sensible syntax for control structures. As an example, in the one I use the above would look like: