I have some sample data like so: MADISON COUNTY,,,,,,,,,,,,, London, City of,,,,,,,,,,,,597,519 2.1,mill /s,(replacement),for,5


Asked: May 28, 2026


I have some sample data like so:

MADISON COUNTY,,,,,,,,,,,,, "London, City of",,,,,,,,,,,,597,519
2.1,mill /s,(replacement),for,5 years,",",commencing in,2007,",",first due in calendar year,2008,",",, for current operating expenses
-,,,,,,,,,,,,, London Public Library District,,,,,,,,,,,,716,869 1.2,mill /s,(replacement),"& increase of 1.7 mills, for 15 years, commencing in 2007, first due in",,,,,,,,,, "calendar year 2008, for
current expenses -",,,,,,,,,,,,, "Range, Township of",,,,,,,,,,,,62,13
1.7,mill /s,(renewal),for,5 years,",",commencing in,2007,",",first due in calendar year,2008,",",, for fire protection -,,,,,,,,,,,,,

What I need at the end is a list of all “Towns”, so the output should be:

["London, City of", "London Public Library District", "Range, Township of"]

I’m struggling here because I don’t really know how to narrow the data down to just these fields. As you can see, the runs of commas are a pretty good start, but there are also unwanted runs of commas that don’t follow the pattern. Originally I thought I would match any string shorter than 100 characters with 5 commas on both sides, but that is defeated by the arbitrary commas here:

first due in",,,,,,,,,, "cale

Any clues?

Further, the data is generally in this format:

SOME COUNTY,,,,,,,,,,,,, SOME TOWN,,,,,,,,,,,,some long string possibly with commas
,,,,,,,,,,,,, SOME TOWN,,,,,,,,,,,,some long string possibly with commas ... etc


1 Answer

Editorial Team · Answer 1 · 2026-05-28T05:24:44+00:00

It’s hard to tell from your sample data as I think it has extra line breaks in it, but from your summary of the data format it seems that the Town is the 14th column in each line.

As the data is in CSV format, you don’t need a regular expression; you can use Python’s csv module to parse it instead. The 14th column is index 13 of each parsed row, so extracting the town names should be as easy as:

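import csv

# newline='' is the recommended way to open files for the csv module
with open('data.csv', newline='') as f:
    for row in csv.reader(f):
        # The 14th column is index 13; skip rows too short to contain it
        if len(row) > 13:
            print(row[13])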
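
If the file really does look like your sample, with a space before the opening quote in `, "London, City of"`, the default reader will not treat that field as quoted and will split it at the inner comma. Here is a minimal sketch that allows for this, assuming the town is always the 14th column as in your format summary. The helper name `extract_towns` and the inline sample rows are only for illustration, and it has not been run against your real file:

```python
import csv

TOWN_COLUMN = 13  # zero-based index of the 14th column (assumed from the format summary)

def extract_towns(lines, column=TOWN_COLUMN):
    """Collect the non-empty values found in the given column."""
    towns = []
    # skipinitialspace=True ignores spaces right after a delimiter, so a
    # quoted field written as `, "London, City of"` is parsed as one field.
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) > column:          # ignore rows too short to hold a town
            town = row[column].strip()
            if town:                   # ignore rows where the column is empty
                towns.append(town)
    return towns

# Illustrative rows built to match the described layout:
# county, 13 commas, town, 12 commas, trailing values.
sample = [
    "MADISON COUNTY" + "," * 13 + ' "London, City of"' + "," * 12 + "597,519",
    "," * 13 + " London Public Library District" + "," * 12 + "716,869",
]

print(extract_towns(sample))
# ['London, City of', 'London Public Library District']
```

To read from a file instead, pass the open file object, e.g. `extract_towns(open('data.csv', newline=''))`. If the town turns out not to be in a fixed column because of the extra line breaks, this approach would need adjusting.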
