Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9092965
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 16, 20262026-06-16T22:58:12+00:00 2026-06-16T22:58:12+00:00

Good evening, I have used BeautifulSoup to extract some data from a website as

  • 0

Good evening, I have used BeautifulSoup to extract some data from a website as follows:

from BeautifulSoup import BeautifulSoup
from urllib2 import urlopen

soup = BeautifulSoup(urlopen('http://www.fsa.gov.uk/about/media/facts/fines/2002'))

table = soup.findAll('table', attrs={ "class" : "table-horizontal-line"})

print table

This gives the following output:

[<table width="70%" class="table-horizontal-line">
<tr>
<th>Amount</th>
<th>Company or person fined</th>
<th>Date</th>
<th>What was the fine for?</th>
<th>Compensation</th>
</tr>
<tr>
<td><a name="_Hlk74714257" id="_Hlk74714257">&#160;</a>£4,000,000</td>
<td><a href="/pages/library/communication/pr/2002/124.shtml">Credit Suisse First Boston International </a></td>
<td>19/12/02</td>
<td>Attempting to mislead the Japanese regulatory and tax authorities</td>
<td>&#160;</td>
</tr>
<tr>
<td>£750,000</td>
<td><a href="/pages/library/communication/pr/2002/123.shtml">Royal Bank of Scotland plc</a></td>
<td>17/12/02</td>
<td>Breaches of money laundering rules</td>
<td>&#160;</td>
</tr>
<tr>
<td>£1,000,000</td>
<td><a href="/pages/library/communication/pr/2002/118.shtml">Abbey Life Assurance Company ltd</a></td>
<td>04/12/02</td>
<td>Mortgage endowment mis-selling and other failings</td>
<td>Compensation estimated to be between £120 and £160 million</td>
</tr>
<tr>
<td>£1,350,000</td>
<td><a href="/pages/library/communication/pr/2002/087.shtml">Royal &#38; Sun Alliance Group</a></td>
<td>27/08/02</td>
<td>Pension review failings</td>
<td>Redress exceeding £32 million</td>
</tr>
<tr>
<td>£4,000</td>
<td><a href="/pubs/final/ft-inv-ins_7aug02.pdf" target="_blank">F T Investment &#38; Insurance Consultants</a></td>
<td>07/08/02</td>
<td>Pensions review failings</td>
<td>&#160;</td>
</tr>
<tr>
<td>£75,000</td>
<td><a href="/pubs/final/spe_18jun02.pdf" target="_blank">Seymour Pierce Ellis ltd</a></td>
<td>18/06/02</td>
<td>Breaches of FSA Principles ("skill, care and diligence" and "internal organization")</td>
<td>&#160;</td>
</tr>
<tr>
<td>£120,000</td>
<td><a href="/pages/library/communication/pr/2002/051.shtml">Ward Consultancy plc</a></td>
<td>14/05/02</td>
<td>Pension review failings</td>
<td>&#160;</td>
</tr>
<tr>
<td>£140,000</td>
<td><a href="/pages/library/communication/pr/2002/036.shtml">Shawlands Financial Services ltd</a> - formerly Frizzell Life &#38; Financial Planning ltd)</td>
<td>11/04/02</td>
<td>Record keeping and associated compliance breaches</td>
<td>&#160;</td>
</tr>
<tr>
<td>£5,000</td>
<td><a href="/pubs/final/woodwards_4apr02.pdf" target="_blank">Woodward's Independent Financial Advisers</a></td>
<td>04/04/02</td>
<td>Pensions review failings</td>
<td>&#160;</td>
</tr>
</table>]

I would like to export this into CSV whilst keeping the table structure as displayed on the website, is this possible and if so how?

Thanks in advance for the help.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-16T22:58:13+00:00Added an answer on June 16, 2026 at 10:58 pm

    Here is a basic thing you can try. This makes the assumption that the headers are all in the <th> tags, and that all subsequent data is in the <td> tags. This works in the single case you provided, but I’m sure adjustments will be necessary if other cases 🙂 The general idea is that once you find your table (here using find to pull the first one), we get the headers by iterating through all th elements, storing them in a list. Then, we create a rows list that will contain lists representing the contents of each row. This is populated by finding all td elements under tr tags and taking the text, encoding it in UTF-8 (from Unicode). You then open a CSV, writing the headers first and then writing all of the rows, but using(row for row in rows if row)` to eliminate any blank rows):

    In [117]: import csv
    
    In [118]: from bs4 import BeautifulSoup
    
    In [119]: from urllib2 import urlopen
    
    In [120]: soup = BeautifulSoup(urlopen('http://www.fsa.gov.uk/about/media/facts/fines/2002'))
    
    In [121]: table = soup.find('table', attrs={ "class" : "table-horizontal-line"})
    
    In [122]: headers = [header.text for header in table.find_all('th')]
    
    In [123]: rows = []
    
    In [124]: for row in table.find_all('tr'):
       .....:     rows.append([val.text.encode('utf8') for val in row.find_all('td')])
       .....: 
    
    In [125]: with open('output_file.csv', 'wb') as f:
       .....:     writer = csv.writer(f)
       .....:     writer.writerow(headers)
       .....:     writer.writerows(row for row in rows if row)
       .....: 
    
    In [126]: cat output_file.csv
    Amount,Company or person fined,Date,What was the fine for?,Compensation
    " £4,000,000",Credit Suisse First Boston International ,19/12/02,Attempting to mislead the Japanese regulatory and tax authorities, 
    "£750,000",Royal Bank of Scotland plc,17/12/02,Breaches of money laundering rules, 
    "£1,000,000",Abbey Life Assurance Company ltd,04/12/02,Mortgage endowment mis-selling and other failings,Compensation estimated to be between £120 and £160 million
    "£1,350,000",Royal & Sun Alliance Group,27/08/02,Pension review failings,Redress exceeding £32 million
    "£4,000",F T Investment & Insurance Consultants,07/08/02,Pensions review failings, 
    "£75,000",Seymour Pierce Ellis ltd,18/06/02,"Breaches of FSA Principles (""skill, care and diligence"" and ""internal organization"")", 
    "£120,000",Ward Consultancy plc,14/05/02,Pension review failings, 
    "£140,000",Shawlands Financial Services ltd - formerly Frizzell Life & Financial Planning ltd),11/04/02,Record keeping and associated compliance breaches, 
    "£5,000",Woodward's Independent Financial Advisers,04/04/02,Pensions review failings, 
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Good evening, I have a Dedicated Server in a data center and i want
good evening, i have a mysql table that contain the users data and i
Good evening. Can You help me please with some batch file? I have an
Good evening/morning/after/noon. I have an ASP.net 3.5 website and I am using vb.net in
Good evening, I have a website with a current css optimized for a desktop
Good evening everyone! I am working on learning some java and I have made
Good evening all! i have an Ajax call that loads 5 'echo's' from a
Good evening/morning/after/noon. I have an ASP.net 3.5 website and I am using vb.net in
Good evening, I would like to have a navigation bar which is centralised to
Good evening, I have been trying to wrap my head about this, below, code

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.