I am trying to parse a simple html table using beautifulsoup but I have

Editorial Team

Asked: June 7, 2026 (2026-06-07T16:39:46+00:00)

I am trying to parse a simple HTML table using BeautifulSoup, but I have some problems.

Here is my input:

<table id="people" class="tt" width="99%" border="0" cellpadding="0" cellspacing="1">
 <tr>
  <td colspan="3" bgcolor="#d3d3d3">
   <p align="center" style="border: 1px solid #c0c0c0; padding: 0.02in">
    <a name="faculty">
    </a>
    <b>
     Faculty
    </b>
   </p>
  </td>
 </tr>
 <tr>
  <td>
   <p align="center">
    <font color="#000080">
     <a href="http://www.website.com/%7Empop">
      <font color="#000080">
       <img src="images/mpop.jpg" name="graphics1" align="bottom" width="70" height="85" border="1" />
      </font>
     </a>
    </font>
   </p>
  </td>
  <td>
   <p>
    <b>
     John Doe, Ph.D.
    </b>
    <br />
    Associate Professor, Computer
                Science
    <br />

   </p>
  </td>
  <td>
   <p>
    Office:  Sciences Bldg.
    <br />
    Phone:
                xxx-xxx-xxxx
    <br />
    jd [at] website.com
    <br />
       </p>
  </td>
 </tr>
 <tr>
  <td>
   <p align="center">
    <font color="#000080">
     <a href="http://www.website.com/%7Ercolwell">
      <font color="#000080">
       <img src="images/rcolwell.jpg" name="graphics2" align="bottom" width="70" height="97" border="1" />
      </font>
     </a>
    </font>
   </p>
  </td>
  <td>
   <p>
    <b>
     Jane Doe, Ph.D.
    </b>
    <br />
     Professor
    <br />
  School of Public Health
    <br />
   </p>
  </td>
  <td>
   <p>
    Sciences Bldg
    <br />
    jd [at]
                website.com
    <br />

    </a>
   </p>
  </td>
 </tr>
</table>

Here is my code:

t = soup.findAll("table",id="people")
for table in t:
    rows = table.findAll("tr")
    for tr in rows:
        cols = tr.findAll("td")
        for td in cols:
            print(str(td.find(text=True))) # tried also print(td.find(text=True))
            print(",")
        print("\n")

This generates output containing only commas, without the actual text. When I use print(td) instead, I do see the information I need, but as HTML with all the tags. Can anyone point me to the right thing to do here? I want to extract only the cell content.

Cheers

1 Answer

Editorial Team · Answer 1 · 2026-06-07T16:39:48+00:00

Maybe you are looking for something like this:

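# BeautifulSoup 3 import; with BeautifulSoup 4 this would be: from bs4 import BeautifulSoup
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup("<table id=people><tr><td>x<a>y</a>z</td><td>x<a>y</a>z</td></tr></table>")
t = soup.findAll("table", id="people")
for table in t:
    rows = table.findAll("tr")
    for tr in rows:
        cols = tr.findAll("td")
        # td.text concatenates all text inside the cell, including text in nested tags
        print(','.join([td.text for td in cols]))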
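A likely reason the loop in the question prints only commas is that td.find(text=True) returns just the first text node of the cell, which in the indented input is the whitespace before the <p> tag. The sketch below shows one way to collect all the text per cell. It assumes BeautifulSoup 4 (the bs4 package) on Python 3 and the built-in "html.parser"; the helper name table_rows and the shortened sample markup are illustrative only and not part of the original post.

```python
from bs4 import BeautifulSoup

# Shortened sample modelled on the table in the question (illustrative only).
html = """
<table id="people">
 <tr><td><p><b>John Doe, Ph.D.</b><br />Associate Professor, Computer
            Science<br /></p></td>
     <td><p>Office:  Sciences Bldg.<br />Phone:
            xxx-xxx-xxxx<br /></p></td></tr>
</table>
"""


def table_rows(markup, table_id):
    """Return the text of every cell, one list per row (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for table in soup.find_all("table", id=table_id):
        for tr in table.find_all("tr"):
            cells = []
            for td in tr.find_all("td"):
                # get_text(" ") gathers every text node in the cell, joined by a
                # space; split() + join collapses newlines and indentation.
                cells.append(" ".join(td.get_text(" ").split()))
            rows.append(cells)
    return rows


for row in table_rows(html, "people"):
    print(",".join(row))
```

Because some cells contain commas themselves (for example "John Doe, Ph.D."), writing the rows with the standard csv module may be safer than a plain ",".join if the output is meant to be parsed again.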

Alternatively, you can use u''.join(map(unicode, td.contents)), which keeps the inner markup of each cell instead of only its text, depending on what exactly you want printed.
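To make the difference between the two approaches concrete, here is a small sketch under the assumption of Python 3 with BeautifulSoup 4, where str takes the place of unicode. The helper name inner_html is made up for this illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<table id=people><tr><td>x<a>y</a>z</td><td>x<a>y</a>z</td></tr></table>",
    "html.parser",
)


def inner_html(tag):
    # Python 3 counterpart of u''.join(map(unicode, tag.contents)):
    # each child is rendered back to markup, so nested tags are preserved.
    return "".join(str(child) for child in tag.contents)


cells = soup.find("table", id="people").find_all("td")
markup = [inner_html(td) for td in cells]   # inner markup of each cell
text = [td.get_text() for td in cells]      # text only, tags dropped

print(markup)
print(text)
```

The first list keeps the <a> tags inside each cell, while the second contains only the concatenated text, which is what the question asks for.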
