I am trying to process some files that are named xls and can be

Question


Asked: May 30, 2026 (2026-05-30T12:54:01+00:00)

I am trying to process some files that are named .xls and can be opened in Excel, but they are actually web archive files. There are some nested tables, and I want to work first with only the non-nested tables. I thought I could catch the non-nested tables by looking only for those tables whose parent element is the body tag, but table.getparent().tag == 'body' is not true for any of my tables. Even for the table in the snippet below, the tag of the parent element is div.

<html>
  <head>
    <META http-equiv=3DContent-Type content=3D'text/html; charset=utf-8'><script type=3Dtext/javascript src=3DShow.js>/* Do Not Remove This Comment */</script></head>
  <body>
    <table class=3Dreport id=3DID0EI>
      <tr>
        <th>

I checked and the body tag is closed as is the table tag.

table.getparent()

returns

     <Element div at 9f05f10>

Note: I am getting my tables by reading in the document as a string and following these general steps:

from lxml import html

myTree = html.fromstring(someString)
tables = myTree.cssselect('table')

1 Answer

Editorial Team · Answer 1 · 2026-05-30T12:54:02+00:00

XPath to the rescue:

from lxml import html

tree = html.fromstring(someString)
# every table, minus every table that sits inside another table
table_tops = set(tree.xpath('//table')) - set(tree.xpath('//table//table'))

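There is probably some fancy XPath (that some SO smarty will post) to do it in a single expression, but this should be super fast and easy to read. Note that the result is a set, so the tables are not kept in document order.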
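For what it is worth, the single-expression XPath mentioned above could look like the sketch below. It uses the standard `ancestor::` axis to keep only tables with no `table` ancestor. The `SAMPLE` markup, its `id` values, and the helper name `top_level_tables` are made up for illustration and are not from the original files.

```python
from lxml import html

# Hypothetical sample: two top-level tables, one of which holds a nested table.
SAMPLE = """
<html><body>
  <div>
    <table id="outer1"><tr><td>
      <table id="inner"><tr><td>nested</td></tr></table>
    </td></tr></table>
  </div>
  <table id="outer2"><tr><td>flat</td></tr></table>
</body></html>
"""


def top_level_tables(tree):
    """Return the tables that have no <table> ancestor, in document order."""
    return tree.xpath('//table[not(ancestor::table)]')


tree = html.fromstring(SAMPLE)
top_ids = [t.get('id') for t in top_level_tables(tree)]
print(top_ids)
```

Unlike the set difference, this returns a list in document order, which may matter if the tables need to be processed in sequence.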

Update: CSS version, same idea:

myTree = html.fromstring(someString)
table_tops = set(myTree.cssselect('table')) - set(myTree.cssselect('table table'))
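As for why the `getparent().tag == 'body'` test in the question never matches: a table wrapped in a `div` (or any other container) has that wrapper as its direct parent even when it is not nested in another table. A possible alternative, sketched here with made-up sample markup and a hypothetical helper `is_nested`, is to check every ancestor rather than only the parent, using lxml's `iterancestors()`.

```python
from lxml import html

# Hypothetical sample: the outer table sits inside a <div>, not directly in <body>.
SAMPLE = (
    "<html><body><div><table id='a'><tr><td>"
    "<table id='b'><tr><td>x</td></tr></table>"
    "</td></tr></table></div></body></html>"
)


def is_nested(table):
    """True if any ancestor of this element is a <table>."""
    # next() with a default yields None when there is no <table> ancestor
    return next(table.iterancestors('table'), None) is not None


tree = html.fromstring(SAMPLE)
flat = [t for t in tree.iter('table') if not is_nested(t)]
flat_ids = [t.get('id') for t in flat]
parent_tags = [t.getparent().tag for t in flat]
print(flat_ids, parent_tags)
```

Here the outer table is reported as non-nested even though its parent is a `div`, which is the case the direct-parent check misses.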
