That really depends on if it works like that properly.…

Question

0

Asked: May 15, 20262026-05-15T10:33:54+00:00 2026-05-15T10:33:54+00:00

If you are parsing html or xml (with python), and looking for certain tags,

0

If you are parsing html or xml (with python), and looking for certain tags, it can hurt performance to lower or uppercase an entire document so that your comparisons are accurate. What percentage (estimated) of xml and html docs use any upper case characters in their tags?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-15T10:33:55+00:00

I think you’re overly concerned about performance. If you’re talking about arbitrary web pages, 90% of them will be HTML, not XHTML, so you should do case-insensitive comparisons. Lowercasing a string is extremely fast, and should be less than 1% of the total time of your parser. If you’re not sure, carefully time your parser on a document that’s already all lowercase, with and without the lowercase conversions.

Even a pure-Python implementation of lower() would be negligible compared to the rest of the parsing, but it’s better than that – CPython implements lower() in C code, so it really is as fast as possible.

Remember, premature optimization is the root of all evil. Make your program correct first, then make it fast.

How to approach applying for a job at a company ...

How to handle personal stress caused by utterly incompetent and ...

What is a programmer’s life like?

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions