I’m trying to figure out how to strip content after the closing HTML tag

Question

Asked: May 28, 2026

I’m trying to figure out how to strip content after the closing HTML tag using only bash or common GNU tools. For example, given the following HTML template, what would be an efficient way to remove the trailing comment without touching the embedded comment and without using an external language such as Python?

<!DOCTYPE html>
<html>
<head>
 <title>Site | Page 1</title>
</head>
<body>

 <!-- Don't delete me! -->

</body>
</html>

<!--
Man, I really wish to vanish!
-->

The only thing I can come up with is to read the whole file into memory and process it there, i.e. something as archaic as finding the location of the closing HTML tag with regex magic, truncating everything after it, and writing the result back out to disk.

1 Answer

Editorial Team · Answer 1 · 2026-05-28T05:31:15+00:00

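Use `sed` to print everything from the first line through the first line that matches `</html>`, and nothing after it (`-n` suppresses the default output, so only the lines selected by `p` are printed):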

sed -n '1,/<\/html>/p' some.html > truncated.html

Example:

% sed -n '1,/<\/html>/p' some.html
<!DOCTYPE html>
<html>
<head>
 <title>Site | Page 1</title>
</head>
<body>

 <!-- Don't delete me! -->

</body>
</html>
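A closely related variant, offered here as a sketch rather than as part of the original answer, uses the `q` command so that `sed` stops reading as soon as it has printed the closing tag instead of scanning the rest of the file. The temporary directory and the file names below are hypothetical and exist only to make the example self-contained; it assumes `</html>` sits on a line of its own, as in the template from the question.

```shell
#!/bin/sh
# Hypothetical scratch location and file names, used only for this demo.
tmpdir=$(mktemp -d)
sample="$tmpdir/some.html"

# Recreate the template from the question, including the trailing comment.
cat > "$sample" <<'EOF'
<!DOCTYPE html>
<html>
<head>
 <title>Site | Page 1</title>
</head>
<body>

 <!-- Don't delete me! -->

</body>
</html>

<!--
Man, I really wish to vanish!
-->
EOF

# Default behaviour prints every line; q prints the matching line and then
# quits, so nothing after the first </html> is read or written.
sed '/<\/html>/q' "$sample" > "$tmpdir/truncated.html"

cat "$tmpdir/truncated.html"
```

With GNU `sed`, the same expression can be applied in place as `sed -i '/<\/html>/q' some.html`, which avoids the separate output file. Either form processes the input line by line, so the whole file never has to be held in memory.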
