I have a document that was converted from PDF to HTML for use on

Question

Asked: May 15, 2026

I have a document that was converted from PDF to HTML for use on a company website, where it will be referenced and indexed for search. I'm reformatting the converted document to meet my needs, and as part of that I'm trying to clean up the junk carried over from the PDF, such as page numbers, headers, and footers. Luckily, all of the lines that need to be removed come in blocks of 4 lines. Unfortunately, the blocks are not identical, so they cannot be removed with a simple literal replace: they contain numbers that increment with the page numbers. How can I remove blocks like the following example from my HTML file?

Title<br>
10<br>
<hr>
<A name=11></a>Footer<br>

I've tried many different regular expressions, but my skill in that area is limited and I can't find the proper syntax. I'm sure I'm missing something fairly easy, as it seems all I need is a wildcard replace for the two numbers in the code, with the rest matched literally.

Any help is appreciated.

1 Answer

Editorial Team · Answer 1 · 2026-05-15T09:44:17+00:00

The search and replace in Notepad++ is quite odd. I can't match newline characters in regular expression mode, although the documentation says:

As of v4.9 the Simple find/replace (control+h) has changed, allowing the use of \r \n and \t in regex mode and the extended mode.

I updated to the latest version, but it still doesn't work. Extended mode lets me find newlines, but it doesn't let me specify wildcards.

However, you can use macros to work around this problem.

1. Prepare a search that will find a unique passage (like Title<br>\r\n; here you can use the extended mode).
2. Start recording a macro.
3. Press F3 to use your search.
4. Mark the four lines and delete them.
5. Stop recording the macro … done!

Just replay it and it deletes what you wanted to delete.
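
If scripting outside the editor is an option, the wildcard replace the question asks for can also be sketched in Python. This is a different technique from the macro above, not a Notepad++ feature. It is a minimal sketch under a few assumptions: the literal text Title and Footer stands for whatever the real header and footer contain, each block starts at the beginning of a line, the line endings are either \r\n or \n, and the function name strip_page_blocks is made up for illustration.

```python
import re

# Pattern for one four-line block. \d+ is the "wildcard" for the two
# page numbers; everything else is matched literally. "Title" and
# "Footer" are the placeholders from the question (an assumption about
# the real file's contents).
BLOCK = re.compile(
    r"^Title<br>\r?\n"
    r"\d+<br>\r?\n"
    r"<hr>\r?\n"
    r"<A name=\d+></a>Footer<br>\r?\n?",
    flags=re.MULTILINE,
)


def strip_page_blocks(html: str) -> str:
    """Return html with every matching four-line block removed."""
    return BLOCK.sub("", html)


if __name__ == "__main__":
    sample = (
        "<p>end of page ten</p>\r\n"
        "Title<br>\r\n"
        "10<br>\r\n"
        "<hr>\r\n"
        "<A name=11></a>Footer<br>\r\n"
        "<p>start of page eleven</p>\r\n"
    )
    # Prints the two <p> lines with the block between them removed.
    print(strip_page_blocks(sample))
```

To clean a real file, read it with newline="" so Python leaves the original line endings untouched, pass the text through strip_page_blocks, and write the result to a new file so the original is kept as a backup.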
