Is there a fast and efficient text search in a Unicode text/string? I need

Question


Asked: May 24, 2026


Is there a fast and efficient text search in a Unicode text/string? I need to search a part of a word too, not only a whole word.

Would SearchBuf be suitable for this?

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-05-24T02:18:30+00:00

As has already been pointed out, the fastest way of doing this depends on several things, most importantly whether you need to search the same text repeatedly. The second question is how much you really need the “fastest” approach rather than a reasonably fast one, and how much time you are willing to invest in optimisation.

Repeated searches

If you need to search repeatedly, the most efficient approach to string searching I know of is a suffix array (often combined with the Burrows-Wheeler transform). This approach is used extensively in bioinformatics, where one often has to run a huge number of string searches over really large data sets (e.g. here). A very good suffix array (and BWT) library is the libdivsufsort C library, but unfortunately I know of no Delphi port of it. (I believe the library is capable of handling Unicode strings.)
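To make the suffix array idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not Delphi and not libdivsufsort). The names `build_suffix_array` and `find_all` are made up for this illustration, and the construction is the naive sort-all-suffixes version, which is far slower than what a real library does. What it does show is why repeated searches get cheap: once the array is built, every substring lookup, including a part of a word, is just two binary searches.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction for illustration only: it sorts full suffix copies,
    so it is only suitable for small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return every 0-based position where pattern occurs in text."""
    m = len(pattern)
    if m == 0:
        return []

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[first:lo] start with pattern.
    return sorted(sa[first:lo])
```

Typical use is to build the array once with `sa = build_suffix_array(text)` and then call `find_all(text, sa, pattern)` for each query. Python compares strings by code point, so this works on Unicode text as is; a Delphi version would compare UTF-16 code units instead.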

Single searches

If you don’t need to search repeatedly, a brute-force string search algorithm can be efficient, for instance the assembly-optimised FastCode versions of Pos and friends. These were, however, written before Delphi moved to Unicode strings, and I know of no similarly optimised Unicode implementations. If I were to write one today and wanted to optimise it for a modern processor (one that supports the SSE4.2 instruction set), I would have a serious look at the PCMPESTRI assembly instruction (reference pdf here; see also e.g. here, but I have no idea whether that code is working), which can handle the 2-byte characters you’d need for Unicode string searching.
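For reference, the brute-force approach over 2-byte characters looks roughly like the following. This is a plain Python sketch of the logic only, with no SSE4.2 or assembly involved; `utf16_units` and `brute_force_find` are hypothetical names for this illustration. It assumes the text is handled as UTF-16 code units, as in a Delphi UnicodeString, and it returns a 0-based index, whereas Pos is 1-based.

```python
import struct


def utf16_units(s: str) -> list[int]:
    """Encode s as UTF-16 (little endian) and return its 16-bit code units."""
    data = s.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))


def brute_force_find(haystack: str, needle: str) -> int:
    """Return the 0-based index, in UTF-16 code units, of the first
    occurrence of needle in haystack, or -1 if there is none."""
    h = utf16_units(haystack)
    n = utf16_units(needle)
    if not n:
        return 0  # convention chosen here: the empty needle matches at 0

    # Try every possible start position and compare unit by unit.
    for start in range(len(h) - len(n) + 1):
        j = 0
        while j < len(n) and h[start + j] == n[j]:
            j += 1
        if j == len(n):
            return start
    return -1
```

The inner comparison loop is the part an optimised routine would speed up, for example by comparing several 16-bit units per instruction. Because the index counts code units, a character outside the Basic Multilingual Plane, which is stored as a surrogate pair, occupies two positions.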
