Question

Asked: May 11, 2026

It seems that flex doesn’t support UTF-8 input. Whenever the scanner encounters a non-ASCII char, it stops scanning as if it had reached EOF.

Is there a way to force flex to eat my UTF-8 chars? I don’t want it to actually match UTF-8 chars, just eat them when using the ‘.’ pattern.

Any suggestion?

EDIT

The simplest solution would be:

ANY [\x00-\xff]

and use ‘{ANY}’ instead of ‘.’ in my rules.

1 Answer

Editorial Team · Answer 1 · 2026-05-11T18:27:38+00:00

I have been looking into this myself and reading the Flex mailing list to see if anyone had thought about it. Getting Flex to read Unicode is a complex affair …

UTF-8 input can be handled, but most other encodings (the 16-bit ones) would lead to massive tables driving the automaton.

A common method so far is:

What I did was simply write patterns that match single UTF-8 characters. They look something like the following, but you might want to re-read the UTF-8 specification because I wrote this so long ago. You will of course need to combine these since you want Unicode strings, not just single characters.

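UB      [\200-\277]
%%
[\300-\337]{UB}       { /* do something: 2-byte sequence */ }
[\340-\357]{UB}{2}    { /* do something: 3-byte sequence */ }
[\360-\367]{UB}{3}    { /* do something: 4-byte sequence */ }
[\370-\373]{UB}{4}    { /* do something: 5-byte form, obsolete since RFC 3629 */ }
[\374-\375]{UB}{5}    { /* do something: 6-byte form, obsolete since RFC 3629 */ }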

Taken from the mailing list.
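
For anyone who wants to check those lead-byte ranges outside of flex, here is a minimal C sketch that mirrors the octal ranges from the patterns above. The function name `utf8_seq_len` is made up for illustration and is not part of flex. One assumption to flag: it treats the 5- and 6-byte forms as invalid, because RFC 3629 limits UTF-8 to four bytes. It does not reject overlong encodings or code points above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper (not a flex API): expected length in bytes of a
 * UTF-8 sequence, judged from its lead byte alone. The octal ranges
 * mirror the flex patterns quoted above. Returns 0 for a byte that
 * cannot start a sequence. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead <= 0177)                 return 1; /* plain ASCII */
    if (lead >= 0300 && lead <= 0337) return 2; /* followed by 1 x [\200-\277] */
    if (lead >= 0340 && lead <= 0357) return 3; /* followed by 2 x [\200-\277] */
    if (lead >= 0360 && lead <= 0367) return 4; /* followed by 3 x [\200-\277] */
    /* Continuation bytes (\200-\277) and the old 5/6-byte lead bytes
     * (\370-\375) are treated as invalid here (assumption: RFC 3629). */
    return 0;
}
```

For example, `utf8_seq_len(0xC3)` gives 2 (the lead byte of "é"), while `utf8_seq_len(0x80)` gives 0 because a continuation byte cannot start a character.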

I may look at creating a proper patch for UTF-8 support after looking at it further. The above solution seems unmaintainable for large .l files, and it is really ugly! You could use similar ranges to create a ‘.’ substitute rule that matches all ASCII and UTF-8 characters, but that is still rather ugly.
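
To make the ‘.’ substitute idea concrete, here is a rough C sketch of what such a rule has to do: consume exactly one character, whether it is a single ASCII byte or a multibyte UTF-8 sequence. The name `eat_one_char` is hypothetical, and the fallback of consuming a single byte on malformed input is my own assumption (it keeps the scan moving instead of stopping). It only checks the lead byte and the continuation bytes, so overlong forms are not rejected.

```c
#include <stddef.h>

/* Hypothetical helper (not a flex API): number of bytes that make up
 * the "character" at the start of s[0..n). Malformed or truncated
 * sequences are consumed one byte at a time so that scanning always
 * advances. Returns 0 only when the buffer is empty. */
static size_t eat_one_char(const unsigned char *s, size_t n)
{
    size_t len, i;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                          /* ASCII */

    if (s[0] >= 0xC0 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF7)
        len = 4;
    else
        return 1;                          /* stray continuation or invalid lead byte */

    if (len > n)
        return 1;                          /* truncated sequence */

    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)         /* continuation bytes look like 10xxxxxx */
            return 1;

    return len;
}
```

Calling it in a loop and advancing by the returned count walks a buffer one character at a time, which is the behaviour you would want from a UTF-8-aware ‘.’.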

Hope this helps!
