I have a weird issue when connecting to Oracle with Groovy.
I have created the following test table:
CREATE TABLE t (text VARCHAR2 (256));
INSERT INTO t VALUES ('[Hallo][Hällo][Hello][Hi]');
I want to find all the substrings that are enclosed in square brackets. The following Groovy code fails to find the second one ([Hällo]):
import groovy.sql.Sql
sql = Sql.newInstance('jdbc:oracle:thin:@server:1521:ORCL', 'user',
        'password', 'oracle.jdbc.OracleDriver');
sql.eachRow("select text from t") { row ->
    row.text.eachMatch(/\[[A-Za-zä\-]+\]/) { match ->
        println match
    }
}
Using the string directly works as expected:
'[Hallo][Hällo][Hello][Hi]'.eachMatch(/\[[A-Za-zä\-]+\]/) { match ->
    println match
}
Doing the same thing from good ol’ Java also works fine, so I’m guessing that the problem lies somewhere inside the Groovy Sql object.
One final thing I noticed is that the two strings (the one read from the result set vs. the one embedded in the source code) do not have the same encoding. When I print Hällo inside eachRow I get H?llo in the Windows console, but when I print it directly I get H├νllo instead.
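Since the console itself may be garbling the output, one way to compare the two strings is to print their code points instead of the characters. This is only a rough sketch: dump is a hypothetical helper name, and the \u00e4 escape is used so the snippet does not depend on the encoding of the script file.

```groovy
// Hypothetical helper: render every code unit of a string as U+XXXX,
// so the result does not depend on how the console displays characters.
def dump = { String s ->
    (0..<s.length()).collect { String.format('U+%04X', s.codePointAt(it)) }.join(' ')
}

// A correctly decoded "Hallo with a-umlaut" has U+00E4 in second position.
println dump('H\u00e4llo')
```

Calling println dump(row.text) inside eachRow and println dump('Hällo') with the literal typed directly in the script should show whether the two values really contain different characters.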
After further experimentation, the problem seems to be that Groovy reads my script with the default platform encoding. If I pass -c UTF8 to the Groovy interpreter, then I get the expected results.
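Here is a minimal sketch of what I believe is happening, assuming the script file is saved as UTF-8 and gets decoded with a single-byte charset. ISO-8859-1 is only a stand-in for the real platform default (which I have not confirmed), and misread is a hypothetical helper. The ä in the regex turns into two other characters, so the character class no longer contains the real ä that comes back from the database. The string literal is garbled in the same way, which would explain why the direct test still matches.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: reinterpret a string's UTF-8 bytes in a single-byte
// charset, mimicking a UTF-8 script parsed with the wrong encoding.
// ISO-8859-1 is an assumption; the actual platform default may differ.
def misread = { String s -> new String(s.getBytes('UTF-8'), 'ISO-8859-1') }

def regexSource  = '\\[[A-Za-z\u00e4\\-]+\\]'        // the pattern as typed in the script
def literal      = '[Hallo][H\u00e4llo][Hello][Hi]'  // the string as typed in the script
def fromDatabase = literal                           // assumed: the driver returns a real U+00E4

// What the interpreter ends up compiling: the class holds U+00C3 and U+00A4, not U+00E4.
def brokenRegex = Pattern.compile(misread(regexSource))

// 3 matches: the bracketed word containing U+00E4 is skipped.
println fromDatabase.findAll(brokenRegex).size()

// 4 matches: the literal is garbled the same way, so it still matches.
println misread(literal).findAll(brokenRegex).size()

// 4 matches: with the script decoded correctly, the database value matches too.
println fromDatabase.findAll(Pattern.compile(regexSource)).size()
```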