I am trying to send an encoded string to Solr and then decode it on retrieval. My encoding code looks like this:
    public static String compress(String inputString) {
        try {
            if (inputString == null || inputString.length() == 0) {
                return null;
            }
            return new String(compress(inputString.getBytes("UTF-8")));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return null;
    }

    private static byte[] compress(byte[] input) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(out);
            gzip.write(input);
            gzip.close();
            return out.toByteArray();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return null;
    }
Then I send it to Solr and try to read it back (ignoring decoding for now, because it already fails at this point):
SolrDocument resultDoc = iter.next();
String content = (String) resultDoc.getFieldValue("source");
System.out.println(content);
If I send a string such as “Hello my name is Chris”, the encoded string looks like this (ignoring what Stack Overflow changed):
ã�������ÛHÕ……W»≠T»KÃMU»,VpŒ( ,�ìùùG���
Yet what I get back from SOLR is
#31;ã#8;#0;#0;#0;#0;#0;#0;#0;ÛHÕ……W»≠T»KÃMU»,VpŒ( ,#6;#0;ìùùG#22;#0;#0;#0;
which obviously makes decoding fail. I have tried both the bundled Jetty install and Tomcat, with the same result.
See this entry from the example schema.xml file that comes with the Solr distribution.
Make sure that the field you use to store the encoded value in the index uses the
binary fieldType, and that you send the value as a base64-encoded string.
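The likely cause is that raw gzip output is not valid text, so wrapping it in `new String(...)` and sending it as a text field lets the control bytes get escaped or altered (the `#31;`, `#8;`, `#0;` sequences above). A minimal sketch of the base64 approach follows, assuming Java 8 or later for `java.util.Base64`. The class and method names (`GzipBase64`, `compressToBase64`, `decompressFromBase64`) are hypothetical and only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class: gzip a string and carry the bytes as base64 text.
public class GzipBase64 {

    // Gzip the UTF-8 bytes of the input, then base64-encode the compressed bytes
    // so the result contains only printable ASCII characters.
    public static String compressToBase64(String input) throws IOException {
        if (input == null || input.isEmpty()) {
            return null;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(input.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // Reverse the steps: base64-decode, gunzip, and interpret the bytes as UTF-8.
    public static String decompressFromBase64(String encoded) throws IOException {
        if (encoded == null || encoded.isEmpty()) {
            return null;
        }
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this, the value of `compressToBase64("Hello my name is Chris")` would be what gets sent for the `source` field, and `decompressFromBase64` would be applied to the base64 string read back. How the client library hands back a binary field (as a string or as raw bytes) may vary, so check the returned type before casting.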