In my program, I have a string (obtained from an external library) which doesn’t match any regular expression.
String content = // extract text from PDF
assertTrue(content.matches(".*")); // fails
assertTrue(content.contains("S P E C I A L")); // passes
assertTrue(content.matches("S P E C I A L")); // fails
Any idea what might be wrong? When I print content to stdout, it looks ok.
Here is the code for extracting text from the PDF (I am using iText 5.0.1):
PdfReader reader = new PdfReader(source);
PdfTextExtractor extractor = new PdfTextExtractor(reader,
        new SimpleTextExtractingPdfContentRenderListener());
return extractor.getTextFromPage(1);
By default, the dot (.) does not match line breaks. So my guess is that your content contains a line break.
Also note that matches will match the entire string, not just a part of it: it does not do what contains does!
Some examples:
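A minimal sketch of these cases, assuming a hypothetical two-line string in place of the real PDF text (the class name DotMatchesDemo and the sample content are illustrative, not taken from the question):

```java
public class DotMatchesDemo {
    // Hypothetical stand-in for the extracted PDF text: two lines separated by a line break.
    static final String CONTENT = "S P E C I A L\nsecond line";

    public static void main(String[] args) {
        // false: the dot stops at the line break, so .* cannot cover the whole string
        System.out.println(CONTENT.matches(".*"));
        // true: contains only looks for a substring
        System.out.println(CONTENT.contains("S P E C I A L"));
        // false: matches requires the pattern to cover the entire string
        System.out.println(CONTENT.matches("S P E C I A L"));
        // true: this pattern spells out the whole string, line break included
        System.out.println(CONTENT.matches("S P E C I A L\nsecond line"));
        // true: (?s) lets the dot match line breaks too
        System.out.println(CONTENT.matches("(?s).*"));
    }
}
```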
The (?s) in the last example will cause the dot (.) to match line breaks as well. So (?s).* will match any string.
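If you would rather not embed the flag in the pattern, the same behaviour should be available through Pattern.DOTALL, and Matcher.find() gives a regex counterpart to contains. The following is only a sketch: the class name, the helper names matchesDotall and containsMatch, and the sample string are assumptions made for illustration.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Hypothetical helper: true if the whole input matches regex with DOTALL enabled,
    // i.e. the dot also matches line breaks (same effect as a leading (?s)).
    static boolean matchesDotall(String input, String regex) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    // Hypothetical helper: true if regex matches anywhere inside input,
    // the regex analogue of String.contains.
    static boolean containsMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the extracted PDF text.
        String content = "S P E C I A L\nsecond line";

        System.out.println(matchesDotall(content, ".*"));                // true
        System.out.println(containsMatch(content, "S P E C I A L"));     // true
        System.out.println(matchesDotall(content, ".*S P E C I A L.*")); // true
    }
}
```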