Unmatched [ in regex; marked by <-- HERE in m/ <-- HERE / at ./pdf_parse.pl line 37.
I'm parsing a .pdf file word by word (in order to build a dictionary from it).
Line 37:
if(grep(!/$word/,@line_rd)){
}
The word at which the parser script stops working is in a different font (inside the PDF I'm parsing). Is that the culprit here?
Does CAM::PDF recognize words in different fonts? What should I do to prevent this error?
You need to quote $word in the regular expression if it can contain special characters (like [ or even .). Try wrapping it in \Q...\E, i.e. /\Q$word\E/.
If you want to make a dictionary of all the words, use a hash.
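Here is a minimal sketch of both suggestions. The sample contents of @line_rd and the helper names contains_word and count_words are made up for illustration and do not come from the original script; only \Q...\E and the %allwords hash are the actual points being shown.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the words read from the PDF.
my @line_rd = ('foo', 'a[1]', 'bar.baz', 'foo');

# Return the number of elements of @$list that contain $word literally.
# \Q...\E escapes regex metacharacters, so "[" or "." in $word cannot
# break the pattern or match more than intended.
sub contains_word {
    my ($word, $list) = @_;
    my $matches = grep { $_ =~ /\Q$word\E/ } @$list;
    return $matches;
}

# Build a hash with each distinct word as key and its count as value.
sub count_words {
    my @words = @_;
    my %allwords;
    $allwords{$_}++ for @words;
    return %allwords;
}

my %allwords = count_words(@line_rd);
print "found a word containing 'a['\n" if contains_word('a[', \@line_rd);
```

Without \Q...\E, the word a[ would be compiled as a pattern with an unterminated character class, which is what triggers the "Unmatched [" error.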
At the end, the %allwords hash will contain the distinct words as keys and the word counts as values. You could then print each word with its count, e.g. by iterating over the sorted keys.
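One possible way to print it is sketched below. The sample data and the helper name format_counts are assumptions for illustration; in the real script %allwords would already be filled during parsing.

```perl
use strict;
use warnings;

# Hypothetical contents; the real hash is populated while parsing the PDF.
my %allwords = (apple => 3, pear => 1, fig => 2);

# Return one "word: count" line per entry, sorted alphabetically by word.
sub format_counts {
    my (%counts) = @_;
    return map { sprintf("%s: %d\n", $_, $counts{$_}) } sort keys %counts;
}

print format_counts(%allwords);
```

Sorting the keys is optional, but it makes the output stable, since Perl does not guarantee any particular hash order.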