There are two styles of comments, C-style and C++-style. How can I recognize them?
/* comments */
// comments
I am free to use any methods and third-party libraries.
To reliably find all comments in a Java source file, I wouldn’t use regex, but a real lexer (aka tokenizer).
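To illustrate why a bare regex is fragile, here is a small sketch. The class name `NaiveCommentRegex` and the sample input are made up for illustration. The pattern knows nothing about string literals, so it treats the `//` inside a URL as the start of a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
public class NaiveCommentRegex {
    // Naive pattern: a block comment or a line comment,
    // with no awareness of string or char literals.
    private static final Pattern NAIVE =
            Pattern.compile("/\\*.*?\\*/|//[^\\r\\n]*", Pattern.DOTALL);

    public static List<String> find(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = NAIVE.matcher(source);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // The "//" inside the string literal is wrongly taken as a comment start.
        String source = "String url = \"http://example.com\"; // real comment";
        for (String hit : find(source)) {
            System.out.println(hit);
        }
    }
}
```

Running `main` prints a single bogus match, `//example.com"; // real comment`, which begins inside the string literal. A lexer avoids this because it consumes the whole string literal as one token first.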
Two popular choices for Java are:
Contrary to popular belief, ANTLR can also be used to create only a lexer without the parser.
Here’s a quick ANTLR demo. You need the following files in the same directory:
JavaCommentLexer.g
Main.java
Test.java
Now, to run the demo, do:
and you’ll see the following being printed to the console:
EDIT
You can create a sort of lexer with regex yourself, of course. The following demo does not handle Unicode literals inside source files, however:
Test2.java
Main2.java
If you run Main2, the following is printed to the console:
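Separately from the Main2 demo, here is a minimal sketch of what such a regex-based "sort of lexer" could look like. It is not the original listing: the class name `CommentFinder` and the method `findComments` are assumptions made for illustration. The idea is to match string and char literals as alternatives alongside comments, so that comment markers inside literals are consumed harmlessly. Like the demo described above, it does not handle Unicode escapes, and it also ignores Java text blocks (`"""`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the original Main2 code.
public class CommentFinder {
    // Alternatives are tried left to right at each position, so a string or
    // char literal is consumed whole before "//" or "/*" inside it can match.
    private static final Pattern TOKEN = Pattern.compile(
            "\"(?:\\\\.|[^\"\\\\\\r\\n])*\""       // string literal
          + "|'(?:\\\\.|[^'\\\\\\r\\n])*'"         // char literal
          + "|(?<block>/\\*[\\s\\S]*?\\*/)"         // C-style comment
          + "|(?<line>//[^\\r\\n]*)");              // C++-style comment

    public static List<String> findComments(String source) {
        List<String> comments = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            // Keep only matches produced by one of the comment alternatives.
            if (m.group("block") != null || m.group("line") != null) {
                comments.add(m.group());
            }
        }
        return comments;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "String s = \"// not a comment\"; // trailing comment",
                "char c = '\"'; /* block",
                "comment */ int x = 1;");
        for (String comment : findComments(source)) {
            System.out.println(comment);
        }
    }
}
```

For the sample input in `main`, this reports `// trailing comment` and the two-line block comment, and skips the `//` inside the string literal. The char-literal alternative matters too: without it, the `"` in `'"'` would be mistaken for the start of a string.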