I need to read a text file in Java and blank out all the email addresses and URLs in it, to reduce noise in the data.
Are there any library functions in Java that do this?
You can read the file with a BufferedReader (for example, over a FileInputStream). For each line, use a regex to find any matches for email or URL patterns, then write the line with those matches blanked out to a new output string or stream.
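A minimal sketch of that approach, using only the standard library (`java.util.regex` and `java.nio.file`). The class name `Redactor`, the method names, and both patterns are assumptions made for illustration. The patterns are deliberately simple: the URL one only catches `http://` and `https://` links and will swallow trailing punctuation, and the email one is a common approximation rather than a full validator.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and the patterns are illustrative assumptions.
class Redactor {
    // Matches http:// or https:// followed by any run of non-whitespace characters.
    private static final Pattern URL = Pattern.compile("https?://\\S+");

    // Simplified email pattern: local part, '@', domain, dot, TLD of 2+ letters.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Blanks out URLs first, then email addresses, in a single line.
    static String redactLine(String line) {
        String withoutUrls = URL.matcher(line).replaceAll("");
        return EMAIL.matcher(withoutUrls).replaceAll("");
    }

    // Reads the input file line by line and writes the redacted lines to the output file.
    static void redactFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(redactLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Redactor <input> <output>");
            return;
        }
        redactFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

URLs are removed before emails so that a link containing an `@` is dropped as a whole rather than in pieces. Swap in stricter or more liberal patterns as your data requires.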
Show us what you’ve tried and your current code.
As an addendum, I’ve used the regexes from these pages, with varying degrees of success:
http://www.regular-expressions.info/email.html
http://daringfireball.net/2009/11/liberal_regex_for_matching_urls