I’m writing a web crawler in python, and it involves taking headlines from websites.
One of the headlines should’ve read : And the Hip’s coming, too
But instead it said: And the Hip’s coming, too
What’s going wrong here?
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
You need to properly decode the source text. Most likely the source text is in UTF-8 format, not ASCII.
Because you do not provide any context or code for your question it is not possible to give a direct answer.
I suggest you study how unicode and character encoding is done in Python:
http://docs.python.org/2/howto/unicode.html