The Archive Base


Editorial Team
Asked: June 7, 2026

I have a script that pulls information from an event webpage. The URL is: http://everguide.com.au/melbourne/event/2012-jul-14/colour/

A PHP script calls a Python script (the call is part of a for loop):

${"tmp" . $i} = utf8_encode (exec("python myscrape.py ${"eu" . $i}"));

It passes a URL as the argument. The Python script is:

# -*- coding: utf-8 -*-
import sys
URL = sys.argv[1]
#$URL = 'http://everguide.com.au/melbourne/event/2012-jul-14/colour/'

import urllib2
req = urllib2.Request(URL)
response = urllib2.urlopen(req)
html = response.read()

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html.decode('utf-8'))
soup.prettify()

import re


for node in soup.findAll(itemprop="name"):
    n = ''.join(node.findAll(text=True)) 
for node in soup.findAll(itemprop="url"):
    v = ''.join(node.findAll(text=True))

for node in soup.findAll("div", { "class" : "time" }):
    d = ''.join(node.findAll(text=True))

for node in soup.findAll("a", { "id" : "ctl00_holderBody_ctl00_lnkCat" }):
    c = ''.join(node.findAll(text=True)) 

vu = v
vu.encode('utf-8', 'xmlcharrefreplace')
re.escape(vu)

print n,"|", d,"|", vu,"|", c

This works well, but the output that comes back to PHP stops at the pipe just before vu; nothing after that point is returned.

UTF-8 encoding is set on all files, HTML and PHP.

When there is a special character in the v variable, the output breaks off and stops. If there are no special characters, it works perfectly.

Expected output is:

Colour | 14 July @ 7:30PM | 1000 £ Bend | Clubs & Parties

I get this output when I run the script directly on the server (with the same python command), but not when it is called through PHP. I can't get the venue string back!

Please help

Rick


1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 1:51 pm

    vu.encode returns the encoded string; it does not change vu in place. Since you’re not assigning the result, it is simply thrown away. Have you tried:

    vu = vu.encode('utf-8', 'xmlcharrefreplace')

    You’ll also need to drop the re.escape call, as it would mangle the encoded Unicode.
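A likely explanation for the cut-off, offered as an assumption rather than something confirmed in the question: when Python 2 writes to a pipe instead of a terminal, stdout has no encoding and falls back to ASCII, so print raises UnicodeEncodeError at the first non-ASCII field. That would be consistent with the output stopping right before vu. The sketch below shows the "keep the encoded result" idea end to end. It is written for Python 3 (the original script is Python 2), and the names format_event and write_event are hypothetical helpers, not part of the original script.

```python
# -*- coding: utf-8 -*-
import sys


def format_event(name, date, venue, category):
    """Join the four scraped fields with pipes and return UTF-8 bytes."""
    line = u" | ".join([name, date, venue, category])
    # encode() returns a new bytes object and leaves `line` unchanged,
    # so the result has to be kept (here, by returning it).
    return line.encode("utf-8")


def write_event(data, stream=None):
    """Write already-encoded bytes to a binary stream (stdout by default)."""
    if stream is None:
        # Python 3: the byte-oriented stream behind sys.stdout.
        stream = sys.stdout.buffer
    stream.write(data + b"\n")
    stream.flush()
```

Calling write_event(format_event(n, d, v, c)) at the end of the script would emit the whole line as UTF-8 bytes regardless of what encoding the pipe reports. On the PHP side, note that utf8_encode treats its input as ISO-8859-1, so wrapping output that is already UTF-8 in it may double-encode characters such as £; that part is worth checking separately.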


© 2021 The Archive Base. All Rights Reserved