I am trying to pull just a value from some XML in Python using Beautiful Soup (but I’ll gleefully dump it for anything else if recommended). Consider the following bit of code:
global humidity, temperature, weatherdescription, winddescription
query = urllib2.urlopen('http://www.google.com/ig/api?weather="Aberdeen+Scotland"')
weatherxml = query.read()
weathersoup = BeautifulSoup(weatherxml)
query.close()
print weatherxml
This prints out the weather forecast for Aberdeen, Scotland as XML, which currently looks like this (much XML removed to prevent giant wall of text syndrome):
<?xml version="1.0"?><xml_api_reply version="1"><weather module_id="0"
tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0"
><forecast_information><city data="Aberdeen, Aberdeen City"/><postal_code data="&quot;Aberdeen Scotland&quot;"/><latitude_e6
data=""/><longitude_e6 data=""/><forecast_date
data="2012-07-31"/><current_date_time data="1970-01-01 00:00:00
+0000"/><unit_system data="US"/></forecast_information><current_conditions><condition
data="Clear"/><temp_f data="55"/><temp_c data="13"/><humidity
data="Humidity: 82%"/><icon
data="/ig/images/weather/sunny.gif"/><wind_condition data="Wind: SE at
8 mph"/></current_conditions>
Now I’d like to populate variables with the weather values in this XML, for example make temperature = 13. Parsing it is proving a nightmare.
If I use any of the find functions on weathersoup, I get the entire tag (e.g. for temp_c it returns <temp_c data="13">); various other functions return nothing, or the entire sheet, or parts of it.
How do I simply return the VALUE for any given XML tag, without a mess of “strip”s, or resorting to regex, or basically hacking it?
To access the attribute data in the element temp_c:
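With Beautiful Soup, a tag object supports dictionary-style lookup of its attributes, so weathersoup.find('temp_c')['data'] gives the string 13 rather than the whole tag. Since the question is open to alternatives, here is a minimal sketch of the same idea using the standard library's xml.etree.ElementTree instead of Beautiful Soup. It parses a trimmed copy of the XML from the question rather than making a live request, and the helper name tag_value is hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the reply shown in the question, used in place of a live request.
weatherxml = (
    '<?xml version="1.0"?><xml_api_reply version="1"><weather>'
    '<current_conditions><condition data="Clear"/><temp_f data="55"/>'
    '<temp_c data="13"/><humidity data="Humidity: 82%"/>'
    '<wind_condition data="Wind: SE at 8 mph"/></current_conditions>'
    '</weather></xml_api_reply>'
)

root = ET.fromstring(weatherxml)


def tag_value(tag, attribute="data"):
    """Return the given attribute of the first <tag> element, or None if absent.

    Hypothetical helper, not part of any library.
    """
    element = root.find(".//" + tag)  # search the whole tree for the tag name
    return None if element is None else element.get(attribute)


# The values live in the "data" attribute, not in the element text.
temperature = int(tag_value("temp_c"))
humidity = tag_value("humidity")
weatherdescription = tag_value("condition")
winddescription = tag_value("wind_condition")
```

The values come back as strings, so temperature is converted with int(); humidity and wind still carry their "Humidity: " and "Wind: " prefixes because those are part of the attribute value in the XML itself.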