I'm trying to use YouTube's Data API to run a search where the search term
includes Chinese characters, but the query is not returning correct results.
I'm using Python, and I wrote some test code that uses unicode.
In the test code, I hard-code a unicode term, encode it as UTF-8, URL-encode
the result, and pass that to the YouTube API as the search term.
The code looks like:
import urllib
import gdata.youtube.service

yt_service = gdata.youtube.service.YouTubeService()
query = gdata.youtube.service.YouTubeVideoQuery()
# u_topic = u"a-mei"  # the ASCII term returns correct results
u_topic = u"阿妹"  # a-mei
s_topic = u_topic.encode('utf-8')
query.vq = urllib.quote_plus(s_topic)
query.time = 'this_month'
query.orderby = 'relevance'
query.racy = 'include'
feed = yt_service.YouTubeQuery(query)
The code works when I search for u"a-mei",
but I don't get correct results when I search for u"阿妹".
I also tried the following url:
https://gdata.youtube.com/feeds/api/videos?q=%E9%98%BF%E5%A6%B9
where the q string is the URL encoding of the UTF-8 bytes for u"阿妹".
This URL returns correct results.
Thus, it seems like the YouTube API allows UTF-8 for search terms, but
for some reason my API call is not returning the correct results.
I believe the gdata API takes care of formatting query parameters for you, so you shouldn't need to
call urllib.quote_plus on your query manually. Doing so results in a double-escaped string, leaving you literally searching for a video whose name is a load of percent signs.
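
To make the double-escaping concrete, here is a minimal sketch that only uses quote_plus and does not touch gdata at all. It assumes that the library escapes the value of query.vq once more when it builds the request URL, which is what the behaviour described in the question suggests; the second quote_plus call stands in for that step. The variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote_plus        # Python 2, as in the question
except ImportError:
    from urllib.parse import quote_plus  # Python 3

s_topic = u"阿妹".encode("utf-8")

# Escaping once gives the q value from the URL that works.
escaped_once = quote_plus(s_topic)
print(escaped_once)   # %E9%98%BF%E5%A6%B9

# Escaping the result again (assumed to be what happens when the
# library formats the query) turns every "%" into "%25".
escaped_twice = quote_plus(escaped_once)
print(escaped_twice)  # %25E9%2598%25BF%25E5%25A6%25B9

# A plain ASCII term has nothing to escape, so it survives both passes,
# which would explain why u"a-mei" still returned correct results.
ascii_twice = quote_plus(quote_plus("a-mei"))
print(ascii_twice)    # a-mei
```

If that is what is happening, assigning s_topic to query.vq directly, without the quote_plus call, should be enough.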