I am trying to do part-of-speech (POS) tagging to pull the nouns out of a sentence in Python on Google App Engine (GAE). So far I have tried the nltk library, but I cannot get it working on GAE: the error message complains about a missing numpy module.
This person has had the same problem:
https://groups.google.com/forum/?fromgroups#!topic/nltk-users/2nWZtLgFyvI
I cannot find clear instructions on how to get nltk running on GAE, or an alternative POS tagger that runs on GAE.
EDIT:
My steps trying to get nltk working (I’m on OS X 10.7):
- install nltk via the terminal: “easy_install nltk”
- copy nltk to the root of the App Engine project, from /Library/Python/2.7/site-packages/nltk-2.0.1-py2.7.egg/nltk/
- add the following settings to app.yaml:
    runtime: python27
    threadsafe: false
    libraries:
    - name: numpy
      version: "latest"
- write test.py with “import nltk” in it
- deploy, run and get the following error (the numpy error is solved, but I get a new one):
Traceback (most recent call last):
  File "/base/data/home/apps/s~domain/1.359540170137090086/dynamic/test.py", line 4, in <module>
    import nltk
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/__init__.py", line 116, in <module>
    import ccg
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/ccg/__init__.py", line 14, in <module>
    from nltk.ccg.combinator import (UndirectedBinaryCombinator, DirectedBinaryCombinator,
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/ccg/combinator.py", line 8, in <module>
    from nltk.parse import ParserI
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/parse/__init__.py", line 68, in <module>
    from nltk.parse.util import load_parser, TestGrammar, extract_test_sentences
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/parse/util.py", line 15, in <module>
    from nltk.data import load
  File "/base/data/home/apps/s~domain/1.359540170137090086/nltk/data.py", line 75, in <module>
    if os.path.expanduser('~/') != '~/': path += [
  File "/base/python27_runtime/python27_dist/lib/python2.7/posixpath.py", line 259, in expanduser
    import pwd
ImportError: No module named pwd
The following is from nltk/data.py (around line 75):
######################################################################
# Search Path
######################################################################
path = []
"""A list of directories where the NLTK data package might reside.
These directories will be checked in order when looking for a
resource in the data package. Note that this allows users to
substitute in their own versions of resources, if they have them
(e.g., in their home directory under ~/nltk_data)."""
# User-specified locations:
path += [d for d in os.environ.get('NLTK_DATA', '').split(os.pathsep) if d]
if os.path.expanduser('~/') != '~/': path += [
    os.path.expanduser('~/nltk_data')]
# Common locations on Windows:
if sys.platform.startswith('win'): path += [
    r'C:\nltk_data', r'D:\nltk_data', r'E:\nltk_data',
    os.path.join(sys.prefix, 'nltk_data'),
    os.path.join(sys.prefix, 'lib', 'nltk_data'),
    os.path.join(os.environ.get('APPDATA', 'C:\\'), 'nltk_data')]
# Common locations on UNIX & OS X:
else: path += [
    '/usr/share/nltk_data',
    '/usr/local/share/nltk_data',
    '/usr/lib/nltk_data',
    '/usr/local/lib/nltk_data']
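The traceback points at the os.path.expanduser call in this excerpt: on Python 2.7, when HOME is not set, posixpath.expanduser falls back to importing pwd, which is not available in the GAE sandbox. One possible workaround, shown here only as a minimal sketch and not as nltk's own code, is to guard that lookup in the copied nltk/data.py. The names safe_expanduser and build_search_path are hypothetical helpers introduced for illustration.

```python
import os


def safe_expanduser(path):
    """Expand '~' like os.path.expanduser, but return the path unchanged
    if the lookup fails (for example, ImportError because the pwd module
    is missing, or KeyError because the user id has no passwd entry)."""
    try:
        return os.path.expanduser(path)
    except (ImportError, KeyError):
        return path


def build_search_path(environ=None):
    """Build the user-specified part of the search path (hypothetical helper).

    Mirrors the 'User-specified locations' lines of the excerpt: entries
    from NLTK_DATA first, then ~/nltk_data if the home directory resolves.
    """
    environ = os.environ if environ is None else environ
    path = [d for d in environ.get('NLTK_DATA', '').split(os.pathsep) if d]
    home_data = safe_expanduser('~/nltk_data')
    if not home_data.startswith('~'):
        # Only add the entry when '~' was actually expanded.
        path.append(home_data)
    return path
```

With this guard, a missing pwd module only means that the ~/nltk_data entry is skipped; directories listed in NLTK_DATA are still searched. Whether the rest of nltk then imports cleanly on GAE is not something this sketch establishes.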
GAE's python27 runtime supports numpy 1.6.1. Are you specifying it in your app.yaml? The link you gave pre-dates Python 2.7 support, so I’m guessing not.
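To check from inside the app whether a declared library is actually importable, something like the following could be called from test.py. This is a rough sketch; module_status is a hypothetical helper name, not part of GAE or nltk.

```python
import importlib


def module_status(name):
    """Return a short string saying whether the module `name` can be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return "missing (%s)" % exc
    # Not every module exposes __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown")
    return "available, version %s" % version
```

Logging module_status("numpy") and module_status("pwd") on the deployed app would show which of the two imports the runtime can satisfy.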