I am attempting to build a TokenStream from a Python Sequence.
Just for fun I want to be able to pass my own Tokens directly to
pylucene.Field("MyField", MyTokenStream)
I tried to make “MyTokenStream” by…
terms = ['pant', 'on', 'ground', 'look', 'like', 'fool']
stream = pylucene.PythonTokenStream()
for t in terms:
    stream.addAttribute(pylucene.TermAttribute(t))
Unfortunately, no wrapper exists for “TermAttribute”, or for any of the other Attribute classes in Lucene, so I get a NotImplemented error when calling them.
The following doesn’t raise an exception, but I’m not sure it’s even setting the terms:
PythonTokenStream(terms)
The Python* classes are designed to customize behavior by subclassing. In the case of TokenStream, the incrementToken method needs to be overridden.
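As a rough sketch of that subclassing pattern: the stream registers a term attribute once, and each call to incrementToken writes the next term into it, returning False when the sequence runs out. So that the sketch runs without a JVM, FakeTokenStream and FakeTermAttribute below are hypothetical stand-ins, not part of PyLucene. With PyLucene itself you would presumably subclass PythonTokenStream and use the term attribute class your Lucene version provides, and the exact names and setters differ between versions.

```python
# Hypothetical stand-ins so the sketch runs without PyLucene.
# Neither class is part of PyLucene; they only imitate the
# addAttribute/getAttribute contract of a token stream.
class FakeTermAttribute:
    """Holds the text of the current token."""
    def __init__(self):
        self.term = None


class FakeTokenStream:
    """Keeps one shared attribute instance per attribute class."""
    def __init__(self):
        self._attributes = {}

    def addAttribute(self, cls):
        # Register the attribute once and return the shared instance.
        return self._attributes.setdefault(cls, cls())

    def getAttribute(self, cls):
        return self._attributes[cls]


class SequenceTokenStream(FakeTokenStream):
    """Emits one token per item of a Python sequence."""
    def __init__(self, terms):
        super().__init__()
        self.terms = iter(terms)
        self.addAttribute(FakeTermAttribute)

    def incrementToken(self):
        # Write the next term into the attribute and report success;
        # return False once the sequence is exhausted.
        for term in self.terms:
            self.getAttribute(FakeTermAttribute).term = term
            return True
        return False


def collect(stream):
    """Drain a stream the way a consumer would and return the terms seen."""
    seen = []
    while stream.incrementToken():
        seen.append(stream.getAttribute(FakeTermAttribute).term)
    return seen


terms = ['pant', 'on', 'ground', 'look', 'like', 'fool']
print(collect(SequenceTokenStream(terms)))
# prints ['pant', 'on', 'ground', 'look', 'like', 'fool']
```

The point is that the terms are not attached to the stream up front. The consumer calls incrementToken repeatedly and reads the current token from the attribute after each call.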
The result of addAttribute could also be stored, obviating the need for getAttribute. My lupyne project has an example of that.