I’ve written an algorithm in Ruby, based on the arc90 Readability code, that extracts an article from a web page.
Now that I have the article, I want to extract keywords and specific information from it (names, author, etc.).
I’ve heard Alchemy is a great Ruby gem for this, though it consumes a lot of resources. Are there any better gems I can use?
There is an OpenCalais gem which provides similar capability. In addition to entity extraction, it can also detect events and relations between entities. It’s not lightweight either, and I can’t say whether it’s better or worse than Alchemy, as I haven’t used the Alchemy gem. Hope this helps.
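If both of those turn out to be too heavy, a rough first pass at keywords can be done in plain Ruby with no gem at all. The following is only a minimal sketch of naive term-frequency counting, which is a different and much cruder technique than what OpenCalais or Alchemy do: it will not find names, authors, events or relations. The names STOPWORDS and top_keywords, the stopword list, and the three-letter minimum are all assumptions made up for this illustration.

```ruby
# Naive term-frequency keyword extraction using only core Ruby.
# NOTE: STOPWORDS and top_keywords are hypothetical names for this sketch;
# this is not the API of the OpenCalais or Alchemy gems.
STOPWORDS = %w[
  a an and are as at be but by for from has have in is it its
  of on or that the this to was were with
].freeze

# Returns up to `limit` of the most frequent non-stopword terms in `text`.
def top_keywords(text, limit = 5)
  # Lowercase the text and pull out runs of ASCII letters and apostrophes.
  words = text.downcase.scan(/[a-z']+/)

  counts = Hash.new(0)
  words.each do |word|
    # Skip very short words and common stopwords (assumed thresholds).
    next if word.length < 3 || STOPWORDS.include?(word)
    counts[word] += 1
  end

  # Sort by descending count, then alphabetically so ties are deterministic.
  counts.sort_by { |word, count| [-count, word] }.first(limit).map(&:first)
end
```

You would call it with the article text your readability code produces, for example top_keywords(article_text, 10). It only handles plain ASCII English, so treat it as a baseline to compare against the heavier services rather than a replacement for them.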