The Archive Base

Editorial Team
Asked: June 16, 2026

I’d like to incorporate a custom tagger into a web application (running on Pyramid)

I’d like to incorporate a custom tagger into a web application (running on Pyramid) that I’m developing. The tagger works fine on my local machine using NLTK, but I’ve read that NLTK is relatively slow for production.

It seems that the standard way of storing the tagger is to pickle it. On my machine, the 11.7MB pickle file takes a few seconds to load.

  1. Is NLTK even practical for production? Should I be looking at scikit-learn or even something like Mahout?

  2. If NLTK is good enough, what is the best way to ensure that it properly uses memory, etc.?


1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 5:04 pm

    I run text-processing and its associated NLP APIs, which use about two dozen pickled models loaded by a Django app (gunicorn behind nginx). Each model is loaded the first time it is needed and then stays in memory. So whenever I restart the gunicorn server, the first request that needs a given model has to wait a few seconds for it to load, but every subsequent request uses the copy already cached in RAM. Restarts only happen when I deploy new features, which usually involves updating the models, so I’d need to reload them anyway. If you don’t expect to make code changes very often and don’t have strong requirements on consistent request times, you probably don’t need a separate daemon.
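
The load-on-first-use pattern described above could be sketched roughly as follows. This is only an illustration, not the code the service actually runs: the `ModelCache` class, its method names, and the file paths are hypothetical, and it assumes the pickle files come from a trusted source, since unpickling untrusted data is unsafe.

```python
import pickle
import threading


class ModelCache:
    """Hypothetical helper: load pickled models on first use, then keep them in RAM."""

    def __init__(self, paths):
        self._paths = dict(paths)      # model name -> path of its pickle file
        self._models = {}              # model name -> loaded object
        self._lock = threading.Lock()  # guards loading when workers use threads

    def get(self, name):
        # Fast path: the model was already loaded by an earlier request.
        model = self._models.get(name)
        if model is not None:
            return model
        with self._lock:
            # Re-check inside the lock so two threads do not load the same file.
            if name not in self._models:
                with open(self._paths[name], "rb") as f:
                    self._models[name] = pickle.load(f)  # trusted files only
            return self._models[name]

    def warm_up(self):
        # Optional: load everything at startup to avoid a slow first request.
        for name in self._paths:
            self.get(name)
```

A single module-level instance, for example `MODELS = ModelCache({"tagger": "/path/to/tagger.pickle"})` with a placeholder path, would then be shared by every request handled in that worker process, and a view would call `MODELS.get("tagger")`. Calling `warm_up()` when the application starts trades a slower startup for consistent request times.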

    Other than the initial load time, the main limiting factor is memory. I currently have only 1 worker process, because with all the models loaded, a single process can take up to 1GB (YMMV, and for a single 11MB pickle file, your memory requirements will be much lower). Processing an individual request with an already loaded model is fast enough (usually <50ms) that I currently don’t need more than 1 worker, and if I did, the simplest solution would be to add enough RAM to run more worker processes.
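
As a back-of-the-envelope illustration of that trade-off, the number of workers a machine can hold is roughly its usable RAM divided by the memory of one fully loaded worker. The function name, the 4GB machine, and the 512MB reserved for the OS and nginx below are assumptions made for the example; only the 1GB-per-worker figure comes from the answer.

```python
def max_workers(total_ram_mb, per_worker_mb, reserved_mb=512):
    """Estimate how many fully loaded worker processes fit in RAM.

    reserved_mb is an assumed allowance for the OS, nginx, and other
    processes; measure your own system rather than trusting the default.
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // per_worker_mb, 0)


# Hypothetical 4GB server with workers that each grow to about 1GB:
print(max_workers(4096, 1024))  # 3
```

Each worker process keeps its own copy of every model it has loaded, so the per-worker figure should be measured with all models in memory.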

    If you are worried about memory, then do look into scikit-learn, since equivalent models can use significantly less memory than NLTK. They are not necessarily faster or more accurate, though.


© 2021 The Archive Base. All Rights Reserved