The Archive Base


Editorial Team
Asked: May 14, 2026 (2026-05-14T21:38:39+00:00)


I am evaluating full-text search engines for Django.
The engine must be simple to install, index quickly, update its index quickly, not block while indexing, and search quickly.

After reading many web pages, I shortlisted:
MySQL MyISAM fulltext, djapian/python-xapian, and django-sphinx.
I did not choose Lucene because it seems complex, nor Haystack, as it has fewer features than djapian/django-sphinx (such as field weighting).

Then I ran some benchmarks. To do so, I collected many free books from the net to generate a database table with 1 485 000 records (id, title, body); each record is about 600 bytes long.
From the database, I also generated a list of 100 000 existing words and shuffled them to create a search list. For the tests, I made 2 runs on my laptop (4 GB RAM, dual core 2.0 GHz): the first just after a server reboot to clear all caches, the second immediately afterwards to test how well results are cached. Here are the “home made” benchmark results:

1485000 records with Title (150 bytes) and body (450 bytes)

MySQL 5.0.75/Ubuntu 9.04 Fulltext :
==========================================================================

Full indexing : 7m14.146s

1 thread, 1000 searches with single word randomly taken from database : 
First run : 0:01:11.553524
next run : 0:00:00.168508

MySQL 5.5.4 m3/Ubuntu 9.04 Fulltext :
==========================================================================

Full indexing : 6m08.154s

1 thread, 1000 searches with single word randomly taken from database : 
First run : 0:01:09.553524
next run : 0:00:20.316903

1 thread, 100000 searches with single word randomly taken from database : 
First run : 9m09s
next run : 5m38s

1 thread, 10000 random strings (random strings should not be found in database) :
just after the 100000 search test : 0:00:15.007353

1 thread, boolean search : 1000 x (+word1 +word2) 
First run : 0:00:21.205404
next run : 0:00:00.145098

Djapian Fulltext : 
==========================================================================

Full indexing : 84m7.601s

1 thread, 1000 searches with single word randomly taken from database with prefetch : 
First run : 0:02:28.085680
next run : 0:00:14.300236

python-xapian Fulltext :
==========================================================================

1 thread, 1000 searches with single word randomly taken from database : 
First run : 0:01:26.402084
next run : 0:00:00.695092

django-sphinx Fulltext :
==========================================================================

Full indexing : 1m25.957s

1 thread, 1000 searches with single word randomly taken from database : 
First run : 0:01:30.073001
next run : 0:00:05.203294

1 thread, 100000 searches with single word randomly taken from database : 
First run : 12m48s
next run : 9m45s

1 thread, 10000 random strings (random strings should not be found in database) :
just after the 100000 search test : 0:00:23.535319

1 thread, boolean search : 1000 x (word1 word2) 
First run : 0:00:20.856486
next run : 0:00:03.005416
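
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing harness in Python. It is not the script used for the numbers above: `time_searches`, `benchmark`, and the `search` callable are hypothetical names, and the harness does not clear any caches itself, so the first run is only "cold" if the server was restarted beforehand, as in the tests described above.

```python
import random
import time
from datetime import timedelta


def time_searches(search, words, count, seed=0):
    """Run `count` single-word searches and return the elapsed wall-clock time."""
    rng = random.Random(seed)  # fixed seed: every run searches the same words
    sample = [rng.choice(words) for _ in range(count)]
    start = time.perf_counter()
    for word in sample:
        search(word)
    return timedelta(seconds=time.perf_counter() - start)


def benchmark(search, words, count=1000):
    """Time a first run, then an identical second run that can benefit from caches."""
    first = time_searches(search, words, count)
    second = time_searches(search, words, count)
    return first, second


if __name__ == "__main__":
    # Stand-in backend (assumption): a dict lookup instead of a real engine.
    corpus = {"alpha": [1, 2], "beta": [3], "gamma": [4, 5, 6]}
    first, second = benchmark(lambda w: corpus.get(w, []), list(corpus))
    print("First run :", first)
    print("next run  :", second)
```

To benchmark a real engine, pass a function that issues one query against it as `search`, and the shuffled word list as `words`.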

As you can see, MySQL is not bad at all for full-text search. In addition, its query cache is very efficient.

MySQL seems to me a good choice, as there is nothing to install (I just need to write a small script to synchronize an InnoDB production table to a MyISAM search table) and I do not really need advanced search features such as stemming.
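
As a rough illustration of that synchronization script, here is a minimal Python sketch. The table names `book` (InnoDB) and `book_search` (MyISAM, with a FULLTEXT index on `title, body`) are assumptions, as are the function names. It expects any DB-API cursor that uses the `%s` parameter style (for example a Django database cursor), and it only copies rows with a higher `id` than the last one already copied, so updates and deletes in the production table would need separate handling.

```python
def sync_new_rows(cursor):
    """Copy rows added to `book` since the last sync into `book_search`."""
    # Highest id already present in the search table (0 if it is empty).
    cursor.execute("SELECT COALESCE(MAX(id), 0) FROM book_search")
    last_id = cursor.fetchone()[0]
    cursor.execute(
        "INSERT INTO book_search (id, title, body) "
        "SELECT id, title, body FROM book WHERE id > %s",
        (last_id,),
    )
    return last_id


def boolean_query(words):
    """Build a boolean-mode expression requiring every word: '+word1 +word2'."""
    return " ".join("+" + word for word in words)


def search(cursor, words, limit=20):
    """Run a MATCH ... AGAINST query in boolean mode on the search table."""
    cursor.execute(
        "SELECT id, title FROM book_search "
        "WHERE MATCH(title, body) AGAINST (%s IN BOOLEAN MODE) LIMIT %s",
        (boolean_query(words), limit),
    )
    return cursor.fetchall()
```

The words are passed through unchanged here; input containing boolean-mode operators (`+`, `-`, `*`, quotes) would need cleaning first.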

Here is the question: what do you think about the MySQL full-text search engine vs Sphinx and Xapian?


1 Answer

    Editorial Team, added an answer on May 14, 2026 at 9:38 pm

    I haven’t tested Xapian, but I gave a presentation last year comparing full-text solutions:
    http://www.slideshare.net/billkarwin/practical-full-text-search-with-my-sql

    Sphinx is the fastest at searches. But it’s hard to index data that comes in incrementally, because adding data to an index is about as expensive as creating the whole index from scratch.

    So some people maintain two Sphinx indexes: one large index with archived data, and one small index with recent data. Periodically (e.g. weekly) they merge the recent index into the archived index (merging two indexes is less expensive than rebuilding), and truncate the small index to prepare for the new week. This works great for something like a forum, but not as well for a wiki.
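
The two-index scheme can be illustrated with a toy in-memory model in Python. This is only a sketch of the idea, not Sphinx's API or file format: `TwoTierIndex` and its methods are hypothetical names, and the "indexes" are plain dictionaries mapping words to document ids.

```python
from collections import defaultdict


class TwoTierIndex:
    """Toy inverted index split into a large 'main' part and a small 'delta' part."""

    def __init__(self):
        self.main = defaultdict(set)   # archived data, rebuilt rarely
        self.delta = defaultdict(set)  # recent data, cheap to update

    def add(self, doc_id, text):
        # New documents only touch the small delta index.
        for word in text.lower().split():
            self.delta[word].add(doc_id)

    def search(self, word):
        # A query consults both indexes and returns the union of the hits.
        word = word.lower()
        return self.main.get(word, set()) | self.delta.get(word, set())

    def merge(self):
        # Periodic job: fold the delta into main, then truncate the delta.
        for word, doc_ids in self.delta.items():
            self.main[word] |= doc_ids
        self.delta.clear()


index = TwoTierIndex()
index.add(1, "full text search")
index.merge()                          # document 1 is now archived
index.add(2, "incremental search")     # document 2 lives in the delta
print(sorted(index.search("search")))  # [1, 2]
```

The wiki case is harder because edits to old documents would have to update or mask entries that already sit in the large index, which this model does not attempt.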

    You might also check out Apache Solr. This is a wrapper for Lucene, and it makes using Lucene a lot easier and yet more featureful. I didn’t know about Solr when I designed that presentation.

    The Washington Times is an example of a project that uses Solr together with Django:

    • http://www.screeley.com/djangosolr/
    • http://www.chrisumbel.com/article/django_solr