I have a Ruby on Rails 3 Heroku application, which needs to perform text

Question


Asked: May 29, 2026 (2026-05-29T22:59:12+00:00)


I have a Ruby on Rails 3 Heroku application, which needs to perform text search on a few models. Each model has a large dataset, and that dataset is expected to grow considerably.

I want to be able to do fast text search on columns like title and description. Simple queries, such as "give me all Articles with “postgresql” (case-insensitive) in their title or body". I need multilingual capability too.

Currently, my DB is not being used in production, and I’m using the Ronin plan, which gives a dedicated db using PostgreSQL.

In order to do that, I decided to go with a plugin called texticle. That plugin allows full text search using PostgreSQL's built-in capabilities. However, it did not work smoothly, so I decided to build full text indexes myself.

I ran the following query on a table with 15 million entries. 20 hours later, it is still running.

create index on articles using gin(to_tsvector('english', title));

My questions :

1- Is it normal for this index to take so long to build?

2- Is there any way to find out the status of that index build-up? It doesn’t show yet in my indexes usage table.

3- What about my approach? Am I looking at this the wrong way? Would you have other recommendations? I would like to keep my budget low for now, but be able to easily migrate to an effective, scalable, production-quality solution when the need arises.

Thanks


1 Answer

Editorial Team · Answer 1 · 2026-05-29T22:59:13+00:00

1- Is it normal for this index to take so long to build?

No.

This is on my PostgreSQL 9.0 server, which runs on a single-core AMD Athlon 64 3700+:

filip@filip=# create table articles as select i, md5('the ' || random()::text || ' feds took my ' || random()::text ) as title from generate_series(1,15000000) i;
SELECT 15000000
Time: 91851.97 ms
filip@filip=# create index on articles using gin(to_tsvector('english', title));
CREATE INDEX
Time: 340802.395 ms

As you can see, building the GIN index on 15 million rows took about 340 seconds (BTW, the table size was 977 MB and the index size was 319 MB).

Turning text documents into tsvector and building a GIN (or GIST) index is CPU-intensive.

I don’t know the exact specs of Heroku's Ronin plan in terms of CPU power. Can you tell us what it compares to?

Index build performance is also very sensitive to the maintenance_work_mem setting. The memory needed (and the size of the index) depends on the input data, and might be anywhere from 20% to 150% of the input data size.
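As a rough sketch of the maintenance_work_mem point: the setting can be raised for the current session just before building the index. The value '1GB' and the index name articles_title_fts_idx are assumptions for illustration only; pick a value that fits the memory actually available on your plan.

```sql
-- Check the current per-session value first.
SHOW maintenance_work_mem;

-- Assumption: '1GB' is an illustrative value, not a recommendation.
-- SET only affects the current session.
SET maintenance_work_mem = '1GB';

-- Same index as in the question, with a hypothetical explicit name.
CREATE INDEX articles_title_fts_idx
    ON articles
 USING gin (to_tsvector('english', title));

-- Return to the server default afterwards.
RESET maintenance_work_mem;
```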

2- Is there any way to find out the status of that index build-up? It doesn’t show yet in my indexes usage table.

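Unfortunately, no. PostgreSQL 9.0 does not have this kind of “introspection” (a progress view for this, pg_stat_progress_create_index, only appeared much later, in PostgreSQL 12).

You could create the same index on a 10% sample and multiply the build time by ten for a rough estimate.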
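One way to run that estimate in psql, as a sketch. The table name articles_sample and the index name are hypothetical, and random() < 0.1 yields roughly (not exactly) a 10% sample. Build time does not necessarily scale linearly with row count, so treat the result as a ballpark figure.

```sql
-- \timing is a psql meta-command; it prints the elapsed time of each statement.
\timing on

-- Copy roughly 10% of the titles into a scratch table (hypothetical name).
CREATE TABLE articles_sample AS
SELECT title
  FROM articles
 WHERE random() < 0.1;

-- Build the same kind of index on the sample and note the reported time.
CREATE INDEX articles_sample_title_fts_idx
    ON articles_sample
 USING gin (to_tsvector('english', title));

-- Clean up the scratch table (this also drops its index).
DROP TABLE articles_sample;
```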

3- What about my approach.

Nothing wrong with it. Since PostgreSQL has built-in FTS, it is a good place to start.

But if you need a faster solution (in both indexing time and search speed), the only way is to go outside the database. External solutions like Sphinx or Lucene are faster (about 10x, in my experience).
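For reference, a minimal sketch of a search that can use the index from the question. The WHERE expression has to match the indexed expression, including the 'english' configuration; the search term is just an example. Matching is case-insensitive because to_tsvector and plainto_tsquery both normalize words to lowercase lexemes.

```sql
-- Find articles whose title contains the word "postgresql".
SELECT title
  FROM articles
 WHERE to_tsvector('english', title) @@ plainto_tsquery('english', 'PostgreSQL');

-- Prefix the query with EXPLAIN to check whether the planner picks the GIN index.
EXPLAIN
SELECT title
  FROM articles
 WHERE to_tsvector('english', title) @@ plainto_tsquery('english', 'PostgreSQL');
```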
