I’m building a web app with Django, using Sphinx for free-text search. I need to apply additional restrictions before making a request to searchd. Consider two tables:
Entity
id
title
description
created_by_id
updated_by_id
created_date
updated_date
and
EntityUser
id
entity_id [FK to the table above]
joining_user_id
is_approved
created_by_id
updated_by_id
created_date
updated_date
I’ve built an RT index for the main table, Entity, and it works fine. Now I want to query only those entities that the user has joined, i.e. those for which EntityUser contains a record with the given user_id and entity_id and is_approved=1. The problem is that I can’t index EntityUser, because it has no string fields: as you can see, the table holds only integers and timestamps. Even if I could build an index for that table, I’m not sure a SphinxQL query can contain a subquery against another index. Since Sphinx has been used with great success in quite big projects, I doubt this is a limitation of Sphinx. Is it bad design of the DB/application, or a lack of knowledge on my part about how to build a proper RT index? Can I somehow extend the existing index so that I can apply the restriction above?
I considered applying the additional restrictions on the MySQL side, after Sphinx returns the record IDs, but that’s not going to work: Sphinx would return the N records with the highest weight, and after applying the restrictions the result could be empty. So I need to narrow the search area first and then run the query only on the entities the user is allowed to see.
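To illustrate why filtering after the search falls short, here is a minimal Python sketch with made-up data. The names ranked_ids (IDs in the order searchd might return them, highest weight first) and visible_ids (entities the user has joined) are hypothetical and only serve the example.

```python
def post_filter(ranked_ids, visible_ids, limit):
    """Take the top `limit` matches first, then drop those the user cannot see."""
    return [i for i in ranked_ids[:limit] if i in visible_ids]


def pre_filter(ranked_ids, visible_ids, limit):
    """Restrict to visible entities first, then take the top `limit` matches."""
    return [i for i in ranked_ids if i in visible_ids][:limit]


# Hypothetical data: entity IDs ordered by descending weight.
ranked_ids = [11, 12, 13, 14, 15, 16]
# Hypothetical set of entities the user has joined and been approved for.
visible_ids = {15, 16}

top_then_filter = post_filter(ranked_ids, visible_ids, limit=3)
filter_then_top = pre_filter(ranked_ids, visible_ids, limit=3)
```

With this data, filtering after taking the top 3 leaves nothing, while filtering first still finds the two entities the user can see.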
Actually, I’ve found the answer, and it has nothing to do with the design of the application or the DB.
In fact it’s simple: I just need to use an MVA (multi-valued attribute) in the RT index, as I would for a plain one (rt_attr_multi or rt_attr_multi_64). In the configuration file I have to add something like this:
...
rt_attr_multi = entity_users
}
and then populate it with the IDs of the users who have joined the Entity and have been approved. The problem was that I couldn’t understand how to use an MVA with an RT index, but now it’s clear. I think there are not enough real-world examples of RT indexes with MVAs, so I’m sharing this to help others solve similar problems.
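As a rough sketch of how populating and querying the MVA might look from Python, the helpers below only build SphinxQL statements; they do not talk to searchd. Several things here are assumptions rather than part of my setup: the index name entity_rt, the helper names, the row layout (entity_id, joining_user_id, is_approved), and the %s placeholder style of a MySQL client library such as pymysql connected to searchd’s SphinxQL listener. As far as I know, comparing an MVA with = matches when any of its values equals the given one.

```python
from collections import defaultdict


def approved_users_by_entity(entity_user_rows):
    """Group approved joining_user_id values by entity_id.

    Each row is (entity_id, joining_user_id, is_approved), mirroring EntityUser.
    """
    users = defaultdict(list)
    for entity_id, joining_user_id, is_approved in entity_user_rows:
        if is_approved:
            users[int(entity_id)].append(int(joining_user_id))
    return dict(users)


def build_replace(index, entity_id, title, description, user_ids):
    """Build a SphinxQL REPLACE that fills the entity_users MVA.

    The MVA is written inline as a parenthesised list of integers;
    the other values are left as %s placeholders for the client library.
    """
    mva = "(" + ",".join(str(int(u)) for u in user_ids) + ")"
    sql = (
        f"REPLACE INTO {index} (id, title, description, entity_users) "
        f"VALUES (%s, %s, %s, {mva})"
    )
    return sql, (int(entity_id), title, description)


def build_search(index, text, user_id, limit=20):
    """Build a full-text query restricted to entities the user has joined."""
    sql = (
        f"SELECT id FROM {index} WHERE MATCH(%s) "
        f"AND entity_users = {int(user_id)} LIMIT {int(limit)}"
    )
    return sql, (text,)


# Hypothetical EntityUser rows: (entity_id, joining_user_id, is_approved).
rows = [(1, 7, 1), (1, 8, 0), (1, 9, 1), (2, 7, 1)]
users = approved_users_by_entity(rows)
replace_sql, replace_params = build_replace("entity_rt", 1, "t", "d", users[1])
search_sql, search_params = build_search("entity_rt", "hello", 7)
```

Each (sql, params) pair could then be passed to the cursor.execute() of whatever MySQL client is connected to searchd. The important part is that the restriction travels with the full-text query, so the LIMIT applies only to entities the user can see.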
UPDATE: I spent the last hour fighting to generate the RT index and kept getting “unknown column: 'entity_users'”. I finally found the reason: if you add an MVA to an RT index (I don’t know whether the same applies to plain indexes), you have to not only restart the searchd daemon (service), but also DELETE everything in the “data” folder (or wherever your index is stored)!