Editorial Team
Asked: May 26, 2026


This is a continuation of the project outlined in this question.

I have the following model:

class Product {
  public string Id { get; set; }
  public string[] Specs { get; set; }
  public int CategoryId { get; set; }
}

The “Specs” array stores product specification name-value pairs, each joined by a special character. For example, if a product is colored blue, the spec string would be “Color~Blue”. Representing specs this way allows querying for products that have several spec values specified by a query. There are two principal queries that I would like to support:

  1. Get all products in a given category.
  2. Get all products in a given category which have a set of specified specs.
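The “Color~Blue” spec encoding described above could be handled with a small helper along these lines. This is only a sketch: the class name SpecString and its methods are hypothetical, and it assumes that spec names never contain the “~” separator.

```csharp
using System;

// Hypothetical helper (not part of the original project). The '~' separator
// is taken from the "Color~Blue" example in the question.
public static class SpecString
{
    public const char Separator = '~';

    // Joins a spec name and value into the stored form, e.g. "Color~Blue".
    public static string Format(string name, string value)
    {
        if (name.IndexOf(Separator) >= 0)
            throw new ArgumentException("Spec name must not contain the separator.", nameof(name));
        return name + Separator + value;
    }

    // Splits a stored spec string back into name and value.
    // Only the first separator is significant, so values may contain '~'.
    public static (string Name, string Value) Parse(string spec)
    {
        int i = spec.IndexOf(Separator);
        if (i < 0)
            throw new FormatException("Spec string has no separator: " + spec);
        return (spec.Substring(0, i), spec.Substring(i + 1));
    }
}
```

Splitting on the first separator only keeps parsing unambiguous as long as the name side is separator-free, which is why Format rejects such names.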

This works well with RavenDB. However, in addition to the products satisfying a given query, I would like to return a result set containing all spec name-value pairs for the set of products matched by the query. The pairs should be grouped by the name and value of the spec, with a count of the products that have each pair. For query #1 I created the following map-reduce index:

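// Namespaces assume an older RavenDB client, where AbstractIndexCreationTask
// lives in Raven.Client.Indexes (newer clients use Raven.Client.Documents.Indexes).
using System.Linq;
using Raven.Client.Indexes;

// Reduce result: one entry per (category, spec string) with a product count.
class CategorySpecGroups {
    public int CategoryId { get; set; }
    public string Spec { get; set; }
    public int Count { get; set; }
}

public class SpecGroups_ByCategoryId : AbstractIndexCreationTask<Product, CategorySpecGroups>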
{
    public SpecGroups_ByCategoryId()
    {
        this.Map = products => from product in products
                               where product.Specs != null
                               from spec in product.Specs
                               select new
                               {
                                   CategoryId = product.CategoryId,
                                   Spec = spec,
                                   Count = 1
                               };

        this.Reduce = results => from result in results
                                 group result by new { result.CategoryId, result.Spec } into g
                                 select new
                                 {
                                     CategoryId = g.Key.CategoryId,
                                     Spec = g.Key.Spec,
                                     Count = g.Sum(x => x.Count)
                                 };
    }
}

I can then query this index and get all spec name-value pairs in a given category. The problem I am running into is getting the same result set for a query that filters by both a category and a set of spec name-value pairs. In SQL, this result set would be obtained with a GROUP BY over the set of products filtered by category and specs. In general this type of query is expensive, but when filtering by both category and specs the product sets are normally small, though not small enough to fit into a single page: they may contain up to 1000 products. For reference, MongoDB supports a group method which can be used to achieve the same result set. It performs the ad hoc grouping server side, and the performance is acceptable.

How can I get this type of result set using RavenDB?

One possible solution is to fetch all the products for a query and perform the grouping in memory. Another option is to create a map-reduce index as above; the challenge there would be deducing all possible spec selections that can be made for a given category, and this type of index might explode in size.
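The in-memory option could look roughly like the sketch below. It assumes the filtered products (up to about 1000) have already been loaded from the database, page by page if necessary. The SpecCount and InMemoryFacets names are hypothetical, and counting each product at most once per spec is an assumption of this sketch rather than something the index above enforces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Id { get; set; }
    public string[] Specs { get; set; }
    public int CategoryId { get; set; }
}

// Hypothetical result type: one spec string and the number of products having it.
public class SpecCount
{
    public string Spec { get; set; }
    public int Count { get; set; }
}

public static class InMemoryFacets
{
    // Groups the already-filtered products by spec string and counts them.
    public static List<SpecCount> CountSpecs(IEnumerable<Product> filteredProducts)
    {
        return filteredProducts
            .Where(p => p.Specs != null)
            .SelectMany(p => p.Specs.Distinct())   // count a product once per spec
            .GroupBy(spec => spec)
            .Select(g => new SpecCount { Spec = g.Key, Count = g.Count() })
            .OrderBy(c => c.Spec, StringComparer.Ordinal)
            .ToList();
    }
}
```

The cost of this approach is dominated by transferring the product documents to the client, so it is only reasonable while the filtered sets stay small.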

For an example, take a look at this fastener category page. The user can filter their selection by selecting attributes. When an attribute is selected it narrows the selection of products and displays the attributes within the new set of products. This type of interaction is typically called faceted search.

EDIT

In the meantime, I will attempt a solution using Solr, as it supports faceted search out of the box.

EDIT 2

It appears that RavenDB also supports faceted search (which of course makes sense, since its indexes are stored by Lucene, just as in Solr). I will explore this and post updates.

EDIT 3

The RavenDB faceted search functionality works as expected. I store a facet setup document for each category ID, which is used to calculate facets for a query within a given category. The issue I am having now is performance. For a collection of 500k products with 4500 distinct categories, resulting in 4500 facet setup documents, a query by category id takes about 16 seconds when also querying for facets and about 0.05 seconds when not querying for facets. The particular category tested contains about 6k products, 23 distinct facets and 2k distinct facet name-range combinations.

After looking at the code in FacetedQueryRunner, it appears that a facets query results in a Lucene query for every facet name-value combination to get the counts, as well as a query for each facet name to get the terms. One problem with the implementation is that it retrieves all the distinct terms for a given facet name regardless of the query, even though in most cases the query would significantly reduce the number of terms for a facet and therefore the number of Lucene queries.

One way to improve performance here would be to store a MapReduce-computed result set (as shown above) for each facet setup document, which could then be queried to get all the distinct terms when further filtering by facets. The overall performance, however, may still be too slow.

1 Answer

    Editorial Team
    Added an answer on May 26, 2026 at 2:14 am

    I’ve implemented this feature using RavenDB faceted search; however, I made some changes to FacetedQueryRunner to support a heuristic optimization. The heuristic is that, in my case, facets are only displayed in leaf categories. This is a reasonable constraint, since navigation between root and internal categories can be driven by either search or listings of child categories.

    Given this constraint, I store a FacetSetup document for each leaf category, with an Id such as “facets/category_123”. When the facet setup document is stored, I have access to the facet names as well as the facet values (or ranges) contained in the category. Therefore, I can store all available facet values in the Ranges collection of each Facet in the FacetSetup document, while the facet mode remains FacetMode.Default.

    Here are the changes to FacetedQueryRunner. Specifically, the optimization checks whether a given facet stores ranges, in which case it returns those values to use for searching instead of getting all terms in the index associated with that facet. In most cases this significantly reduces the number of Lucene searches required, since the available facet values in a given category are a subset of the facet values in the entire index.
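The idea behind that check can be illustrated with the following sketch. It is not the modified FacetedQueryRunner code: SimpleFacet is a simplified stand-in for a facet definition, and the getAllTermsFromIndex delegate is a hypothetical placeholder for the expensive index-wide term lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for a facet definition; not RavenDB's actual Facet class.
public class SimpleFacet
{
    public string Name { get; set; }
    public List<string> Ranges { get; set; } = new List<string>();
}

public static class FacetTermSource
{
    // Returns the values to run count queries for. If the facet already stores
    // its values (here: in Ranges), those are used; otherwise the sketch falls
    // back to the index-wide lookup supplied by the caller.
    public static IReadOnlyList<string> GetCandidateValues(
        SimpleFacet facet,
        Func<string, IEnumerable<string>> getAllTermsFromIndex)
    {
        if (facet.Ranges != null && facet.Ranges.Count > 0)
            return facet.Ranges;

        return getAllTermsFromIndex(facet.Name).ToList();
    }
}
```

With stored values, the number of count queries is bounded by what the category actually contains rather than by every term in the index.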

    The next optimization that can be made is that, if the original query only filters by a category id, the FacetSetup document can store the counts as well. One way to do this, albeit a hacky one, would be to append the count to each facet value in the Ranges collection and add a boolean to the FacetSetup document to indicate that counts are appended. The facet query then simply returns the values in the FacetSetup document, with no need to query the index.
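A minimal sketch of the count-appending idea follows. The “|” separator and the CountedFacetValue helper are assumptions made for illustration; the answer does not say how the count is actually appended.

```csharp
using System;
using System.Globalization;

// Hypothetical encoding of "facet value plus product count" in one string.
public static class CountedFacetValue
{
    private const char CountSeparator = '|'; // assumed separator

    // Appends the count to the facet value, e.g. ("Blue", 42) becomes "Blue|42".
    public static string Encode(string value, int count)
    {
        return value + CountSeparator + count.ToString(CultureInfo.InvariantCulture);
    }

    // Splits on the LAST separator so that values containing '|' still decode.
    public static (string Value, int Count) Decode(string encoded)
    {
        int i = encoded.LastIndexOf(CountSeparator);
        if (i < 0)
            throw new FormatException("No count appended: " + encoded);

        int count = int.Parse(encoded.Substring(i + 1), CultureInfo.InvariantCulture);
        return (encoded.Substring(0, i), count);
    }
}
```

A reader of the setup document would only call Decode when the boolean flag mentioned above says that counts are present.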

    One remaining concern is keeping the FacetSetup documents up to date, although this would be required either way. Beyond this optimization, caching can be used, which I believe is the approach taken by Solr faceted search.

    Furthermore, it would be nice if the FacetSetup documents were automatically synchronized with the product collection, since they are effectively the result of an aggregating MapReduce operation over the set of products, grouping first by category id, then by the name of the facet and then by the values.
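That aggregation can be sketched in memory as shown below, grouping by category id, then facet name, then value. The FacetSetupBuilder name and the nested-dictionary result shape are hypothetical, and the sketch assumes the “Name~Value” spec format from the question; it does not produce actual FacetSetup documents.

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public string Id { get; set; }
    public string[] Specs { get; set; }
    public int CategoryId { get; set; }
}

public static class FacetSetupBuilder
{
    // Builds: category id -> facet name -> distinct facet values.
    public static Dictionary<int, Dictionary<string, SortedSet<string>>> Build(
        IEnumerable<Product> products)
    {
        var result = new Dictionary<int, Dictionary<string, SortedSet<string>>>();
        foreach (var product in products)
        {
            if (product.Specs == null) continue;

            if (!result.TryGetValue(product.CategoryId, out var facets))
            {
                facets = new Dictionary<string, SortedSet<string>>();
                result[product.CategoryId] = facets;
            }

            foreach (var spec in product.Specs)
            {
                int i = spec.IndexOf('~');
                if (i < 0) continue; // skip specs that are not "Name~Value"

                string name = spec.Substring(0, i);
                if (!facets.TryGetValue(name, out var values))
                {
                    values = new SortedSet<string>(StringComparer.Ordinal);
                    facets[name] = values;
                }
                values.Add(spec.Substring(i + 1));
            }
        }
        return result;
    }
}
```

Each inner dictionary corresponds to what one per-category setup document would need to hold, so rerunning this over changed products is one way to think about the synchronization problem.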
