
The Archive Base

Editorial Team
Asked: May 30, 2026 (2026-05-30T15:07:32+00:00)

I’m facing a design problem in my project.

PROBLEM
I need to query Solr with every possible combination (roughly 20 million) of some parameters extracted from our lists, to test whether each one returns at least one result. If it doesn’t, the combination is inserted into a blacklist (used for statistical analysis and sitemap creation).

HOW I’M DOING IT NOW
Nested for loops combine the parameters (taken from Python lists) and pass each combination to a method (the same one the website uses in production to query the DB) that tests for zero results. If the count is 0, another method inserts the combination into the blacklist.
No threading is involved.
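As a minimal sketch of the current brute-force approach, the hand-nested loops can be collapsed into `itertools.product`, which yields combinations lazily instead of building them all up front. The parameter lists and the `fake_count_hits` function below are hypothetical stand-ins for the project's real lists and its production query method.

```python
import itertools

# Hypothetical parameter lists; the real ones come from the project's own data.
colors = ["red", "green"]
sizes = ["s", "m", "l"]
brands = ["acme", "globex"]

def iter_combinations(*param_lists):
    """Yield one tuple per combination, lazily, without nesting loops by hand."""
    return itertools.product(*param_lists)

def find_empty(combos, count_hits):
    """Return the combinations for which count_hits reports zero results."""
    return [combo for combo in combos if count_hits(combo) == 0]

# Stand-in for the real Solr query: pretend only "acme" products exist.
def fake_count_hits(combo):
    return 1 if combo[2] == "acme" else 0

blacklist = find_empty(iter_combinations(colors, sizes, brands), fake_count_hits)
```

This still issues one query per combination, so it only tidies the loop structure; it does not reduce the number of searches.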

HOW I’D LIKE TO DO THIS
I’d like to put all the combinations in a queue and let thread objects pull them, run the query and do the insert, for better performance.
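One way to sketch this queue-and-workers design is with a bounded queue that a producer fills lazily, so only a limited number of combinations are in memory at once and the total count never has to fit in the queue. The names `run`, `worker`, `count_hits` and `_STOP` are hypothetical; `count_hits` stands in for the production query method. Threads may help here on the assumption that the work is dominated by waiting on Solr rather than by Python computation.

```python
import itertools
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7

_STOP = object()  # sentinel telling a worker to exit

def worker(tasks, count_hits, blacklist, lock):
    """Pull combinations until the sentinel arrives; record the empty ones."""
    while True:
        combo = tasks.get()
        if combo is _STOP:
            break
        if count_hits(combo) == 0:
            with lock:
                blacklist.append(combo)

def run(param_lists, count_hits, n_workers=4, max_pending=1000):
    """Query every combination using n_workers threads; return the empty ones."""
    tasks = queue.Queue(maxsize=max_pending)  # put() blocks while the queue is full
    blacklist, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, count_hits, blacklist, lock))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Producer: combinations are generated one at a time, never stored as a whole.
    for combo in itertools.product(*param_lists):
        tasks.put(combo)
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    return blacklist
```

A call such as `run([colors, sizes], count_hits, n_workers=8)` returns the empty combinations in no particular order. The sketch does not handle exceptions raised by `count_hits`; a real version would need to catch them inside `worker`, otherwise a dead worker could leave the producer blocked on a full queue.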

WHAT PROBLEMS I’M EXPERIENCING
Slowness: being single-threaded, it takes a long time to complete (when and if it completes).

Connection reset by peer [104]: an error thrown by Solr after it has been queried for a while (I increased the pool size, but nothing changed). This is the most frequent (and annoying) error at the moment.
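The cause of the resets is not established here, so the following is only a hedged sketch of one common mitigation: retrying a failed query after an increasing delay. It assumes the client surfaces the reset as a socket-level error (`IOError`/`OSError`); if the Solr library wraps it in its own exception type, the `except` clause would need adjusting. `query_with_retry` and its parameters are hypothetical names.

```python
import time

def query_with_retry(query, combo, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call query(combo); on a connection-level error, wait and try again.

    The delay doubles after each failure (0.5 s, 1 s, 2 s, ...). The last
    failure is re-raised so the caller can log or skip the combination.
    """
    for attempt in range(attempts):
        try:
            return query(combo)
        except (IOError, OSError):  # socket errors derive from these
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Retrying treats the symptom only; if the server is resetting connections because it is overloaded, lowering the request rate is likely to matter more.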

Python hanging: I worked around this with a timeout decorator. It isn’t a proper solution, but it at least lets me get through the whole run and have a quick test output for now. I’ll drop it as soon as I find a smarter solution.

Queue max size: a queue object can hold up to 32k elements, so it won’t fit my numbers.

WHAT I’M USING
python 2.7
mysql
apache-solr
sunburnt (python interface to solr)
linux box

I don’t need any code debugging, since I’d rather throw away what I did and start fresh than patch it over and over. Trial and error is not what I like.

I’d welcome any suggestion that comes to mind for designing this the right way. Links, websites and guides are also very welcome, since I’m building my experience with this kind of script as I work.

Thanks in advance for your help! If something isn’t clear, just ask and I’ll answer or update the post as needed.

EDIT BASED ON SOME ANSWERS (will keep this updated)
I’ll probably drop Python threads for the multiprocessing lib: this could solve my performance issues.

Divide-and-conquer construction method: this should add some logic to how I build the parameters, without needing a brute-force approach.

What I still need to know: where can I store my combinations to feed the worker threads? Maybe this is no longer an issue, since the divide-and-conquer approach may let me generate the combinations at runtime and split them between the worker threads.

NB: I won’t accept any answer for now, since I’d like to keep this post alive for a while to gather more ideas (not only for me, but for future reference of others, given its generic nature).

Thanks again, all!

1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 3:07 pm

    Instead of brute force, change to using a divide-and-conquer approach while keeping track of the number of hits for each search. If you subdivide into certain combinations, some of those sets will be empty so you eliminate many subtrees at once. Add missing parameters into remaining searches and repeat until you are done. It takes more bookkeeping but many fewer searches.
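One possible reading of this answer, sketched below, is a depth-first search over partial combinations. It assumes the parameters act as ANDed filters, so adding a parameter can never increase the hit count; under that assumption a partial combination with zero hits makes every extension of it empty too, and none of those extensions needs its own query. The names `find_empty_prefixes`, `expand` and `count_hits` are hypothetical, and `count_hits` must accept a partial combination.

```python
import itertools

def find_empty_prefixes(param_lists, count_hits, prefix=()):
    """Yield the shortest partial combinations that match nothing.

    Assumes parameters are ANDed filters: an empty prefix implies that
    every extension of it is empty, so its subtree is skipped entirely.
    """
    depth = len(prefix)
    if depth == len(param_lists):
        return
    for value in param_lists[depth]:
        candidate = prefix + (value,)
        if count_hits(candidate) == 0:
            yield candidate  # prune: nothing below this point is queried
        else:
            # Add the next missing parameter and repeat on the survivors.
            for found in find_empty_prefixes(param_lists, count_hits, candidate):
                yield found

def expand(prefix, param_lists):
    """List every full combination covered by an empty prefix (no queries)."""
    rest = param_lists[len(prefix):]
    return [prefix + tail for tail in itertools.product(*rest)]
```

The blacklist can store either the short prefixes or their expansions via `expand`. How many queries this saves depends on the data: the gain comes from prefixes that turn out empty early, and ordering the parameters so the most selective ones come first would likely increase it.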

© 2021 The Archive Base. All Rights Reserved