The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T13:40:45+00:00)

I have an issue while building my Solr index (Lucene & Solr 3.4.0 on an Apache Tomcat 6.0.33).

The data for the documents to index comes out of an Oracle database. Since I have to handle loads of CLOBs, I split the dataimport into several requestHandlers to speed up fetching the data from the database (multithreading simulation). These requestHandlers are configured in my solrconfig.xml as follows:

<requestHandler name="/segment-#" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">segment-#.xml</str>
    </lst>
</requestHandler>

To build the index, I start the first DataImportHandler with the clean=true option and then start the full-import of all other segments. When all segments are through, the status pages (http://host/solr/segment-#) tell me that for each segment the correct number of rows (according to the SELECT COUNT(*) statement in the database) was fetched and processed. Fine so far.

But if I now call the status page of the core (http://host/solr/admin/core), the numDocs is not the sum of all segments. There are always some documents missing. I ran the index build several times, and the difference varied each time. In total there should be 8.3 million documents in the index, but roughly 100,000 entries are always missing. The numDocs matches the count I get from a *:* query via the Solr admin interface.
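The check described above amounts to comparing the sum of the per-segment row counts with the core's numDocs. A minimal sketch of that arithmetic follows; the function name and the sample numbers are hypothetical and only illustrate the comparison, they are not values from my index.

```python
def missing_documents(rows_per_segment, num_docs):
    """Return how many documents are missing from the index.

    rows_per_segment: rows reported as processed by each segment's status page
    num_docs: the numDocs value reported for the core
    """
    expected = sum(rows_per_segment)
    return expected - num_docs


if __name__ == "__main__":
    # Hypothetical counts for three segments, purely for illustration.
    rows = [500, 480, 520]
    print(missing_documents(rows, num_docs=1400))  # prints 100
```

A result of 0 means the index holds every row the segments reported; any positive number is the gap I am seeing.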

I turned on the infoStream and looked through the log entries, including the Tomcat logs, but did not find a clue. What am I doing wrong?

I am using 17 requestHandlers and my <indexDefaults> are configured as follows:

<useCompoundFile>false</useCompoundFile>
<mergeFactor>17</mergeFactor>
<ramBufferSizeMB>32</ramBufferSizeMB>
<maxBufferedDocs>50000</maxBufferedDocs>
<maxFieldLength>2000000</maxFieldLength>
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>
<lockType>native</lockType>

Any help is much appreciated. Thank you very much in advance!


1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 1:40 pm

    I found the problem, just had to RTFM…
    I tricked myself: the clean option defaults to TRUE, but I thought it defaulted to FALSE.
    So I only called the first URL with &clean=true and did not add &clean=false to the other URLs. As a result, every URL call cleaned the whole index. If I call the other URLs with &clean=false, the sum of all documents is correct.
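    As a rough sketch of the fix, the import URLs could be generated so that only the first segment cleans the index and every other segment passes clean=false explicitly. The host placeholder and the handler name pattern come from the question; numbering the segments 1 to 17 and the helper names are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://host/solr"  # "host" is the placeholder used in the question
SEGMENT_COUNT = 17             # the question mentions 17 requestHandlers


def import_url(segment, clean):
    """Build the full-import URL for one segment handler (hypothetical helper)."""
    params = urlencode({
        "command": "full-import",
        # Always state clean explicitly, because it defaults to true.
        "clean": "true" if clean else "false",
    })
    return f"{BASE_URL}/segment-{segment}?{params}"


def build_import_urls(segment_count=SEGMENT_COUNT):
    """Only the first segment cleans the index; all others keep existing documents."""
    return [import_url(n, clean=(n == 1)) for n in range(1, segment_count + 1)]


if __name__ == "__main__":
    for url in build_import_urls():
        print(url)
```

    The sketch only prints the URLs; how and in which order they are actually requested is left to whatever tool starts the imports.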


© 2021 The Archive Base. All Rights Reserved