The Archive Base

Editorial Team
Asked: May 18, 2026


The company I work for produces a content management system (CMS) with various add-ons for publishing, e-commerce, online printing, etc. We are now in the process of adding a “reporting module” and I need to investigate which strategy we should follow. The “reporting module” is otherwise known as Business Intelligence, or BI.

The module is supposed to track item downloads and executed searches and produce various reports from them. Actually, what kind of data is being churned is not that important, as in the long term we might want to push in whatever we think is needed and get a report out of it.

Roughly speaking, we have two options.

Option 1 is to write a solution based on Apache Solr (specifically, using https://issues.apache.org/jira/browse/SOLR-236). Pros of this approach:

  • free / open source / good quality
  • we use Solr/Lucene elsewhere so we know the domain quite well
  • total flexibility over what is being indexed as we could take incoming data (in XML format), push it through XSLT and feed it to Solr
  • total flexibility over how search results are shown. As with the previous point, we could have a custom XSLT search template and show results in any format we think is necessary
  • our frontend developers are proficient in XSLT so fitting this mechanism for a different customer should be relatively easy
  • Solr offers realtime / full text / faceted search, which are absolutely necessary for us. A quick prototype (based on Solr, 1M records) was able to deliver search results in 55ms. Our estimated maximum is about 1bn rows (this isn’t a lot for a typical BI app) and if worst comes to worst, we can always look at SolrCloud, etc.
  • there are companies doing very similar things using Solr (Honeycomb Lexicon, for example)
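The XML-through-XSLT-into-Solr step from the list above could look roughly like the sketch below. It is only an illustration under assumptions: the `<event>` input format, the field names and the `EventToSolrXml` class are made up for this example, and only the `<add><doc><field name="...">` output shape is Solr's XML update format. Posting the result to Solr is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EventToSolrXml {
    // Hypothetical stylesheet: maps an assumed <event> element to a Solr <add><doc> message.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/event'>"
      + "<add><doc>"
      + "<field name='id'><xsl:value-of select='@id'/></field>"
      + "<field name='type'><xsl:value-of select='type'/></field>"
      + "<field name='item'><xsl:value-of select='item'/></field>"
      + "</doc></add>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet over one incoming event and returns the Solr update XML. */
    static String toSolrXml(String eventXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(eventXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String event = "<event id='42'><type>download</type><item>brochure.pdf</item></event>";
        System.out.println(toSolrXml(event));
    }
}
```

Swapping the stylesheet per customer is what would give the flexibility described above; the Java side would stay the same.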

Cons of this approach:

  • SOLR-236 might or might not be stable; moreover, it’s not yet clear when/if it will be released as part of an official release
  • there would possibly be some stuff we’d have to write to get some BI-specific features working. This sounds a bit like reinventing the wheel
  • the biggest problem is that we don’t know what we might need in the future (such as integration with some piece of BI software, export to Excel, etc.)

Option 2 is to do an integration with some free or commercial piece of BI software. So far I have looked at Wabit and will have a look at QlikView, possibly others. Pros of this approach:

  • no need to reinvent the wheel, software is (hopefully) tried and tested
  • would save us time we could spend solving problems we specialize in

Cons:

  • as we are a Java shop and our solution is cross-platform, we’d have to eliminate a lot of the options which are on the market
  • I am not sure how flexible BI software can be. It would take time to go through some BI offerings to see if they can do flexible indexing, real time / full text search, fully customizable results, etc.
  • I was told that open source BI offerings are not mature enough whereas commercial BIs (SAP, others) cost fortunes; their licenses start from tens of thousands of pounds/dollars. While I am not against a commercial choice per se, it will add to the overall price, which can easily become just too big
  • not sure how well BI is made to work with schema-less data

I am definitely not the best candidate to find the most appropriate integration option on the market (mainly because of my lack of knowledge in the BI area); however, a decision needs to be made fast.

Has anybody been in a similar situation and could advise on which route to take, or even better – advise on possible pros/cons of the option #2? The biggest problem here is that I don’t know what I don’t know 😉


1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 9:43 am

    I have spent some time playing with both QlikView and Wabit and, I have to say, I am quite disappointed.

    I had an expectation that the whole BI industry actually has some science under it but from what I found this is just a mere buzzword. This MSDN article was actually an eye opener. The whole business of BI consists of taking data from well-normalized schemas (they call it OLTP), putting it into less-normalized schemas (OLAP, snowflake- or star-type) and creating indices for every aspect you want (industry jargon for this is data cube). The rest is just some scripting to get the pretty graphs.
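To make the "data cube is just an index" point concrete, here is a minimal sketch. The `Download` fact row, its two dimensions and the `TinyCube` class are hypothetical; a real cube would cover combinations of dimensions and more measures than a count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TinyCube {
    /** Hypothetical fact row: one download event with two dimensions. */
    static final class Download {
        final String item;
        final String month;

        Download(String item, String month) {
            this.item = item;
            this.month = month;
        }
    }

    /** Pre-aggregates the facts along one dimension, which is effectively a count index. */
    static Map<String, Long> rollUp(List<Download> facts, Function<Download, String> dimension) {
        return facts.stream()
                .collect(Collectors.groupingBy(dimension, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Download> facts = Arrays.asList(
                new Download("brochure.pdf", "2010-01"),
                new Download("brochure.pdf", "2010-02"),
                new Download("pricelist.pdf", "2010-02"));
        System.out.println(rollUp(facts, d -> d.item));  // {brochure.pdf=2, pricelist.pdf=1}
        System.out.println(rollUp(facts, d -> d.month)); // {2010-01=1, 2010-02=2}
    }
}
```

Counting per dimension value like this is also roughly what a faceted search returns, which is why the two approaches overlap so much.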

    OK, I know I am oversimplifying things here. I know I might have missed many different aspects (nice reports? export to Excel? predictions?), but from a computer science point of view I simply cannot see anything beyond a database index here.

    I was told that some BI tools support compression. Lucene supports that, too. I was told that some BI tools are capable of keeping the whole index in memory. For that there is a Lucene cache.

    Speaking of the two candidates (Wabit and QlikView) – the first is simply immature (I got dozens of exceptions when trying to step outside of what was suggested in their demo), whereas the other only works under Windows (not very nice, but I could live with that) and the integration would likely require me to write some VBScript (yuck!). I had to spend a couple of hours on the QlikView forums just to get a simple date range control working, and failed because the Personal Edition I had did not support the downloadable demo projects available on their site. Don’t get me wrong, they’re both good tools for what they have been built for, but I simply don’t see any point in integrating with them as I wouldn’t gain much.

    To address the (arguable) immaturity of Solr I will define an abstract API so that I can move all the data to a database which supports full-text queries if anything goes wrong. And if worst comes to worst, I can always write stuff on top of Solr/Lucene if I need to.
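The abstract API mentioned above could be as small as the sketch below. All names here (`ReportStore`, `InMemoryStore`, `index`, `search`) are hypothetical; the in-memory class only stands in for a Solr-backed or database-backed implementation so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ReportStoreSketch {
    /** Hypothetical storage-neutral API: callers never see Solr or SQL directly. */
    interface ReportStore {
        void index(String id, String text);

        List<String> search(String term);
    }

    /** Stand-in implementation; a Solr or database variant would implement the same interface. */
    static final class InMemoryStore implements ReportStore {
        private final Map<String, String> docs = new LinkedHashMap<>();

        @Override
        public void index(String id, String text) {
            docs.put(id, text.toLowerCase(Locale.ROOT));
        }

        @Override
        public List<String> search(String term) {
            String needle = term.toLowerCase(Locale.ROOT);
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, String> e : docs.entrySet()) {
                // Naive substring match, standing in for a real full-text query.
                if (e.getValue().contains(needle)) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        ReportStore store = new InMemoryStore();
        store.index("1", "Downloaded brochure.pdf");
        store.index("2", "Searched for pricing");
        System.out.println(store.search("brochure")); // [1]
    }
}
```

The reporting code would depend only on the interface, so replacing the backend later would mean writing one new implementation rather than touching every report.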


© 2021 The Archive Base. All Rights Reserved