The Archive Base

Editorial Team
Asked: May 10, 2026 (2026-05-10T17:02:41+00:00)

I have a large collection of data in an Excel file (and CSV files).

I have a large collection of data in an Excel file (and CSV files). The data needs to be loaded into a MySQL database. However, before it goes into the database it needs to be processed. For example: if column 1 is less than column 3, add 4 to column 2. There are quite a few rules that must be applied before the information is persisted.

What would be a good design to follow to accomplish this task (using Java)?

Additional notes

The process needs to be automated, in the sense that I don’t have to manually go in and alter the data. We’re talking about thousands of lines of data with 15 columns of information per line.

Currently, I have a sort of chain-of-responsibility design set up: one Java class for each rule. When one rule is done, it calls the following rule.
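The question does not show the existing classes, so the following is only a hypothetical sketch of what such a chain might look like. The names `Rule`, `TrimRule` and `AddFourRule` are invented for illustration, and the rows are assumed to be `String[]` values whose numeric columns hold integers.

```java
import java.util.Arrays;

// Hypothetical sketch of a chain-of-responsibility setup: one class per rule,
// each rule hands the row to the next rule when it is done.
class RuleChainSketch {

    /** One link in the chain. */
    static abstract class Rule {
        private Rule next;

        /** Sets the rule that runs after this one and returns it, so calls can be chained. */
        Rule then(Rule next) {
            this.next = next;
            return next;
        }

        /** Applies this rule, then passes the row on to the following rule, if any. */
        final void handle(String[] row) {
            apply(row);
            if (next != null) {
                next.handle(row);
            }
        }

        abstract void apply(String[] row);
    }

    /** Trims surrounding whitespace from every column. */
    static class TrimRule extends Rule {
        @Override
        void apply(String[] row) {
            for (int i = 0; i < row.length; i++) {
                row[i] = row[i].trim();
            }
        }
    }

    /** The example rule from the question: if column 1 < column 3, add 4 to column 2. */
    static class AddFourRule extends Rule {
        @Override
        void apply(String[] row) {
            int first = Integer.parseInt(row[0]);
            int third = Integer.parseInt(row[2]);
            if (first < third) {
                row[1] = String.valueOf(Integer.parseInt(row[1]) + 4);
            }
        }
    }

    /** Runs a copy of the row through the chain TrimRule -> AddFourRule. */
    static String[] process(String[] row) {
        Rule head = new TrimRule();
        head.then(new AddFourRule());
        String[] copy = row.clone();
        head.handle(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] result = process(new String[] {" 1", "10 ", "3"});
        System.out.println(Arrays.toString(result)); // prints [1, 14, 3]
    }
}
```

In this shape every rule knows about its successor, which is one reason a pipeline that owns the ordering can be easier to rearrange.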

More Info

Typically there are about 5000 rows per data sheet. Speed isn’t a huge concern because this large input doesn’t happen often.

I’ve considered Drools; however, I wasn’t sure the task was complicated enough for Drools.

Example rules:

  1. Currency values (data in specific columns) must not contain currency symbols.

  2. Category names must be uniform (e.g. book case = bookcase).

  3. Entry dates cannot be future dates.

  4. Text input can only contain [A-Z 0-9 \s].

etc.

Additionally, if any column of information is invalid, it needs to be reported when processing is complete (or maybe processing should stop).
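To make the four example rules and the error report concrete, here is a minimal Java sketch. Several things in it are assumptions rather than details from the question: the column layout (price, category, entry date, text), ISO `yyyy-MM-dd` dates, and "uniform" meaning lowercase with whitespace removed. All class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class RowRulesSketch {

    /** Rule 1: remove currency symbols (Unicode category Sc, e.g. $ or the euro sign). */
    static String stripCurrency(String value) {
        return value.replaceAll("\\p{Sc}", "").trim();
    }

    /** Rule 2 (assumed meaning of "uniform"): lowercase, no whitespace, so "Book Case" -> "bookcase". */
    static String normalizeCategory(String value) {
        return value.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
    }

    /** Rule 3: true if the value parses as an ISO date that is not after today. */
    static boolean isNotFuture(String value, LocalDate today) {
        try {
            return !LocalDate.parse(value).isAfter(today);
        } catch (DateTimeParseException e) {
            return false; // an unparseable date is reported as invalid
        }
    }

    /** Rule 4: text may only contain A-Z, 0-9 and whitespace. */
    static boolean isPlainText(String value) {
        return value.matches("[A-Z0-9\\s]*");
    }

    /**
     * Cleans the row in place and returns the problems found, so the caller can
     * report them all once processing is complete.
     * Assumed column layout: 0 = price, 1 = category, 2 = entry date, 3 = text.
     */
    static List<String> check(int rowNumber, String[] row, LocalDate today) {
        List<String> errors = new ArrayList<>();
        row[0] = stripCurrency(row[0]);
        row[1] = normalizeCategory(row[1]);
        if (!isNotFuture(row[2], today)) {
            errors.add("row " + rowNumber + ": bad or future entry date '" + row[2] + "'");
        }
        if (!isPlainText(row[3])) {
            errors.add("row " + rowNumber + ": invalid characters in '" + row[3] + "'");
        }
        return errors;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2020, 1, 1); // fixed date so the demo is repeatable
        String[] row = {"$12.50", "Book Case", "2020-06-01", "OAK SHELF 3"};
        List<String> errors = check(1, row, today);
        System.out.println(String.join(",", row));
        errors.forEach(System.out::println);
    }
}
```

Collecting the messages in a list supports the "report when processing is complete" variant; throwing on the first error instead would give the "stop processing" variant.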

My current solution works. However, I think there is room for improvement, so I’m looking for ideas as to how it can be improved and/or how other people have handled similar situations.

I’ve considered (very briefly) using Drools, but I wasn’t sure the work was complicated enough to take advantage of Drools.

1 Answer

  1. Added an answer on May 10, 2026 at 5:02 pm

    If I didn’t care to do this in one step (as Oli mentions), I’d probably use a pipes-and-filters design. Since your rules are relatively simple, I’d probably write a couple of delegate-based classes. For instance (C# code, but Java should be pretty similar; perhaps someone could translate?):

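    using System;
    using System.Collections.Generic;

    // A filter consumes the lines of a file and lazily yields the processed lines.
    interface IFilter {
       IEnumerable<string> Filter(IEnumerable<string> file);
    }

    // Passes through only the lines that satisfy the predicate.
    class PredicateFilter : IFilter {
       private readonly Predicate<string> predicate;

       public PredicateFilter(Predicate<string> predicate) { this.predicate = predicate; }

       public IEnumerable<string> Filter(IEnumerable<string> file) {
          foreach (string s in file) {
             if (this.predicate(s)) {
                yield return s;
             }
          }
       }
    }

    // Runs a side effect (e.g. logging or validation) on each line, then passes it on unchanged.
    class ActionFilter : IFilter {
       private readonly Action<string> action;

       public ActionFilter(Action<string> action) { this.action = action; }

       public IEnumerable<string> Filter(IEnumerable<string> file) {
          foreach (string s in file) {
             this.action(s);
             yield return s;
          }
       }
    }

    // Replaces each line with the result of the replace function.
    class ReplaceFilter : IFilter {
       private readonly Func<string, string> replace;

       public ReplaceFilter(Func<string, string> replace) { this.replace = replace; }

       public IEnumerable<string> Filter(IEnumerable<string> file) {
          foreach (string s in file) {
             yield return this.replace(s);
          }
       }
    }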
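Since the question asks for Java, here is one possible translation, offered as a sketch rather than the only way to do it. Java has no `yield return`, so this version assumes `java.util.stream.Stream` as the lazy sequence type; the names `Filter`, `keepIf`, `onEach` and `replaceWith` are invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FilterSketch {

    /** Counterpart of IFilter: turns a lazy stream of lines into another one. */
    interface Filter {
        Stream<String> filter(Stream<String> lines);

        /** Like PredicateFilter: keeps only the lines matching the predicate. */
        static Filter keepIf(Predicate<String> predicate) {
            return lines -> lines.filter(predicate);
        }

        /** Like ActionFilter: runs the action on each line and passes it on unchanged. */
        static Filter onEach(Consumer<String> action) {
            return lines -> lines.map(line -> {
                action.accept(line);
                return line;
            });
        }

        /** Like ReplaceFilter: replaces each line with the function's result. */
        static Filter replaceWith(UnaryOperator<String> replace) {
            return lines -> lines.map(replace);
        }
    }

    /** Drops blank lines, then strips dollar signs from the remaining ones. */
    static List<String> demo(List<String> lines) {
        Stream<String> current = lines.stream();
        current = Filter.keepIf(line -> !line.trim().isEmpty()).filter(current);
        current = Filter.replaceWith(line -> line.replace("$", "")).filter(current);
        return current.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("$5,book", "  ", "$7,desk")));
    }
}
```

Because `Filter` has a single abstract method, a rule can also be written directly as a lambda, which plays the role of the C# delegates.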

    From there, you could either use the delegate filters directly, or subclass them for the specifics. Then, register them with a Pipeline that will pass the lines through each filter in turn.
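The `Pipeline` itself is not shown in the answer, so the following Java sketch is only a guess at its shape. To stay self-contained it models a filter as a `UnaryOperator<Stream<String>>`; the `register` and `run` method names and the comma-separated line format are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineSketch {

    /** Holds filters in registration order and runs the lines through each in turn. */
    static class Pipeline {
        private final List<UnaryOperator<Stream<String>>> filters = new ArrayList<>();

        /** Adds a filter to the end of the pipeline; returns this for chaining. */
        Pipeline register(UnaryOperator<Stream<String>> filter) {
            filters.add(filter);
            return this;
        }

        /** Feeds the output of each filter into the next and collects the result. */
        List<String> run(List<String> lines) {
            Stream<String> current = lines.stream();
            for (UnaryOperator<Stream<String>> filter : filters) {
                current = filter.apply(current);
            }
            return current.collect(Collectors.toList());
        }
    }

    /** The question's example rule on a comma-separated line with integer columns. */
    static String addFourIfFirstLessThanThird(String line) {
        String[] cols = line.split(",");
        if (Integer.parseInt(cols[0]) < Integer.parseInt(cols[2])) {
            cols[1] = String.valueOf(Integer.parseInt(cols[1]) + 4);
        }
        return String.join(",", cols);
    }

    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline()
            .register(lines -> lines.filter(line -> !line.isEmpty()))
            .register(lines -> lines.map(PipelineSketch::addFourIfFirstLessThanThird));
        System.out.println(pipeline.run(Arrays.asList("1,10,3", "", "5,10,3")));
        // prints [1,14,3, 5,10,3]
    }
}
```

Compared with a chain where each rule calls the next, the ordering lives in one place here, so adding, removing or reordering rules does not touch the rule classes themselves.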
