The Archive Base

Editorial Team
Asked: May 17, 2026 (2026-05-17T20:35:22+00:00)

I’m looking for ideas on how you’d architect a system like so:

Records arrive in bulk (say, 100,000 at a time) from a variety of sources, but primarily from a flat text file.

This data needs to be loaded as-is into a SQL Server database table. However, various metrics need to be computed along the way. For example, one field is a certain 4-digit code. Only certain 4-digit codes are valid, and we need to track how many records arrived with bad 4-digit codes. There are other fields that need to be validated, and the list of fields could change in the future.

What is a good design for such a system? Is it best to have events BadFourDigitCodeEncountered and event processors OnBadFourDigitCodeEncountered or is there a cleaner design that is easily maintainable going forward?

(I don’t think it should matter, but I am using NHibernate as my ORM. That may be useful to know, since NHibernate has various points to hook into.)

I should mention: using C# .NET 4.0.

Thanks in advance,
Arlen

1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 8:35 pm

    For most high-capacity file-to-database processes, I would architect the system as an ETVL (extract-transform-validate-load) workflow.

    Extract: Open the file, get the rows of data and put them in a queue to be handled by the transform layer.

    Transform: Grab the raw record data, chop it up into fields you care about and create a new domain object with the field data. Then this object goes in a queue to be handled by the validate layer.
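
    As a rough illustration of the Transform stage, here is a minimal sketch that turns one raw line into a domain object. The question does not describe the file layout, so the pipe-delimited format, the InboundRecord class, and its field names are all assumptions made up for the example. The syntax is kept to what C# 4.0 supports.

    ```csharp
    using System;

    // Hypothetical domain object; the field names are illustrative only.
    public sealed class InboundRecord
    {
        public string RawLine { get; set; }
        public string AccountId { get; set; }
        public string FourDigitCode { get; set; }
    }

    public static class RecordTransformer
    {
        // Assumes a pipe-delimited layout: AccountId|FourDigitCode.
        // Swap in fixed-width slicing if the real file is positional.
        public static InboundRecord Transform(string rawLine)
        {
            if (rawLine == null) throw new ArgumentNullException("rawLine");

            string[] parts = rawLine.Split('|');
            return new InboundRecord
            {
                RawLine = rawLine, // kept so the row can still be stored as-is
                AccountId = parts.Length > 0 ? parts[0].Trim() : string.Empty,
                FourDigitCode = parts.Length > 1 ? parts[1].Trim() : string.Empty
            };
        }
    }
    ```

    Note that this sketch does not reject anything: a missing field simply becomes an empty string, so that deciding what counts as "bad" stays in the Validate stage.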

    Validate: Run each domain object through a series of business rules that confirm the record is in a valid, consistent state. Mark the result, either by routing valid objects to a “good” queue or by wrapping each object in a simple class that holds the object and a validity flag, and pass it to the last queue for the loader. You can calculate your metrics here per batch, or you can get them in near real time by writing “failed” records to another table with an error code describing what’s wrong, then querying the counts and causes at your leisure for one batch or many.
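
    One possible shape for the Validate stage, instead of one event and one handler per kind of failure, is a list of rule objects that each carry an error code. The sketch below is only an illustration: ValidationRule, BatchValidator, ExampleRules, the "BAD_FOUR_DIGIT_CODE" error code, and the whitelist of codes are all hypothetical names and values, not something taken from the question.

    ```csharp
    using System;
    using System.Collections.Generic;

    // One business rule: an error code plus a predicate that returns true for valid input.
    public sealed class ValidationRule<T>
    {
        public string ErrorCode { get; private set; }
        public Func<T, bool> IsValid { get; private set; }

        public ValidationRule(string errorCode, Func<T, bool> isValid)
        {
            ErrorCode = errorCode;
            IsValid = isValid;
        }
    }

    // Runs every rule against each record and tallies failures per error code.
    // Not thread-safe: use one instance per validator thread and merge the counts afterwards.
    public sealed class BatchValidator<T>
    {
        private readonly IList<ValidationRule<T>> _rules;
        private readonly Dictionary<string, int> _failureCounts = new Dictionary<string, int>();

        public BatchValidator(IList<ValidationRule<T>> rules)
        {
            _rules = rules;
        }

        public IDictionary<string, int> FailureCounts { get { return _failureCounts; } }

        // Returns the error codes the record failed; an empty list means it is valid.
        public IList<string> Validate(T record)
        {
            var errors = new List<string>();
            foreach (ValidationRule<T> rule in _rules)
            {
                if (rule.IsValid(record)) continue;
                errors.Add(rule.ErrorCode);
                int count;
                _failureCounts.TryGetValue(rule.ErrorCode, out count); // count stays 0 if absent
                _failureCounts[rule.ErrorCode] = count + 1;
            }
            return errors;
        }
    }

    public static class ExampleRules
    {
        // Hypothetical whitelist; real codes would come from configuration or a lookup table.
        private static readonly HashSet<string> ValidCodes = new HashSet<string> { "1001", "2002" };

        public static BatchValidator<string> ForFourDigitCodes()
        {
            return new BatchValidator<string>(new List<ValidationRule<string>>
            {
                new ValidationRule<string>("BAD_FOUR_DIGIT_CODE", code => ValidCodes.Contains(code))
            });
        }
    }
    ```

    With this layout, validating a new field means adding one more rule to the list, and the per-batch metrics are just the contents of FailureCounts once the batch has been processed.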

    Load: Persist the domain objects to your system’s database.

    Each of these stages should be a separate method or even a separate class, managed by a “supervisor” process. The beauty of this design is its scalability: if you end up with a lot of validation or transformation logic that slows the process down, you can easily modify the supervisor to multithread those stages, adding extra processor power where you need it. It’s also modular: if the file format changes, you only have to change the Transform stage (and maybe the Extract stage, if the change is radical enough). If the persistence mechanism changes, you just pop in a new Load layer. Depending on the complexity of your object graph, and thus the complexity of the Transform and Validate stages, I think you’ll find this design well able to handle a hundred thousand records at a time.
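
    To make the supervisor idea concrete, here is a hedged sketch built on BlockingCollection<T> and Task, both available in .NET 4.0. PipelineSupervisor and its parameters are hypothetical names, the stage delegates stand in for the real stage implementations, and the queues are left unbounded for simplicity, which assumes a batch of this size fits comfortably in memory.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public static class PipelineSupervisor
    {
        // Wires extract -> transform -> validate -> load through in-memory queues.
        // Only the validate stage is multithreaded here; output order is not preserved.
        public static void Run<TRecord>(
            IEnumerable<string> lines,
            Func<string, TRecord> transform,
            Func<TRecord, bool> isValid,
            Action<TRecord> load,
            Action<TRecord> reject,
            int validatorCount)
        {
            if (validatorCount < 1) throw new ArgumentOutOfRangeException("validatorCount");

            using (var rawQueue = new BlockingCollection<string>())
            using (var parsedQueue = new BlockingCollection<TRecord>())
            using (var checkedQueue = new BlockingCollection<KeyValuePair<TRecord, bool>>())
            {
                Task extract = Task.Factory.StartNew(() =>
                {
                    try { foreach (string line in lines) rawQueue.Add(line); }
                    finally { rawQueue.CompleteAdding(); } // always release the next stage
                }, TaskCreationOptions.LongRunning);

                Task transformTask = Task.Factory.StartNew(() =>
                {
                    try
                    {
                        foreach (string line in rawQueue.GetConsumingEnumerable())
                            parsedQueue.Add(transform(line));
                    }
                    finally { parsedQueue.CompleteAdding(); }
                }, TaskCreationOptions.LongRunning);

                var validators = new Task[validatorCount];
                for (int i = 0; i < validatorCount; i++)
                {
                    validators[i] = Task.Factory.StartNew(() =>
                    {
                        // Each result is wrapped with a validity flag for the loader.
                        foreach (TRecord record in parsedQueue.GetConsumingEnumerable())
                            checkedQueue.Add(new KeyValuePair<TRecord, bool>(record, isValid(record)));
                    }, TaskCreationOptions.LongRunning);
                }

                // Close the last queue only after every validator has finished (or failed).
                Task closeChecked = Task.Factory.ContinueWhenAll(
                    validators, done => checkedQueue.CompleteAdding());

                // A single loader thread, so load and reject need not be thread-safe.
                Task loadTask = Task.Factory.StartNew(() =>
                {
                    foreach (KeyValuePair<TRecord, bool> item in checkedQueue.GetConsumingEnumerable())
                    {
                        if (item.Value) load(item.Key); else reject(item.Key);
                    }
                }, TaskCreationOptions.LongRunning);

                var all = new List<Task>(validators) { extract, transformTask, closeChecked, loadTask };
                Task.WaitAll(all.ToArray()); // rethrows stage failures as an AggregateException
            }
        }
    }
    ```

    Raising validatorCount is the "add processor power where you need it" knob. The same pattern could be applied to the transform stage if parsing turns out to be the bottleneck.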

© 2021 The Archive Base. All Rights Reserved