The Archive Base

Editorial Team
Asked: May 26, 2026

I have multiple 1.5 GB CSV Files which contain billing information on multiple accounts


I have multiple 1.5 GB CSV Files which contain billing information on multiple accounts for clients from a service provider. I am trying to split the large CSV file into smaller chunks for processing and formatting the data inside it.

I do not want to roll my own CSV parser, but this is something I haven’t seen before, so please correct me if I am wrong. The 1.5GB files contain information in the following order: account information, account number, Bill Date, transactions, Ex gst, Inc gst, type and other lines.

Note that BillDate here means the date when the invoice was made, so occasionally we have more than two bill dates in the same CSV.

Bills are grouped by: Account Number > Bill Date > Transactions.

Some accounts have 10 lines of Transaction details, some have over 300,000 lines of Transaction details. A large 1.5GB CSV file contains around 8 million lines of data. I used UltraEdit before to cut and paste the data into smaller chunks, but this has become a very inefficient and time-consuming process.

I just want to load the large CSV file in my WinForm and click a button that splits it into chunks of, say, no more than 250,000 lines. Some bills are actually bigger than 250,000 lines, in which case they should be kept in one piece; accounts should not be split across multiple files, since they are ordered anyway. Also, I do not want accounts with multiple bill dates in one CSV; in that case the splitter can create an additional split.

I already have a WinForm application that does the formatting of the CSV in smaller files automatically in VS C# 2010.

Is it actually possible to process these very large CSV files? I have been trying to load the large files, but the OutOfMemoryException is an annoyance since it crashes every time and I don’t know how to fix it. I am open to suggestions.

Here is what I think I should be doing:

  • Load the large CSV file (but this fails with an OutOfMemoryException). How to solve this?
  • Group data by account name, bill date, and count the number of lines for each group.
  • Then create an array of integers.
  • Pass this array of integers to a file splitter process which will take these arrays and write the blocks of data.
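The grouping and counting steps above do not require the whole file in memory. A minimal sketch in Python, assuming there is no header row, that the account number and bill date sit in columns 1 and 2 (zero-based, a guess from the column order described above), and that the file is already ordered by account and bill date. The function name `count_group_lines` is hypothetical. The same line-by-line idea maps to reading with a `StreamReader` in C#.

```python
import csv
from itertools import groupby


def count_group_lines(path, account_col=1, date_col=2):
    """Return [(account, bill_date, line_count), ...] in file order.

    Rows are read one at a time, so memory use does not grow with file size.
    The column positions are assumptions; adjust them to the real layout.
    """
    counts = []
    with open(path, newline="") as f:
        rows = (r for r in csv.reader(f) if r)  # skip blank lines
        # groupby only merges *adjacent* rows with the same key, which is
        # enough for a file already ordered by account and bill date.
        for (account, bill_date), group in groupby(
            rows, key=lambda r: (r[account_col], r[date_col])
        ):
            counts.append((account, bill_date, sum(1 for _ in group)))
    return counts
```

The returned counts play the role of the "array of integers" in the plan: they say how many lines each account and bill date block occupies, which is what a splitter needs to decide where to cut.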

Any suggestions will be greatly appreciated.

Thanks.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 11:28 am

    Yeah, about that… running out of memory is going to happen with files that are HUGE. You need to take your situation seriously.

    As with most problems, break everything into steps.

    I have had a similar type of situation before (large data file in CSV format, need to process, etc).

    What I did:

    Make step 1 of your program suite, or whatever, something that merely cuts your huge file into many smaller files. I have broken 5GB zipped-up, PGP-encrypted files (after decryption… that’s another headache) into many smaller pieces. You can do something simple like numbering them sequentially (i.e. 001, 002, 003…)
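A rough sketch of such a splitting step in Python, not the code described in this answer. It assumes no header row, account number and bill date in columns 1 and 2 (zero-based), and input already ordered by account and bill date. It treats each account and bill date block as a unit that is never cut, so a block larger than the limit ends up alone in its own part. The names `split_csv` and `part_001.csv` are made up for illustration.

```python
import csv
import os
from itertools import groupby


def _groups(path, account_col, date_col):
    """Yield (key, row_iterator) for each run of adjacent rows sharing a key."""
    with open(path, newline="") as f:
        rows = (r for r in csv.reader(f) if r)  # skip blank lines
        yield from groupby(rows, key=lambda r: (r[account_col], r[date_col]))


def split_csv(path, out_dir, max_lines=250_000, account_col=1, date_col=2):
    """Write numbered part files and return their paths in order."""
    os.makedirs(out_dir, exist_ok=True)

    # Pass 1: count lines per block; only the counts are kept in memory.
    sizes = [sum(1 for _ in rows) for _, rows in _groups(path, account_col, date_col)]

    # Pass 2: stream rows out, starting a new part when a block would not fit.
    parts, out, writer, lines_in_part = [], None, None, 0
    try:
        for (_, rows), size in zip(_groups(path, account_col, date_col), sizes):
            if out is None or lines_in_part + size > max_lines:
                if out is not None:
                    out.close()
                name = os.path.join(out_dir, "part_%03d.csv" % (len(parts) + 1))
                out = open(name, "w", newline="")
                # csv.writer may quote fields differently from the input file,
                # but the parsed values stay the same.
                writer = csv.writer(out)
                parts.append(name)
                lines_in_part = 0
            writer.writerows(rows)
            lines_in_part += size
    finally:
        if out is not None:
            out.close()
    return parts
```

Reading the file twice costs extra I/O, but it means no block, however large, has to be buffered in memory before deciding which part it belongs to.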

    Then make an app to do the INPUT processing. No real business logic here. I hate file I/O with a passion when it comes to business logic, and I love the warm fuzzy feeling of data being in a nice SQL Server DB. That’s just me. I created a thread pool and had N threads (like 5, you decide how much your machine can handle) read those .csv part files you created.

    Each thread reads one file. One to one relationship. Because it is file I/O, make sure you don’t have too many running at the same time. Each thread does the same basic operation: reads in data, puts it in a basic structure for the db (table format), does lots of inserts, then ends. I used LINQ to SQL because everything is strongly typed and whatnot, but to each their own. The better the db design, the easier it is to do the logic later.
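To make the one-thread-per-file idea concrete, here is a hedged Python sketch that uses SQLite as a stand-in for the SQL Server database and LINQ to SQL layer mentioned above. The table `billing_rows`, its columns, the column positions, and the function names are all assumptions for illustration. SQLite allows only one writer at a time, so each worker takes the write lock with `BEGIN IMMEDIATE` and the others wait their turn.

```python
import csv
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def load_part(db_path, part_path, account_col=1, date_col=2):
    """Insert every row of one part file; returns the number of rows inserted."""
    # Each thread opens its own connection; timeout makes it wait for the lock.
    conn = sqlite3.connect(db_path, timeout=60, isolation_level=None)
    try:
        with open(part_path, newline="") as f:
            params = (
                (part_path, r[account_col], r[date_col], json.dumps(r))
                for r in csv.reader(f) if r
            )
            conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
            try:
                cur = conn.executemany(
                    "INSERT INTO billing_rows VALUES (?, ?, ?, ?)", params
                )
                conn.execute("COMMIT")
            except Exception:
                conn.execute("ROLLBACK")
                raise
            return cur.rowcount
    finally:
        conn.close()


def load_all(db_path, part_paths, workers=5):
    """Load all part files with a small thread pool; returns total rows inserted."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS billing_rows "
        "(source_file TEXT, account TEXT, bill_date TEXT, fields TEXT)"
    )
    conn.commit()
    conn.close()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per file: the one-to-one relationship described above.
        return sum(pool.map(lambda p: load_part(db_path, p), part_paths))
```

With SQLite the threads mostly take turns writing, so the gain from the pool is limited here. A server database would typically let the inserts overlap more, which is closer to the setup described in this answer.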

    After all threads have finished executing, you have all the data from the original CSV in the database. Now you can do all your business logic and do whatever from there. Not the prettiest solution, but I was forced into developing that given my situation/data flow/size/requirements. You might go with something completely different. Just sharing I guess.
