Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1033359
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T14:13:14+00:00 2026-05-16T14:13:14+00:00

I have a 100 GB text file, which is a BCP dump from a

  • 0

I have a 100 GB text file, which is a BCP dump from a database. When I try to import it with BULK INSERT, I get a cryptic error on line number 219506324. Before solving this issue I would like to see this line, but alas my favorite method of

import linecache
print linecache.getline(filename, linenumber)

is throwing a MemoryError. Interestingly the manual says that “This function will never throw an exception.” On this large file it throws one as I try to read line number 1, and I have about 6GB free RAM…

I would like to know what is the most elegant method to get to that unreachable line. Available tools are Python 2, Python 3 and C# 4 (Visual Studio 2010). Yes, I understand that I can always do something like

var line = 0;
using (var stream = new StreamReader(File.OpenRead(@"s:\source\transactions.dat")))
{
     while (++line < 219506324) stream.ReadLine(); //waste some cycles
     Console.WriteLine(stream.ReadLine());
}

Which would work, but I doubt it’s the most elegant way.

EDIT: I’m waiting to close this thread, because the hard drive containing the file is being used right now by another process. I’m going to test both suggested methods and report timings. Thank you all for your suggestions and comments.

The Results are in I implemented Gabes and Alexes methods to see which one was faster. If I’m doing anything wrong, do tell. I’m going for the 10 millionth line in my 100GB file using the method Gabe suggested and then using the method Alex suggested which i loosely translated into C#… The only thing I’m adding from myself, is first reading in a 300 MB file into memory just to clear the HDD cache.

const string file = @"x:\....dat"; // 100 GB file
const string otherFile = @"x:\....dat"; // 300 MB file
const int linenumber = 10000000;

ClearHDDCache(otherFile);
GabeMethod(file, linenumber);  //Gabe's method

ClearHDDCache(otherFile);
AlexMethod(file, linenumber);  //Alex's method

// Results
// Gabe's method: 8290 (ms)
// Alex's method: 13455 (ms)

The implementation of gabe’s method is as follows:

var gabe = new Stopwatch();
gabe.Start();
var data = File.ReadLines(file).ElementAt(linenumber - 1);
gabe.Stop();
Console.WriteLine("Gabe's method: {0} (ms)",  gabe.ElapsedMilliseconds);

While Alex’s method is slightly tricker:

var alex = new Stopwatch();
alex.Start();
const int buffersize = 100 * 1024; //bytes
var buffer = new byte[buffersize];
var counter = 0;
using (var filestream = File.OpenRead(file))
{
    while (true) // Cutting corners here...
    {
        filestream.Read(buffer, 0, buffersize);
        //At this point we could probably launch an async read into the next chunk...
        var linesread = buffer.Count(b => b == 10); //10 is ASCII linebreak.
        if (counter + linesread >= linenumber) break;
        counter += linesread;
    }
}
//The downside of this method is that we have to assume that the line fit into the buffer, or do something clever...er
var data = new ASCIIEncoding().GetString(buffer).Split('\n').ElementAt(linenumber - counter - 1);
alex.Stop();
Console.WriteLine("Alex's method: {0} (ms)", alex.ElapsedMilliseconds);

So unless Alex cares to comment I’ll mark Gabe’s solution as accepted.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T14:13:15+00:00Added an answer on May 16, 2026 at 2:13 pm

    Here’s my elegant version in C#:

    Console.Write(File.ReadLines(@"s:\source\transactions.dat").ElementAt(219506323));
    

    or more general:

    Console.Write(File.ReadLines(filename).ElementAt(linenumber - 1));
    

    Of course, you may want to show some context before and after the given line:

    Console.Write(string.Join("\n",
                  File.ReadLines(filename).Skip(linenumber - 5).Take(10)));
    

    or more fluently:

    File
    .ReadLines(filename)
    .Skip(linenumber - 5)
    .Take(10)
    .AsObservable()
    .Do(Console.WriteLine);
    

    BTW, the linecache module does not do anything clever with large files. It just reads the whole thing in, keeping it all in memory. The only exceptions it catches are I/O-related (can’t access file, file not found, etc.). Here’s the important part of the code:

        fp = open(fullname, 'rU')
        lines = fp.readlines()
        fp.close()
    

    In other words, it’s trying to fit the whole 100GB file into 6GB of RAM! What the manual should say is maybe “This function will never throw an exception if it can’t access the file.”

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a login.jsp page which contains a login form. Once logged in the
i have a input tag which is non editable, but some times i need
I have a project that adds elements to an AutoCad drawing. I noticed that
I have a script that appends some rows to a table. One of the
I have a new web app that is packaged as a WAR as part
I have several USB mass storage flash drives connected to a Ubuntu Linux computer
I have a snippet to create a 'Like' button for our news site: <iframe
I have found this example on StackOverflow: var people = new List<Person> { new
Let say I have the following desire, to simplify the IConvertible's to allow me
I want to have generalised email templates. Currently I have multiple email templates with

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.