The Archive Base

Editorial Team
Asked: May 15, 2026

I have to read a file from a particular line number and I know


I have to read a file from a particular line number and I know the line number say "n":
I have been thinking of two ways:

1.
for i in range(n):
    fname.readline()
k = fname.readline()
print(k)

2.
dictionary = {}
i = 0
for line in fname:
    dictionary[i] = line
    i = i + 1

but I want a faster alternative as I might have to perform this on different files 20000 times.
Are there any better alternatives?

Also, are there any other performance enhancements for simple looping, as my code has nested loops?

1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 5:53 am

    If the files aren’t too huge, the linecache module of the standard library is pretty good — it lets you very directly ask for the Nth line of such-and-such file.
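
    A minimal sketch of the linecache approach, assuming the line number is 1-based (which is what linecache.getline expects). The helper name nth_line is made up for illustration:

```python
import linecache


def nth_line(path, n):
    """Return line n (1-based) of path without its trailing newline.

    nth_line is a hypothetical helper name. linecache.getline itself
    returns an empty string when the line or the file does not exist.
    """
    return linecache.getline(path, n).rstrip('\n')
```

    Since linecache keeps every file it has read in memory, calling linecache.clearcache() between files is probably worthwhile when thousands of different files are processed in one run.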

    If the files are huge, I recommend something like (warning, untested code):

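    import io

    def readlinenum(filepath, n, BUFSIZ=65536):
      # n is 0-based; the result is bytes, since the file is read in binary.
      # Assumes no line (including its newline) is longer than BUFSIZ bytes.
      bufs = [b'', b'']  # bufs[0]: current block, bufs[1]: previous block
      previous_lines = lines_so_far = 0
      with open(filepath, 'rb') as f:
        while True:
          bufs[0] = f.read(BUFSIZ)
          if not bufs[0]:
            # EOF: line n can only be a last line with no trailing newline
            if n == lines_so_far and bufs[1] and not bufs[1].endswith(b'\n'):
              break
            raise ValueError('File %s has only %d lines, not %d' %
                             (filepath, lines_so_far, n))
          lines_this_block = bufs[0].count(b'\n')
          updated_lines_count = lines_so_far + lines_this_block
          if n < updated_lines_count:
              break
          previous_lines = lines_so_far
          lines_so_far = updated_lines_count
          bufs[1] = bufs[0]
        if n == lines_so_far:
          # line may be split between blocks
          buf = bufs[1] + bufs[0]
          delta = n - previous_lines
        else:  # normal case
          buf = bufs[0]
          delta = n - lines_so_far
        # parse only the one or two blocks that can contain line n
        for i, line in enumerate(io.BytesIO(buf)):
          if i == delta: break
        return line.rstrip()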
    

    The general idea is to read in the file as binary, in large blocks (at least as large as the longest possible line) — the processing (on Windows) from binary to “text” is costly on huge files — and use the fast .count method of strings on most blocks. At the end we can do the line parsing on a single block (two at most in the anomalous case where the line being sought spans block boundaries).
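
    The core of that idea, counting newlines block by block without ever splitting the data into lines, can be sketched on its own. The function name count_newlines and the default block size are assumptions chosen for illustration:

```python
def count_newlines(path, bufsize=65536):
    """Count newline bytes in path by scanning fixed-size binary blocks."""
    total = 0
    with open(path, 'rb') as f:
        while True:
            block = f.read(bufsize)
            if not block:
                # An empty read means end of file.
                return total
            # bytes.count scans the block in C, with no per-line objects.
            total += block.count(b'\n')
```

    Only one block is held in memory at a time, so the cost in RAM stays at bufsize bytes regardless of file size.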

    This kind of code requires careful testing and checking (which I haven’t performed in this case), being prone to off-by-one and other boundary errors, so I’d recommend it only for truly huge files: ones that would essentially bust memory if using linecache (which just sucks up the whole file into memory instead of working by blocks). On a typical modern machine with 4 GB of RAM, for example, I’d start thinking about such techniques for text files that are over a GB or two.

    Edit: a commenter does not believe that binary reading a file is much faster than the processing required by text mode (on Windows only). To show how wrong this is, let’s use the 'U' (“universal newlines”) option that forces the line-end processing to happen on Unix machines too (as I don’t have a Windows machine to run this on;-). Using the usual kjv.txt file:

    $ wc kjv.txt
      114150  821108 4834378 kjv.txt
    

    (4.8 MB, 114 Klines) — about 1/1000th of the kind of file sizes I was mentioning earlier:

    $ python -mtimeit 'f=open("kjv.txt", "rb")' 'f.seek(0); f.read()'
    100 loops, best of 3: 13.9 msec per loop
    $ python -mtimeit 'f=open("kjv.txt", "rU")' 'f.seek(0); f.read()'
    10 loops, best of 3: 39.2 msec per loop
    

    i.e., just about exactly a factor of 3 cost for the line-end processing (this is on an old-ish laptop, but the ratio should be pretty repeatable elsewhere, too).

    Reading by a loop on lines, of course, is even slower:

    $ python -mtimeit 'f=open("kjv.txt", "rU")' 'f.seek(0)' 'for x in f: pass'
    10 loops, best of 3: 54.6 msec per loop
    

    and using readline as the commenter mentioned (with less efficient buffering than directly looping on the file) is worst:

    $ python -mtimeit 'f=open("kjv.txt", "rU")' 'f.seek(0); x=1' 'while x: x=f.readline()'
    10 loops, best of 3: 81.1 msec per loop
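
    The same four strategies can be compared from a script with the timeit module. This is only a sketch under a few assumptions: Python 3.11 and later no longer accept the 'U' mode flag, so plain text mode stands in for it here (it translates newlines and also decodes bytes, so it measures a bit more than the 'rU' runs above); latin-1 is picked only because it never fails to decode; and all function names are made up for illustration.

```python
import timeit


def read_binary(path):
    with open(path, 'rb') as f:
        return f.read()


def read_text(path):
    # Text mode: newline translation plus decoding (latin-1 is an assumption).
    with open(path, 'r', encoding='latin-1') as f:
        return f.read()


def iterate_lines(path):
    count = 0
    with open(path, 'r', encoding='latin-1') as f:
        for _ in f:
            count += 1
    return count


def readline_loop(path):
    count = 0
    with open(path, 'r', encoding='latin-1') as f:
        while f.readline():
            count += 1
    return count


def compare(path, number=10):
    """Return {function name: best seconds per call} for each strategy."""
    results = {}
    for func in (read_binary, read_text, iterate_lines, readline_loop):
        timer = timeit.Timer(lambda: func(path))
        best = min(timer.repeat(repeat=3, number=number))
        results[func.__name__] = best / number
    return results
```

    Calling compare("kjv.txt") would return one best-of-three figure per strategy; the absolute numbers depend on the machine and Python version, so only the ratios are worth comparing.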
    

    If, as the question mentions, there are 20,000 files to read (say they’re all small-ish, on the order of this kjv.txt), the fastest approach (reading each file in binary mode in a single gulp) should take about 280 seconds (4-5 minutes), while the slowest one (based on readline) should take about 1600 seconds, almost half an hour: a pretty significant difference for many, I’d say most, actual applications.
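
    The extrapolation is simply the per-file time multiplied by the number of files. As a quick check using the timings quoted above (the dictionary keys and the helper name total_seconds are illustrative only):

```python
FILES = 20000

# Per-file timings in milliseconds, copied from the timeit runs above.
TIMINGS_MSEC = {
    'binary read': 13.9,
    'text read (rU)': 39.2,
    'loop over lines': 54.6,
    'readline loop': 81.1,
}


def total_seconds(msec_per_file, files=FILES):
    """Scale a per-file time in milliseconds to a total in seconds."""
    return msec_per_file * files / 1000.0


for name, msec in TIMINGS_MSEC.items():
    seconds = total_seconds(msec)
    print('%-16s %7.0f s  (%.1f min)' % (name, seconds, seconds / 60.0))
```

    This gives roughly 278 seconds for the binary read and roughly 1622 seconds (about 27 minutes) for the readline loop.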

© 2021 The Archive Base. All Rights Reserved