The Archive Base


Editorial Team
Asked: June 11, 2026 (2026-06-11T07:05:23+00:00)

When I read a text file into memory it brings my text in with

When I read a text file into memory it brings my text in with ‘\n’ at the end due to the new lines.

["Hello\n", "my\n", "name\n", "is\n", "John\n"] 

Here is how I am reading the text file

array = File.readlines('text_file.txt')

I need to do a lot of processing on this text array, so I’m wondering which is better performance-wise: removing the “\n” when I first create the array, or removing it later, when I process each element with a regex.

I wrote some (admittedly bad) test code to remove the “\n”

array = []
File.open('text_file.txt', "r").each_line do |line|
    data = line.split(/\n/)
    array << data
end
array.flatten!

Is there a better way to do this if I should remove the “\n” when I first create the array?

If I wanted to read the file into a Set instead (for performance), is there a method similar to readlines to do that?

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 7:05 am

    You need to run a benchmark, using Ruby’s built-in Benchmark module, to figure out which choice is fastest for you.

    However, from experience, I’ve found that “slurping” the file, i.e., reading it all in at once, is not any faster than using a loop with IO.foreach or File.foreach. This is because Ruby and the underlying OS buffer the file as the reads occur, so your loop runs from memory, not directly from disk. Unlike split, foreach will not strip the line terminators for you, so you’ll need to call chomp, which returns a stripped copy, or chomp! if you want to mutate the line that was read in:

    File.foreach('/path/to/file') do |li|
      puts li.chomp
    end
    

    or

    File.foreach('/path/to/file') do |li|
      li.chomp!
      puts li
    end
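
    The difference between the two forms above is easy to miss, so here is a minimal sketch of it. The variable names are only for illustration; the one thing to watch is that chomp! returns nil when there is nothing to strip.

```ruby
# chomp returns a new string and leaves the receiver untouched.
line = "Hello\n"
stripped = line.chomp            # line is still "Hello\n"

# chomp! modifies the receiver in place. It returns nil when there was
# nothing to remove, so avoid chaining further calls onto its result.
mutable = "my\n".dup             # dup so the string is safe to mutate
first_result = mutable.chomp!    # mutable is now "my"
second_result = mutable.chomp!   # nothing left to strip

puts stripped
puts mutable
p second_result
```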
    

    Also, slurping doesn’t scale: you could end up trying to read a file bigger than memory, bringing your machine to its knees. Reading line by line avoids that, because only the current line has to be held in memory.
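
    As a rough illustration of the line-by-line approach, the sketch below counts matching lines without ever building an array of the whole file. The helper name count_matching, the pattern, and the temporary sample file are all made up for the example, and String#match? assumes Ruby 2.4 or newer.

```ruby
require 'tempfile'

# Hypothetical helper: counts the lines that match a pattern while
# holding only one line in memory at a time.
def count_matching(path, pattern)
  count = 0
  File.foreach(path) do |line|
    count += 1 if line.chomp.match?(pattern)
  end
  count
end

# Small throwaway file standing in for a large input.
sample = Tempfile.new('lines')
sample.write("Hello\nmy\nname\nis\nJohn\n")
sample.close

matches = count_matching(sample.path, /\A[A-Z]/)
puts matches
sample.unlink
```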


    Here are some performance numbers:

    #!/usr/bin/env ruby
    
    require 'benchmark' # standard library; provides Benchmark.bm
    
    FILENAME = 'test.txt'
    LOOPS = 1
    
    puts "Ruby Version: #{RUBY_VERSION}"
    puts "Filesize being read: #{File.size(FILENAME)}"
    puts "Lines in file: #{`wc -l #{FILENAME}`.split.first}"
    
    Benchmark.bm(20) do |x|
      x.report('read.split')           { LOOPS.times { File.read(FILENAME).split("\n") }}
      x.report('read.lines.chomp')     { LOOPS.times { File.read(FILENAME).lines.map(&:chomp) }}
      x.report('readlines.map.chomp1') { LOOPS.times { File.readlines(FILENAME).map(&:chomp) }}
      x.report('readlines.map.chomp2') { LOOPS.times { File.readlines(FILENAME).map{ |s| s.chomp } }}
      x.report('foreach.map.chomp1')   { LOOPS.times { File.foreach(FILENAME).map(&:chomp) }}
      x.report('foreach.map.chomp2')   { LOOPS.times { File.foreach(FILENAME).map{ |s| s.chomp } }}
    end
    

    And the results:

    Ruby Version: 1.9.3
    Filesize being read: 42026131
    Lines in file: 465440
                               user     system      total        real
    read.split             0.150000   0.060000   0.210000 (  0.213365)
    read.lines.chomp       0.470000   0.070000   0.540000 (  0.541266)
    readlines.map.chomp1   0.450000   0.090000   0.540000 (  0.535465)
    readlines.map.chomp2   0.550000   0.060000   0.610000 (  0.616674)
    foreach.map.chomp1     0.580000   0.060000   0.640000 (  0.641563)
    foreach.map.chomp2     0.620000   0.050000   0.670000 (  0.662912)
    

    On today’s machines a 42MB file can be read into RAM pretty safely. However, I have seen files a lot bigger than that, which won’t fit into the memory of some of our production hosts. While foreach is slower, it also won’t bring a machine to its knees by using up all available memory, provided each line is processed as it is read rather than collected into an array.

    On Ruby 1.9.3, using map(&:chomp) instead of the older form, map { |s| s.chomp }, is measurably faster: about 0.54s versus 0.61s for readlines in the results above. That wasn’t true with older versions of Ruby, so caveat emptor.
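
    Both map forms produce the same array; only their speed differs. As an aside that the benchmark above does not cover, Rubies from 2.4 onward also accept a chomp: true keyword on readlines and foreach, which strips the terminator while reading. The sketch below assumes such a Ruby and uses a temporary sample file; it makes no claim about relative speed.

```ruby
require 'tempfile'

# Throwaway sample file with the contents from the question.
sample = Tempfile.new('lines')
sample.write("Hello\nmy\nname\nis\nJohn\n")
sample.close

# Symbol-to-proc form from the benchmark.
via_map = File.readlines(sample.path).map(&:chomp)

# Assumes Ruby 2.4+: chomp: true strips the line terminator on read.
via_keyword = File.readlines(sample.path, chomp: true)

p via_map
p via_keyword
sample.unlink
```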

    Also, note that all of the above processed the data in less than one second on my several-year-old Mac Pro. All in all, I’d say that worrying about the load speed is premature optimization, and the real problem will be what is done after the data is loaded.
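
    The question also asked about reading the file into a Set, which the answer above does not address. There is no readlines equivalent that returns a Set as far as I know, but one possible sketch is to stream the lines with foreach and add each chomped line to a Set. The sample file is made up for the example; whether a Set actually helps depends on the processing done afterwards, for instance on whether it relies on membership lookups.

```ruby
require 'set'
require 'tempfile'

# Throwaway sample file containing one duplicate line.
sample = Tempfile.new('lines')
sample.write("Hello\nmy\nHello\nJohn\n")
sample.close

# Stream the file and collect chomped lines into a Set without an
# intermediate array. Duplicate lines collapse into a single entry.
words = File.foreach(sample.path).each_with_object(Set.new) do |line, set|
  set << line.chomp
end

puts words.size
puts words.include?('John')
sample.unlink
```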

© 2021 The Archive Base. All Rights Reserved