Here’s my code:
require "open-uri"
base_url = "http://en.wikipedia.org/wiki"
(1..5).each do |x|
  # sets up the url
  full_url = base_url + "/" + x.to_s
  # reads the url
  read_page = open(full_url).read
  # saves the contents to a file and closes it
  local_file = "my_copy_of-" + x.to_s + ".html"
  file = open(local_file,"w")
  file.write(read_page)
  file.close
  # open a file to store all entries in
  combined_numbers = open("numbers.html", "w")
  entrys = open(local_file, "r")
  combined_numbers.write(entrys.read)
  entrys.close
  combined_numbers.close
end
As you can see, it basically scrapes the contents of Wikipedia articles 1 through 5 and then attempts to combine them into a single file called numbers.html.
It does the first bit right, but when it gets to the second, it only seems to write the contents of the fifth article in the loop.
I can’t see where I’m going wrong, though. Any help?
You chose the wrong mode when opening your summary file. “w” overwrites existing files while “a” appends to existing files.
So open numbers.html with “a” instead of “w” to get your code working.
Otherwise, with each pass of the loop, the contents of numbers.html are overwritten with the current article.
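To see the difference between the two modes without hitting the network, here is a minimal sketch. The helper name write_each and the file names are made up for illustration; it just reopens the same file once per string, the way the loop in the question reopens numbers.html once per article.

```ruby
require "tmpdir"

# Hypothetical helper: reopens `path` in the given mode for every chunk,
# mimicking a file that is opened inside a loop, then returns the final contents.
def write_each(path, mode, chunks)
  chunks.each do |chunk|
    # File.open with a block closes the file automatically
    File.open(path, mode) { |f| f.write(chunk) }
  end
  File.read(path)
end

Dir.mktmpdir do |dir|
  # "w" truncates on every open, so only the last chunk survives
  puts write_each(File.join(dir, "w.txt"), "w", %w[one two three])  # prints "three"
  # "a" keeps what is already there and adds to the end
  puts write_each(File.join(dir, "a.txt"), "a", %w[one two three])  # prints "onetwothree"
end
```

One thing to keep in mind with “a”: if numbers.html is left over from an earlier run, the new articles are appended to the old contents, so you may want to delete the file before starting.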
Besides, I think you should use the contents of read_page to write to numbers.html instead of reading them back in from your freshly written file.
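Something along these lines might work as a sketch of that idea. The method name save_and_combine and its dir: parameter are my own invention, not anything from your code, and the block that fetches the page is passed in so the file handling can be tried without a network connection. It also opens the combined file only once, before the loop, which is an alternative to append mode.

```ruby
require "open-uri"

BASE_URL = "http://en.wikipedia.org/wiki"

# Hypothetical helper: saves each page to its own file and writes the same
# string into one combined file. The block supplies the page body for a given x.
def save_and_combine(ids, combined_path, dir: ".")
  # Opened once, outside the loop; "w" clears leftovers from earlier runs
  File.open(combined_path, "w") do |combined|
    ids.each do |x|
      read_page = yield(x)
      # individual copy, same naming scheme as in the question
      File.write(File.join(dir, "my_copy_of-#{x}.html"), read_page)
      # reuse the string already in memory instead of re-reading the file
      combined.write(read_page)
    end
  end
end

# Usage with open-uri (on Ruby 3.0 and later, URI.open is needed because
# plain open no longer accepts URLs):
# save_and_combine(1..5, "numbers.html") { |x| URI.open("#{BASE_URL}/#{x}").read }
```

Because the combined file is opened a single time, “w” is safe here and each run starts with a fresh numbers.html.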