Editorial Team
Asked: June 10, 2026

I have a problem I would like some help with. I need to write a piece of R code that loads a CSV file. The CSV file contains one column named "Link", and each row i holds a link; the code needs to download the content behind that link and save it to a separate file. So far I have managed to find and modify the piece of code shown below. (Thanks to Christopher Gandrud and co-authors.)

library(foreign)
library(RCurl)

addresses <- read.csv(">>PATH TO CSV FILE<<")

for (i in addresses) full.text <- getURL(i)

text <- data.frame(full.text)

outpath <-">>PATH TO SPECIFIED FOLDER<<"

x <- 1:nrow(text)

for(i in x) {
  write(as.character(text[i,1]), file = paste(outpath,"/",i,".txt",sep=""))
}

The code actually works perfectly, BUT the problem is that I am overloading the server with requests: after it has downloaded the correct content from 100-150 links, the remaining files are just empty. I am sure this is the cause, because I have tested it many times with a decreasing number of links. If I download just 100 links at a time there is no problem; above 100 it starts to become one. Nonetheless, I need to add a couple of things to this code for it to become a good crawler for this particular task.

I have divided my problem into two parts, because solving the first one should work around the issue for now.

  1. I want to call Sys.sleep after every 100 downloads, so that the code fires requests for the first 100 links, then pauses for x seconds before it fires the next 100 requests, and so on.

  2. Once that has run for all rows/links in my dataset/CSV file, I need it to check each output file for two conditions: the file must not be empty, and it must not contain a certain error message that the server returns in some special cases. If a file fails either check, its filename (the link number) should be saved into a vector that I can work with from there.

Wow, this question suddenly got pretty long. I realize it is a big question and that I am asking a lot. It is for my master's thesis, which is not really about R programming, but I need to download the content from a lot of websites that I have been given access to. After that I have to analyze the content, which is what my thesis is about. Any suggestions or comments are welcome.


library(foreign)
library(RCurl)

addresses <- read.csv("~/Dropbox/Speciale/Mining/Input/Extract post - Dear Lego n(250).csv")

for (i in addresses) {
  if (i == 50) {
    print("Why wont this work?")
    Sys.sleep(10)
    print(i)
  } else {
    print(i)
  }
}

This prints the whole list of links that were loaded, with no "Why wont this work?" at i == 50, followed by:

Warning message:
In if (i == 100) { :
  the condition has length > 1 and only the first element will be used

full.text <- getURL(i)
text <- data.frame(full.text)
outpath <- "~/Dropbox/Speciale/Mining/Output"
x <- 1:nrow(text)
for (i in x) {
  write(as.character(text[i, 1]), file = paste(outpath, "/", i, ".txt", sep = ""))
}

Can anyone help me with this?
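
A note on the warning above, offered as a likely reading rather than a confirmed diagnosis: `for (i in addresses)` loops over the columns of the data frame, so `i` is the entire Link column at once rather than a single row number. That would explain why `i == 50` never matches and why `if` complains that the condition has length > 1 (R 4.2.0 and later turn this warning into an error). A minimal sketch of the difference, using a small made-up data frame (`demo` and its example.com URLs are placeholders, not the real input):

```r
# Hypothetical stand-in for the CSV: one column named "Link".
demo <- data.frame(Link = paste0("http://example.com/", 1:6),
                   stringsAsFactors = FALSE)

# Looping over the data frame itself visits each COLUMN once.
iterations <- 0
for (i in demo) {
  iterations <- iterations + 1
  first_len <- length(i)   # i is the whole Link column, not one link
}

# Looping over row numbers gives one scalar index per link.
pause_points <- integer(0)
for (i in seq_len(nrow(demo))) {
  link <- demo$Link[i]               # a single URL
  if (i %% 3 == 0) {                 # every 3rd row here; 100 in the real case
    pause_points <- c(pause_points, i)
    # Sys.sleep(10) would go here
  }
}
```

The modulo test `i %% 100 == 0` is one simple way to trigger a pause after every 100th link once `i` is a plain row number.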


1 Answer

Editorial Team
Added an answer on June 10, 2026 at 2:04 pm

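FINAL SOLUTION:

library(RCurl)
library(foreach)
library(foreign)

# Assumption: links is the one-column ("Link") data frame read from the input CSV.
links <- read.csv(">>PATH TO CSV FILE<<", stringsAsFactors = FALSE)

z <- nrow(links)
outpath <- "SPECIFIC PATH"

# Download each link in turn and write it to <outpath>/<i>.txt
foreach(i = 1:z) %do% {
  text <- getURL(links[i, ])
  write(as.character(text), file = paste(outpath, "/", i, ".txt", sep = ""))
}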
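
The solution above downloads every link but does not yet pause between batches or check the results. One way both requests from the question could be folded into the same loop is sketched below. The function name `download_with_pauses`, its arguments, and the `fetch` parameter are made up for illustration; `fetch` defaults to calling `RCurl::getURL`, and the batch size, pause length, and error text are placeholders to adjust.

```r
# Sketch: download links one by one, pause after every `batch_size` requests,
# and collect the indices of links whose content is empty or contains the
# server's error message. All names here are illustrative assumptions.
download_with_pauses <- function(links, outpath, batch_size = 100,
                                 pause_secs = 10,
                                 error_text = ">>SERVER ERROR MESSAGE<<",
                                 fetch = function(u) RCurl::getURL(u)) {
  failed <- integer(0)
  for (i in seq_along(links)) {
    # Treat a failed request like an empty response instead of stopping.
    content <- tryCatch(as.character(fetch(links[i])),
                        error = function(e) "")
    content <- paste(content, collapse = "\n")   # always exactly one string
    writeLines(content, file.path(outpath, paste0(i, ".txt")))

    is_empty  <- !nzchar(trimws(content))
    has_error <- grepl(error_text, content, fixed = TRUE)
    if (is_empty || has_error) failed <- c(failed, i)

    # Rest after each full batch, but not after the very last link.
    if (i %% batch_size == 0 && i < length(links)) Sys.sleep(pause_secs)
  }
  failed
}
```

With the data from the question this might be called as `failed <- download_with_pauses(as.character(links$Link), outpath)`; `failed` then holds the link numbers whose files are empty or contain the error text, ready to be retried.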
    