The Archive Base

Editorial Team
Asked: May 26, 2026


I have 1000 CSV files from dfr.jstor.org with two columns, KEYTERMS and WEIGHT. The number of rows varies from file to file. Here’s a snippet of one CSV file:

KEYTERMS  WEIGHT
canoe     1
archaic   0.273
pinus     0.191
florida   0.164

I want to use R to get the KEYTERMS column from each CSV file and merge it into a single data frame like this:

KEYTERMS_CSVFILENAME1 KEYTERMS_CSVFILENAME2 KEYTERMS_CSVFILENAME3
thwart                newsom                period 
dugout                site                  cypress 
sigma                 date                  hartmann 
precontact            NA                    florida 
orange                NA                    NA

Where CSVFILENAME1 is the name of the CSV file where those keywords came from and NA is an empty cell.

I think my problem is very similar to this one, with the difference that my columns vary in length. This may also be relevant to a solution, and this looks right on topic, but I need a bit of hand-holding to adapt it to my situation. Thanks in advance!


1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 5:32 pm

    To save a LITTLE memory/time you could modify the solution from @Ben Bolker like this:

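    # csvnames: character vector of CSV file paths, as in the original solution
    # Read only the first column (KEYTERMS); "NULL" skips the WEIGHT column
    datlist <- lapply(csvnames, read.csv, colClasses = c("character", "NULL"))
    # Row indices 1..n, where n is the row count of the longest file
    rowseq <- seq_len(max(vapply(datlist, nrow, integer(1))))
    # Indexing past the end of a character vector pads with NA
    keylist <- lapply(datlist, function(x) x[[1]][rowseq])
    names(keylist) <- paste("KEYTERMS", csvnames, sep = "_")
    # do.call(cbind, keylist)  # alternative: returns a character matrix
    do.call(data.frame, keylist)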
    

    I changed it so that only the first column is read, and simplified the NA padding by noting that indexing a character vector beyond its length pads with NA automatically.
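    Both points can be checked on a single file. The following is a minimal sketch, assuming a small made-up CSV written to a temporary file; the names `path`, `dat` and `padded` are hypothetical and only serve the illustration.

```r
# Write a small hypothetical two-column CSV to a temporary file
path <- tempfile(fileext = ".csv")
write.csv(data.frame(KEYTERMS = c("canoe", "archaic", "pinus"),
                     WEIGHT = c(1, 0.273, 0.191)),
          path, row.names = FALSE)

# "NULL" in colClasses tells read.csv to skip that column entirely
dat <- read.csv(path, colClasses = c("character", "NULL"))
print(names(dat))  # only the KEYTERMS column was read

# Indexing past the end of the vector yields NA for the missing positions
padded <- dat[[1]][seq_len(5)]
print(padded)
```

    Here positions 4 and 5 do not exist in the three-element vector, so they come back as NA without any explicit padding step.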

    If you kept the old way of padding, you should at least pad with NA_character_ instead of NA to avoid unnecessary coercion.
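    As a rough illustration of the coercion point, assuming the older approach appended NAs explicitly: a bare `NA` is logical, while `NA_character_` already has the right type. The helper `pad_to` below is hypothetical, not part of the original solution, and it assumes `n` is at least the length of `x`.

```r
# A bare NA is logical; NA_character_ is already a character value
print(class(NA))
print(class(NA_character_))

# Hypothetical helper in the explicit-padding style (assumes n >= length(x))
pad_to <- function(x, n) c(x, rep(NA_character_, n - length(x)))

padded <- pad_to(c("canoe", "archaic"), 4)
print(padded)
```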

    I also index the KEYTERMS column by number instead of name (since there should be only one). I also changed sapply to vapply because I like it better 🙂 – it actually is faster too.
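    Apart from speed, one reason to prefer vapply is that it enforces the declared result type. A small sketch with made-up data frames (`dfs` and the other names are hypothetical):

```r
# Two hypothetical data frames of different lengths
dfs <- list(data.frame(k = c("a", "b", "c")), data.frame(k = "d"))

# vapply checks that every result is a single integer, which nrow() returns
n_rows <- vapply(dfs, nrow, integer(1))
print(n_rows)

# On empty input sapply falls back to an empty list,
# while vapply still returns a zero-length integer vector
empty_s <- sapply(list(), nrow)
empty_v <- vapply(list(), nrow, integer(1))
print(class(empty_s))
print(class(empty_v))
```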

    Finally you said you wanted a data.frame. The last line produces that instead of a matrix.
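    The difference between the two final lines can be seen on a small list. This is a sketch with hypothetical, already padded keyword vectors; `stringsAsFactors = FALSE` is added here as an assumption so the columns stay character on R versions before 4.0, where the default was to convert strings to factors.

```r
# Hypothetical padded keyword vectors, named in the style used in the answer
keylist <- list(KEYTERMS_file1 = c("thwart", "dugout", "sigma"),
                KEYTERMS_file2 = c("newsom", "site", NA))

# cbind gives a character matrix
as_matrix <- do.call(cbind, keylist)
print(class(as_matrix))

# data.frame gives a data frame with one column per file
as_df <- do.call(data.frame, c(keylist, stringsAsFactors = FALSE))
print(class(as_df))
print(as_df)
```

    If the file names contain characters that are not valid in R names, such as path separators, data.frame will alter them by default; passing `check.names = FALSE` in the same way should keep them as given.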
