The Archive Base


Editorial Team
Asked: June 6, 2026

Edit: this question is outdated. The jsonlite package flattens automatically.
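
A minimal sketch of what that flattening looks like, assuming the jsonlite package is installed. The JSON string here is a made-up stand-in for the GPS records, not the actual pastebin data. Note that a column is still only created for fields that occur in at least one record.

```r
library(jsonlite)

# Hypothetical records: the second one has no "l" object (no GPS signal).
json <- '[{"t": 1, "l": {"lat": 52.1, "lon": 4.3}}, {"t": 2}]'

# flatten = TRUE turns the nested "l" object into columns "l.lat" and "l.lon";
# records without "l" get NA in those columns.
df <- fromJSON(json, flatten = TRUE)
print(names(df))
```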

I am dealing with online data streams that use record-based encoding, usually JSON. The structure of each object (i.e. the names in the JSON) is known from the API documentation; however, values are mostly optional and not present in every record. Lists can contain further lists, and the structure is sometimes quite deep. Here is a fairly simple example of some GPS data: http://pastebin.com/raw.php?i=yz6z9t25. Note that in the lower rows the "l" object is missing because there was no GPS signal.

I am looking for an elegant way to flatten these objects into a dataframe. I am currently using something like this:

library(RJSONIO)
library(plyr)

obj <- fromJSON("http://pastebin.com/raw.php?i=yz6z9t25", simplifyWithNames=FALSE, simplify=FALSE)
flatdata <- lapply(obj$data, as.data.frame)
mydf <- rbind.fill(flatdata)

This does the job, but it is slow and somewhat error-prone. One problem with this approach is that I am not using my knowledge of the structure (the object names); instead, the structure is inferred from the data. This causes trouble when a certain property happens to be absent from every record: it will not appear in the dataframe at all, rather than appearing as a column of NA values. This can lead to issues downstream. For example, I need to process the location timestamp:

mydf$l.t <- structure(mydf$l.t/1000, class=c("POSIXct", "POSIXt"))

However, this results in an error for a dataset in which the l$t object is absent. Furthermore, both as.data.frame and rbind.fill make things quite slow, and the example dataset is a relatively small one. Any suggestions for a better implementation? A robust solution would always yield a dataframe with the same columns in the same order, where only the number of rows varies.
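
To illustrate the kind of robustness meant here, below is a minimal base R sketch. The helper name ensure_columns and the sample data are hypothetical and not part of any package. It adds every expected column that is missing as NA and fixes the column order, so the timestamp conversion no longer fails. Note that it also drops any column not listed in columns.

```r
# Hypothetical helper: add missing columns as NA and enforce a fixed column order.
ensure_columns <- function(df, columns) {
  for (nm in setdiff(columns, names(df))) {
    df[[nm]] <- rep(NA, nrow(df))
  }
  df[, columns, drop = FALSE]
}

# Made-up data frame in which no record had a location timestamp.
mydf <- data.frame(m = c("a", "b"), stringsAsFactors = FALSE)
mydf <- ensure_columns(mydf, c("m", "l.t"))

# The conversion now yields a column of NA timestamps instead of an error.
mydf$l.t <- as.POSIXct(as.numeric(mydf$l.t) / 1000,
                       origin = "1970-01-01", tz = "UTC")
print(str(mydf))
```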

Edit: below is a dataset with more metadata. It is larger and more deeply nested:

obj <- fromJSON("http://www.stat.ucla.edu/~jeroen/files/output.json", simplifyWithNames=FALSE, simplify=FALSE)
1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 10:45 am

    Just for clarity, I am adding a combination of Josh's and Joshua's solutions, which is the best I have come up with so far.

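    # Flatten a nested list into a flat list with dotted names.
    # enquote/eval stops unlist from coercing the values to a common type.
    flatlist <- function(mylist){
        lapply(rapply(mylist, enquote, how="unlist"), eval)
    }
    
    # Convert a list of records into a data frame with one row per record.
    # If `columns` is given, exactly those columns are returned, in that order.
    records2df <- function(recordlist, columns) {
        # No records: return an empty data frame that still has the requested columns.
        if(length(recordlist)==0 && !missing(columns)){
          return(as.data.frame(matrix(ncol=length(columns), nrow=0, dimnames=list(NULL,columns))))
        }
        un <- lapply(recordlist, flatlist)
        if(!missing(columns)){
            ns <- columns
        } else {
            # Otherwise infer the columns from the names seen in the data.
            ns <- unique(unlist(lapply(un, names)))
        }
        # Align every record to the same names, filling absent values with NA.
        un <- lapply(un, function(x) {
            y <- as.list(x)[ns]
            names(y) <- ns
            lapply(y, function(z) if(is.null(z)) NA else z)})
        # Build one vector per column.
        s <- lapply(ns, function(x) sapply(un, "[[", x))
        names(s) <- ns
        data.frame(s, stringsAsFactors=FALSE)
    }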
    

    The function is reasonably fast. I still think it should be possible to speed it up, though:

    obj <- fromJSON("http://www.stat.ucla.edu/~jeroen/files/output.json", simplifyWithNames=FALSE, simplify=FALSE)
    flatdata <- records2df(obj$data)
    

    It also lets you 'force' a fixed set of columns, which are filled with NA when they are absent from the data, although this does not speed things up much:

    flatdata <- records2df(obj$data, columns=c("m", "doesnotexist"))
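
The effect of forcing columns can be seen on a small in-memory example. The sketch below is a simplified variant of the same idea rather than the function above: it uses plain unlist, so it assumes every leaf value is numeric, and the records and column names are made up to resemble the GPS example.

```r
# Hypothetical records; the second one has no "l" object.
records <- list(
  list(t = 1000, l = list(lat = 52.1, lon = 4.3)),
  list(t = 2000)
)

# Column names known in advance, e.g. from the API documentation.
# "l.t" never occurs in these records but should still become a column.
columns <- c("t", "l.lat", "l.lon", "l.t")

# unlist() builds dotted names such as "l.lat" (assumes numeric leaves only).
flat <- lapply(records, unlist)

# Return the named value, or NA_real_ when the record lacks it.
pick <- function(x, nm) if (nm %in% names(x)) x[[nm]] else NA_real_

# One typed vector per expected column, in the requested order.
cols <- lapply(columns, function(nm) vapply(flat, pick, numeric(1), nm = nm))
names(cols) <- columns
df <- data.frame(cols, stringsAsFactors = FALSE)
print(df)
```

Because vapply checks the type of every value, a record with an unexpected non-numeric field raises an error instead of silently changing the column type.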
    