I’m working on trying to transform a large dataset into the required formats for

Asked: June 12, 2026

I’m trying to transform a large dataset into the format required for analysis with the flowstrates package.

What I currently have is a large file (600k trips) with origin and destination points.

The format is roughly like this:

tripID  Month  start_pt  end_pt
1       June   1         3
2       June   1         3
3       July   1         5
4       July   1         7
5       July   1         7

What I need to generate is a file with trip counts per unit of time (let’s say months), in a format like this:

start_pt  end_pt  June  July  August  ...  December
1         3       2     0     5       ...  9
1         5       0     1     4       ...  4
1         7       0     2     0       ...  0

It’s easy enough to pre-segment the data by month and then generate counts for each origin-destination pair, but putting it all back together causes all sorts of problems, since the pre-segmented chunks have very different sizes. So it seems that I’d need to do this for the entire dataset at once.

Are there any packages for doing this type of processing? Would it be easier to do this in something like SQL or SQLite?

Thanks in advance for any help.


1 Answer

Editorial Team · Answer 1 · 2026-06-12T22:31:48+00:00


You can use the reshape2 package to do this fairly easily.

If your data frame is called dat:

library("reshape2")
dcast(dat, start_pt+end_pt~Month, value.var="tripID", fun.aggregate=length)

This gives one row for each start_pt/end_pt pair and one column for each Month. Each cell holds the number of trips with that start_pt/end_pt/Month combination (the length of tripID for that set), and combinations with no trips are filled with 0.
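
As a quick check, here is a minimal sketch that runs the same call on the five sample trips from the question. The data frame name dat comes from the answer above. Converting Month to a factor with calendar-ordered levels is an added assumption, not part of the original answer: without it, character-valued month columns would typically come out in alphabetical order (July before June).

```r
library("reshape2")

# The five sample trips from the question.
dat <- data.frame(
  tripID   = 1:5,
  Month    = c("June", "June", "July", "July", "July"),
  start_pt = c(1, 1, 1, 1, 1),
  end_pt   = c(3, 3, 5, 7, 7)
)

# Assumption: fix the column order by making Month a factor in calendar order.
dat$Month <- factor(dat$Month, levels = c("June", "July"))

# One row per start_pt/end_pt pair, one count column per month.
wide <- dcast(dat, start_pt + end_pt ~ Month,
              value.var = "tripID", fun.aggregate = length)
print(wide)
#   start_pt end_pt June July
# 1        1      3    2    0
# 2        1      5    0    1
# 3        1      7    0    2
```

For the full dataset, levels = month.name (a base R constant) should give January-to-December column order, assuming the Month values are full English month names.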
