I have a large data frame (~600K rows) with a string-valued column (link):
doc_id,link
1,http://example.com
1,http://example.com
2,http://test1.net
2,http://test2.net
2,http://test5.net
3,http://test1.net
3,http://example.com
4,http://test5.net
and I would like to count the number of times each string value occurs in the frame. The result should look like this:
link, count
http://example.com, 3
http://test1.net, 2
http://test2.net, 1
http://test5.net, 2
Is there an efficient way to do this in R? Converting the frame into a matrix doesn’t work because of the frame size. Currently I am using the plyr package, but this is too slow.
The `table` function counts occurrences, and it's very fast compared to `ddply`. Applying it to the link column and converting the result to a data frame yields one row per distinct link together with its count.
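A minimal sketch of counting with `table`, using the sample data from the question. The data frame name `df` and the result name `counts` are assumptions for illustration, since the original code did not survive; only base R is used.

```r
# Sample data from the question; the name `df` is an assumption.
df <- data.frame(
  doc_id = c(1, 1, 2, 2, 2, 3, 3, 4),
  link = c("http://example.com", "http://example.com",
           "http://test1.net", "http://test2.net", "http://test5.net",
           "http://test1.net", "http://example.com", "http://test5.net"),
  stringsAsFactors = FALSE
)

# table() tallies each distinct value of the column.
# as.data.frame() converts the tally into a two-column data frame;
# responseName sets the name of the count column (default is "Freq").
counts <- as.data.frame(table(link = df$link),
                        responseName = "count",
                        stringsAsFactors = FALSE)

print(counts)
```

For the sample data this should give the counts shown in the question (3, 2, 1 and 2), with links in alphabetical order. If the most frequent links are wanted first, `counts[order(-counts$count), ]` sorts the result by descending count.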