My data looks like this (example):
ID Col1 Col2
1232 ABCSD abd
2342 ABCSD esw
7643 ABCSD rty
9821 ETHS fvc
I have 2845428 such rows. I want to find out how correlated each pair of values in Col1 and Col2 is. For example:
ABCSD abd 0.64
ETHS fvc 0.23
How can I go about this in R? Thanks.
I assume that by correlation you mean something like “what portion of the ABCSD observations have abd in Col2…”
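Under that reading, one way to get the numbers is to count each (Col1, Col2) pair and divide by the total count for its Col1 value. Here is a minimal base-R sketch, assuming a data frame named df with the columns shown in the question; the column names n and share and the variable pairs are made up for illustration, and the toy rows are copied from the question.

```r
# Toy data copied from the question; the real df would have about 2.8 million rows.
df <- data.frame(
  ID   = c(1232, 2342, 7643, 9821),
  Col1 = c("ABCSD", "ABCSD", "ABCSD", "ETHS"),
  Col2 = c("abd", "esw", "rty", "fvc"),
  stringsAsFactors = FALSE
)

# Count how often each (Col1, Col2) pair occurs.
# aggregate() only returns pairs that actually appear in the data.
pairs <- aggregate(ID ~ Col1 + Col2, data = df, FUN = length)
names(pairs)[names(pairs) == "ID"] <- "n"   # "n" is an illustrative name

# Divide each pair count by the total number of rows sharing its Col1 value.
pairs$share <- pairs$n / ave(pairs$n, pairs$Col1, FUN = sum)

# Sort for readability and show the result.
pairs <- pairs[order(pairs$Col1, pairs$Col2), ]
print(pairs)
```

On the four sample rows this gives a share of 1/3 for each ABCSD pair and 1 for ETHS/fvc. On millions of rows aggregate() may be slow, so a grouped count in a package built for large tables might be a better fit; that is a judgment call rather than something the question settles.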
If your data are in a dataframe named df,