Apologies as I thought there would be a very obvious answer but I can’t find anything on the net…
I often get very large datasets where missing values are blank e.g. (in short)
#Some description of the dataset
#cover x number of lines
31 3213 313 64 63
31 3213 313 64 63
31 3213 313 64 63
31 3213 313 64 63
31 3213 313 64 63
12 178 190 865
532 31 6164 68
614 131 864 808
I would like to replace all the blanks with, for example, -999. If I use read.table like this:
dat = read.table('file.txt',skip=2)
I get the error message
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 6 did not have 5 elements
I could open the file as a data frame and do
dat = data.frame('file.txt',skip=2)
is.na(rad1) = which(rad1 == '')
but I don’t know if it would work because I don’t know how to skip the top 2 lines when reading a dataframe (e.g. the equivalent of “skip”) and I couldn’t find the answer either anywhere. Could anyone help?
Thanks.
If you know the width of each column, you can use read.fwf, which reads fixed-width fields, accepts a skip argument, and reads blank numeric fields as NA.
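A minimal sketch of that approach follows. The spacing of the sample in the question did not survive, so the layout here is an assumption: five right-aligned fields of 5 characters each, written to a temporary file that stands in for file.txt. The positions of the blank fields are also made up for illustration.

```r
# Hypothetical layout: five right-aligned fields, each 5 characters wide.
widths <- rep(5, 5)
fmt <- "%5s%5s%5s%5s%5s"

# Build a stand-in for file.txt; which fields are blank is an assumption.
lines <- c(
  "#Some description of the dataset",
  "#cover x number of lines",
  sprintf(fmt, "31", "3213", "313", "64", "63"),
  sprintf(fmt, "12", "178", "", "190", "865"),
  sprintf(fmt, "532", "31", "6164", "", "68"),
  sprintf(fmt, "614", "", "131", "864", "808")
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# skip = 2 drops the two description lines; blank fields come back as NA.
dat <- read.fwf(path, widths = widths, skip = 2)
print(dat)
```

With a real file you would replace the widths vector with the actual column widths, which you can usually work out by opening the file in a text editor with a monospaced font.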
Although it’s easy to replace NA values with any value you want, that’s a bad idea, because R has many good ways of dealing with NA values. For example, to take the mean of column two while ignoring missing values, pass na.rm=TRUE to mean().
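The sketch below illustrates both points on a small made-up data frame (the values and the column names V1 and V2 are assumptions, not the asker's real data): na.rm = TRUE gives the mean of the observed values, while a -999 sentinel silently distorts it.

```r
# Hypothetical data frame standing in for the result of reading the file.
dat <- data.frame(
  V1 = c(31, 12, 532, 614),
  V2 = c(3213, 178, 31, NA)  # one missing value
)

# Without na.rm, a single NA makes the whole result NA.
plain_mean <- mean(dat$V2)

# With na.rm = TRUE, the NA is dropped before averaging.
col2_mean <- mean(dat$V2, na.rm = TRUE)

# Replacing NA with a sentinel is a one-liner, but it skews summaries.
sentinel <- dat
sentinel[is.na(sentinel)] <- -999
sentinel_mean <- mean(sentinel$V2)

print(c(plain = plain_mean, na_rm = col2_mean, sentinel = sentinel_mean))
```

If a downstream program really does require -999, it is probably safer to keep NA while working in R and substitute the sentinel only when writing the file out.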
R has other functions to deal with missing data. For example, you can use na.omit() to completely remove rows with missing data.
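As a rough illustration, again on a made-up data frame rather than the asker's file, na.omit() keeps only the rows in which every column is present, and complete.cases() gives the corresponding logical index:

```r
# Hypothetical data with missing values in two different rows.
dat <- data.frame(
  V1 = c(31, 12, 532),
  V2 = c(3213, NA, 31),
  V3 = c(313, 190, NA)
)

# Drop every row that contains at least one NA.
complete <- na.omit(dat)

# The same selection expressed as a logical vector, one entry per row.
keep <- complete.cases(dat)

print(complete)
print(keep)
```

Note that this discards the whole row even if only one field is missing, so for sparse data it may be preferable to handle NA per column instead.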