I have a tab-delimited DAT file that I want to read into R. When I import the data using read.delim, my data frame has the correct number of columns, but has more rows than expected.
My data file contains responses to a survey. After digging a little deeper, it appears that R is creating a new record when there is a “.” in a column that holds an open-ended response. It appears that a respondent sometimes hit “enter” to start a new line within their answer.
Is there a way to get around this? I read the help, but I am not sure how I can tell R to ignore this character in the character response.
Here is an example response that parses incorrectly. This is one response, but you can see that there are returns that put this onto multiple lines when parsed by R.
possible ask for size before giving free tshirt.
Also maybe have the interview in conference rooms instead of tight offices. I felt very cramped.
I would of loved to have gone, but just had to make a choices and had more options then I expected.
I am analyzing the data with SPSS, where the data were brought in fine. However, I need to use R for more advanced modeling.
Any help will be greatly appreciated. Thanks in advance.
There is an ‘na.strings’ argument. You don’t offer any test case, but perhaps you can do this:
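A minimal sketch of that suggestion, not the asker's real file: the two-row sample text and the column names id and comment are made up for illustration, and it assumes a lone “.” is meant to stand for a missing answer.

```r
# Hypothetical tab-delimited sample; "." stands in for an empty answer.
sample_text <- "id\tcomment\n1\t.\n2\tliked the interview\n"

# na.strings = "." asks read.delim to treat a field that is exactly "." as NA.
responses <- read.delim(text = sample_text, na.strings = ".",
                        stringsAsFactors = FALSE)

print(responses)
print(nrow(responses))
```

Note that this only changes how a lone “.” is interpreted. It does not join lines that were split by a return inside a response.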
I think it would be good if you could edit your question to better demonstrate the problem. I cannot reproduce the error with a simple attempt:
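Something along these lines is the kind of simple attempt meant here. The sample text is invented (it borrows two sentences from the example response), and the column names are assumptions. A period at the end of a field does not, on its own, start a new record.

```r
# Hypothetical sample: each record sits on one line and ends with a period.
sample_text <- paste0(
  "id\tcomment\n",
  "1\tpossible ask for size before giving free tshirt.\n",
  "2\tI felt very cramped.\n"
)

responses <- read.delim(text = sample_text, stringsAsFactors = FALSE)

# One row per record; the periods are kept as ordinary characters.
print(nrow(responses))
print(responses$comment)
```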
(After the clarification that the above comments are not particularly relevant.) This will bring in a field that has a linefeed in it ... but it requires that the “field” be quoted in the original file:
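A small sketch of that behaviour, assuming the open-ended field is wrapped in double quotes in the file. The sample text and column names are hypothetical; with a real file you would pass the file path instead of the text argument.

```r
# Hypothetical sample: the first comment contains a linefeed, but the whole
# field is wrapped in double quotes, so it should stay a single field.
sample_text <- paste0(
  "id\tcomment\n",
  "1\t\"ask for size before giving free tshirt.\n",
  "Also maybe use conference rooms.\"\n",
  "2\t\"no comment\"\n"
)

# read.delim uses the double quote as its quote character by default.
responses <- read.delim(text = sample_text, stringsAsFactors = FALSE)

print(nrow(responses))
# Check whether the linefeed survived inside the first comment.
print(grepl("\n", responses$comment[1], fixed = TRUE))
```

If the export from the survey tool does not quote the open-ended field, this will not help, and the file would need to be re-exported with quoting or repaired before reading.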