I have a list called training_data. The training_data list contains data read from several

Question

Asked: June 17, 2026 (2026-06-17T23:46:06+00:00)

I have a list called “training_data”. The “training_data” list contains data read from several files using the following function.

training_data <- lapply(files, read.table, header=TRUE, sep=",")

I can access the first field of any dataset using the following command:

training_data[[1]][1]           # The first field contains the class ("pos" or "neg")

I have to use these datasets (contained in training_data) for binary classification with Support Vector Machines (e1071). The problem is that certain datasets contain data for only one class, i.e. either all pos or all neg, which the svm function does not accept, so I want to exclude those datasets. I tried the following code, but I am unable to access the class column.

training_data<-lapply(training_data, 
                function(data)
                 {
                    if(["the class field is always positive"])
                       ### exclude this dataset from training_data

                 })

Update:
How exactly can I access the first column of the data passed to the function? And how can I exclude from training_data those datasets that consist of only one class in the class column?

Thanks


1 Answer

Editorial Team · Answer 1 · 2026-06-17T23:46:07+00:00

This is what the Filter function was made for. Since you didn’t provide replication code, here is a quick example of how to use Filter. Suppose you have a large list of vectors, each 2 elements in length:

mylist <- lapply(1:1000, function(i) c(runif(1), runif(1)))

Now if you want to only retain those vectors in the list where the first element is greater than 0.5, you would do something like this:

filtered_list <- Filter(function (x) x[1] > 0.5, mylist)
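
Since runif gives different values on every run, a small hand-written list makes the behaviour easier to check. The following is only a sketch with made-up values (small_list, kept and keep_idx are hypothetical names, not part of your data); it also shows that Filter gives the same result as building a logical index and subsetting the list yourself:

```r
# Hypothetical toy data: three vectors of length 2
small_list <- list(c(0.9, 0.1), c(0.2, 0.8), c(0.7, 0.3))

# Keep only the vectors whose first element exceeds 0.5
kept <- Filter(function(x) x[1] > 0.5, small_list)

# Equivalent approach: compute one TRUE/FALSE per element, then subset
keep_idx <- vapply(small_list, function(x) x[1] > 0.5, logical(1))
kept_by_index <- small_list[keep_idx]

length(kept)                    # 2
identical(kept, kept_by_index)  # TRUE
```

Filter is mostly a convenience here: the predicate is the only part you have to write, and the elements that are kept are returned unchanged.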

Now, if each element of the list is a data.frame and the first column of each data.frame is the response vector for the model, as appears to be the case with your data, you can use the data[,1] notation mentioned by Justin to drop every data.frame whose first column contains only one class. Because that column holds the labels pos and neg rather than numbers, test for more than one distinct value instead of comparing against 0:

filtered_list <- Filter(function (x) length(unique(x[,1])) > 1, 
                        training_data)
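
To make the column access concrete, here is a minimal sketch using made-up data frames in place of the files you read in. It assumes the class column is the first column and holds labels such as pos and neg, so it checks for more than one distinct value rather than comparing against 0. The helper name has_both_classes and the column x are hypothetical:

```r
# Hypothetical stand-ins for the datasets read from your files
training_data <- list(
  data.frame(class = c("pos", "neg", "pos"), x = c(1.2, 0.4, 2.5)),
  data.frame(class = c("pos", "pos", "pos"), x = c(0.3, 1.1, 0.9)),  # one class only
  data.frame(class = c("neg", "neg"),        x = c(2.0, 1.7)),       # one class only
  data.frame(class = c("neg", "pos"),        x = c(0.6, 1.8))
)

# d[1] returns a one-column data.frame; d[[1]] (or d[, 1]) returns the
# column itself as a vector, which is what unique() needs here
has_both_classes <- function(d) length(unique(d[[1]])) > 1

usable <- Filter(has_both_classes, training_data)
length(usable)  # 2
```

Inside lapply or Filter, the argument of your function is one data.frame at a time, so data[[1]] or data[, 1] gives you the class column directly.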
