I have multiple .csv files with a large matrix, ~300 rows and 2000 cols.

Question

Asked: June 11, 2026 (2026-06-11T16:53:05+00:00)

I have multiple .csv files, each containing a large matrix of 0s and 1s (~300 rows and 2000 columns). For each row, I want to build a new table made up of that row plus every later row that has a 1 in at least one of the columns where that row has a 1. I would like to keep the row and column names, and write each table to a file named after its row, inside a directory.

This is an example of the data:

       pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-25        1              0          1            0                 1              1                1
WHIRR-28        1              0          1            0                 0              1                0
WHIRR-55        0              0          1            0                 0              0                0
WHIRR-61        0              0          0            0                 0              1                0
WHIRR-76        0              0          1            0                 0              0                0
WHIRR-87        1              1          1            0                 0              1                1
WHIRR-92        1              0          0            1                 0              1                1

From this data, the expected output is:

    Whirr-25
                   pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
 WHIRR-25        1              0          1            0                 1              1                1
 WHIRR-28        1              0          1            0                 0              1                0
 WHIRR-55        0              0          1            0                 0              0                0
 WHIRR-61        0              0          0            0                 0              1                0
 WHIRR-76        0              0          1            0                 0              0                0
 WHIRR-87        1              1          1            0                 0              1                1
 WHIRR-92        1              0          0            1                 0              1                1

Whirr-28
                pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
 WHIRR-28        1              0          1            0                 0              1                0
 WHIRR-55        0              0          1            0                 0              0                0
 WHIRR-61        0              0          0            0                 0              1                0
 WHIRR-76        0              0          1            0                 0              0                0
 WHIRR-87        1              1          1            0                 0              1                1
 WHIRR-92        1              0          0            1                 0              1                1

Whirr-55
             pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-55        0              0          1            0                 0              0                0
WHIRR-76        0              0          1            0                 0              0                0
WHIRR-87        1              1          1            0                 0              1                1

Whirr-61
          pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-61        0              0          0            0                 0              1                0
WHIRR-87        1              1          1            0                 0              1                1
WHIRR-92        1              0          0            1                 0              1                1

Whirr-76
               pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-76        0              0          1            0                 0              0                0
WHIRR-87        1              1          1            0                 0              1                1

Whirr-87
              pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-87        1              1          1            0                 0              1                1
WHIRR-92        1              0          0            1                 0              1                1

Whirr-92
              pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
 WHIRR-92        1              0          0            1                 0              1                1

I tried this script, but it only creates a new table for each column, not for each row:

    dat <- read.table(file="Task_vs_Files_Proj.csv", header=TRUE, sep=",", row.names=1)
    dat

    apply(sapply(dat, as.logical), 2, function(x) dat[x, ])

    $pom.xml.
             pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
    WHIRR-25        1              0          1            0                 1              1                1
    WHIRR-28        1              0          1            0                 0              1                0
    WHIRR-87        1              1          1            0                 0              1                1
    WHIRR-92        1              0          0            1                 0              1                1

    $ZooKeeper.java
             pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
    WHIRR-87        1              1          1            0                 0              1                1

    $HBase.java
             pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
    WHIRR-25        1              0          1            0                 1              1                1
    WHIRR-28        1              0          1            0                 0              1                0
    WHIRR-55        0              0          1            0                 0              0                0
    WHIRR-76        0              0          1            0                 0              0                0
    WHIRR-87        1              1          1            0                 0              1                1

I would appreciate any help from the experts here. Thank you.

1 Answer

Editorial Team · Answer 1 · 2026-06-11T16:53:07+00:00

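As far as I can see, you want something like the following, where `.data` is your data frame with the first column used as row names: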

# read the matrix, using the first column as row names
.data <- read.table("Task_vs_Files_Proj.csv", header = TRUE, sep = ",", row.names = 1)

# cycle through all rows
for (which_row in seq_len(nrow(.data))) {
  # get the subset of the rows from this row to the last row
  subset_data <- .data[which_row:nrow(.data), ]
  # for each column, the positions (within the subset) of the elements == 1
  which_one <- lapply(subset_data, function(x) which(as.logical(x)))
  # drop the columns where there are no 1's
  which_one <- Filter(function(x) length(x) > 0, which_one)
  # keep only the columns where the current row (position 1) == 1, then
  # take the union of their row positions (sorted to original order)
  which_rows <- sort(Reduce(union, Filter(function(x) 1 %in% x, which_one)))
  # the file name, built from the current row name
  file_name <- sprintf('file_%s.csv', row.names(.data)[which_row])
  # save, keeping the row names
  write.csv(subset_data[which_rows, ], file_name, row.names = TRUE)
  # print the data set to the console for checking
  print(subset_data[which_rows, ])
  # message to show which file was created
  message(sprintf('Saving %s', file_name))
}
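
If it helps, the same idea can be wrapped in a function and tried on the sample data from the question. This is only a sketch under my reading of the expected output: `split_by_row` and the `by_row` output directory are names I made up for illustration, and it swaps the `lapply`/`Reduce` steps for a logical matrix plus `rowSums`, which should give the same row selection.

```r
# Hypothetical helper (not part of any package): for each row, keep that row
# plus every later row that has a 1 in at least one column where it has a 1,
# then write the table to <out_dir>/<row name>.csv.
split_by_row <- function(dat, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  m <- as.matrix(dat) == 1                 # logical matrix: TRUE where value is 1
  out <- vector("list", nrow(dat))
  names(out) <- rownames(dat)
  for (i in seq_len(nrow(dat))) {
    later <- i:nrow(dat)                   # this row and all rows after it
    cols <- m[i, ]                         # columns where row i holds a 1
    # later rows with a 1 in at least one of those columns
    keep <- later[rowSums(m[later, cols, drop = FALSE]) > 0]
    out[[i]] <- dat[keep, , drop = FALSE]  # row and column names are kept
    write.csv(out[[i]], file.path(out_dir, paste0(rownames(dat)[i], ".csv")))
  }
  invisible(out)
}

# Sample data copied from the question
dat <- read.table(header = TRUE, row.names = 1, text = "
id pom.xml. ZooKeeper.java HBase.java Hadoop.java. BasicServer.java. Abstract.java. HBaseRegion.java
WHIRR-25 1 0 1 0 1 1 1
WHIRR-28 1 0 1 0 0 1 0
WHIRR-55 0 0 1 0 0 0 0
WHIRR-61 0 0 0 0 0 1 0
WHIRR-76 0 0 1 0 0 0 0
WHIRR-87 1 1 1 0 0 1 1
WHIRR-92 1 0 0 1 0 1 1
")

# Write into a temporary directory so nothing in the working directory changes
res <- split_by_row(dat, file.path(tempdir(), "by_row"))
rownames(res[["WHIRR-55"]])
# "WHIRR-55" "WHIRR-76" "WHIRR-87"
```

For several input files, one option is to loop over `list.files(pattern = "\\.csv$")`, read each file with `row.names = 1`, and pass a separate output directory per file so tables from different files do not overwrite each other.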
