Input file is Mydata <- read.table(con <- textConnection(‘ gene treatment1 treatment2 treatment3 aaa 1

Question

Asked: May 21, 2026 (2026-05-21T14:47:35+00:00)

The input data are:

Mydata <- read.table(con <- textConnection('
gene treatment1 treatment2 treatment3
aaa 1 0 1
bbb 1 1 1
ccc 0 0 0
eee 0 1 0
'), header=TRUE)
close(con)

Mydata looks like this:

  gene treatment1 treatment2 treatment3
1  aaa          1          0          1
2  bbb          1          1          1
3  ccc          0          0          0
4  eee          0          1          0

To build the clusters, I ran:

d <- dist(Mydata, method = "euclidean")
fit <- hclust(d, method = "ward")
plot(fit)

This gave me a clustering based on Euclidean distance.

In my previous Stack Overflow question,
How to use R to compute Tanimoto/Jacquard Score as distance matrix,

I learned that I can also calculate a Tanimoto/Jaccard distance matrix in R. Could you show me how to combine that with the steps above, so that the clustering is based on the Tanimoto/Jaccard distance matrix instead of the Euclidean one? Thanks a lot.

1 Answer

Editorial Team · Answer 1 · 2026-05-21T14:47:36+00:00

What is it you don’t understand? The help page ?vegdist (from the vegan package) tells us that vegdist() returns an object of class "dist", so you can just remove the dist(....) line and replace it with one calling vegdist(....). For example:

library(vegan)  # provides vegdist()
d <- vegdist(Mydata[, -1], method = "jaccard")
# current R calls the old "ward" method "ward.D" (same algorithm)
fit <- hclust(d, method = "ward.D")
plot(fit)

You need to drop the first column (and should have done so in the Euclidean version you showed in your question), as it holds the gene labels rather than data that should be used to form the dissimilarity matrix.

That will generate a warning:

Warning message:
In vegdist(Mydata[, -1], method = "jaccard") :
  you have empty rows: their dissimilarities may be meaningless in method jaccard

because row 3 (gene ccc) is all zeros and so contains no information from which to form the Jaccard distance between it and the other samples. You might want to consider whether the Jaccard distance is the most appropriate choice in such cases.
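To see why the all-zero row is a problem, it can help to compute the binary Jaccard distance by hand. The sketch below is illustrative only: jaccard_dist is a hypothetical helper written for this example (it is not part of vegan), the matrix m simply re-enters the toy data from the question, and the code assumes strictly 0/1 values.

```r
# Toy data from the question, with gene labels as row names
m <- matrix(c(1, 0, 1,
              1, 1, 1,
              0, 0, 0,
              0, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("aaa", "bbb", "ccc", "eee"),
                            paste0("treatment", 1:3)))

# Hypothetical helper: Jaccard distance between two 0/1 vectors
jaccard_dist <- function(x, y) {
  both   <- sum(x == 1 & y == 1)  # treatments where both genes are 1
  either <- sum(x == 1 | y == 1)  # treatments where at least one gene is 1
  1 - both / either               # 0/0 gives NaN when both vectors are empty
}

jaccard_dist(m["aaa", ], m["bbb", ])  # 1 - 2/3
jaccard_dist(m["aaa", ], m["ccc", ])  # 1 - 0/2, i.e. 1
jaccard_dist(m["ccc", ], m["ccc", ])  # NaN: nothing to compare

# Cross-check with base R: per ?dist, method = "binary" is the proportion of
# bits in which only one is on amongst those in which at least one is on
as.matrix(dist(m, method = "binary"))["aaa", "bbb"]
```

Every comparison involving ccc has zero shared presences, so ccc ends up at distance 1 from each non-empty gene whatever their profiles look like, which is the situation the warning is flagging.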

The OP now wants the gene labels as row names. The easiest option is to tell R this when reading the data in, using the row.names argument to read.table():

mydata2 <- read.table(con <- textConnection("gene treatment1 treatment2 treatment3
aaa 1 0 1
bbb 1 1 1
ccc 0 0 0
eee 0 1 0
"), header = TRUE, row.names = 1)
close(con)

giving:

> mydata2
    treatment1 treatment2 treatment3
aaa          1          0          1
bbb          1          1          1
ccc          0          0          0
eee          0          1          0

Or, if the data are already in R and it is a pain to reload and redo previous computations, just assign the gene column to the row names and then remove the gene column (using the original Mydata):

rownames(Mydata) <- Mydata$gene
Mydata <- Mydata[, -1]

giving:

> Mydata
    treatment1 treatment2 treatment3
aaa          1          0          1
bbb          1          1          1
ccc          0          0          0
eee          0          1          0
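Putting the pieces together, a minimal sketch of the whole workflow might look like the following. It assumes the same toy data, uses read.table(text = ...) as a shorter alternative to the textConnection() idiom, and falls back to base R's dist(method = "binary") if vegan is not installed; that fallback is a stand-in chosen for this example, not something the answer above prescribes.

```r
# Read the toy data with the gene column as row names
mydata2 <- read.table(text = "
gene treatment1 treatment2 treatment3
aaa 1 0 1
bbb 1 1 1
ccc 0 0 0
eee 0 1 0
", header = TRUE, row.names = 1)

# Jaccard dissimilarities; vegdist() warns about the empty row ccc
if (requireNamespace("vegan", quietly = TRUE)) {
  d <- vegan::vegdist(mydata2, method = "jaccard")
} else {
  d <- dist(mydata2, method = "binary")  # base-R stand-in for 0/1 data
}

# Hierarchical clustering on the "dist" object
fit <- hclust(d, method = "ward.D")

# The row names travel with the dist object and become the leaf labels
fit$labels
```

Because the gene names are carried along as labels, calling plot(fit) on this object should draw a dendrogram whose leaves are labelled by gene rather than by row number.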
