I’m working with biological data – namely groups of genes. For example:
group 1: geneA geneB geneC
group 2: geneD geneE
group 3: geneF geneG geneH
For each pair of genes, geneX and geneY, I have a score telling how similar the two genes are. (Actually, I have two scores, since I used BLAST, which is ‘directional’: I first searched geneX against all the other genes, then geneY against all the other genes, so I have two geneX--geneY scores. I guess I can take the lower of the two scores, or the average.)
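To make the "lower score or average" idea concrete, here is a minimal sketch of collapsing the two directional scores into one score per unordered gene pair. The function name `symmetrize`, the dictionary layout and the numbers are assumptions made for illustration; they are not BLAST output.

```python
from statistics import mean

def symmetrize(directed_scores, combine=min):
    """Collapse {(query, subject): score} into one score per unordered pair.

    `combine` receives the list of scores seen for a pair (one or two values)
    and returns the single score to keep, e.g. min or statistics.mean.
    """
    by_pair = {}
    for (query, subject), score in directed_scores.items():
        if query == subject:
            continue  # ignore self-hits
        by_pair.setdefault(frozenset((query, subject)), []).append(score)
    return {pair: combine(scores) for pair, scores in by_pair.items()}

# Hypothetical scores, for illustration only.
directed = {
    ("geneA", "geneD"): 120.0,
    ("geneD", "geneA"): 100.0,
    ("geneB", "geneE"): 80.0,  # only one direction was reported
}

lowest = symmetrize(directed, combine=min)
average = symmetrize(directed, combine=mean)
```

Taking the minimum is the more conservative choice: a pair only scores highly if the hit is strong in both directions. A pair reported in one direction only keeps that single score here, which may or may not be what you want.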
So, let’s suppose I have only one score for each pair of genes. My data can then be viewed as an undirected graph in which each gene is a node and each edge has a score attached to it.
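As a rough sketch of that graph in code, the scored pairs can be stored as an adjacency dictionary, optionally keeping only edges at or above a cutoff. The name `build_graph`, the `min_score` parameter and the scores are hypothetical, chosen only for this example.

```python
from collections import defaultdict

def build_graph(pair_scores, min_score=None):
    """Return {gene: {neighbour: score}} for an undirected, weighted graph.

    Edges scoring below `min_score` are dropped when a cutoff is given.
    """
    graph = defaultdict(dict)
    for (gene_x, gene_y), score in pair_scores.items():
        if min_score is not None and score < min_score:
            continue
        # Store the edge in both directions, since the graph is undirected.
        graph[gene_x][gene_y] = score
        graph[gene_y][gene_x] = score
    return dict(graph)

# Hypothetical scores, for illustration only.
pair_scores = {
    ("geneA", "geneD"): 100.0,
    ("geneB", "geneE"): 80.0,
    ("geneC", "geneF"): 15.0,
}

full = build_graph(pair_scores)
strong = build_graph(pair_scores, min_score=50.0)
```

With the cutoff applied, genes whose only edges are weak simply disappear from the dictionary, which is the same effect as hiding low-scoring edges in a drawing of the network.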
Now, what I would like to do is:
- Visualize my data interactively: being able to click on gene nodes and open a link attached to them, show only edges above/below some threshold, control how the network is “spread”, etc.
- Cluster together groups which are similar, i.e. groups that have similar genes.
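For the clustering goal, one possible reading (an assumption, not the only sensible definition) is: score two groups by the mean gene-to-gene score over all cross-group pairs, counting missing pairs as 0, then merge any groups whose score reaches a cutoff. The sketch below does this with a small union-find, which amounts to single-linkage clustering at a fixed threshold. The function names, scores and cutoff are hypothetical.

```python
from itertools import combinations

def group_similarity(genes_a, genes_b, scores):
    """Mean score over all cross-group gene pairs; missing pairs count as 0."""
    total = sum(
        scores.get(frozenset((a, b)), 0.0) for a in genes_a for b in genes_b
    )
    return total / (len(genes_a) * len(genes_b))

def cluster_groups(groups, scores, threshold):
    """Merge groups whose similarity is >= threshold (connected components)."""
    parent = {name: name for name in groups}

    def find(name):
        # Follow parent links to the representative, halving paths on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(groups, 2):
        if group_similarity(groups[a], groups[b], scores) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in groups:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())

groups = {
    "group 1": ["geneA", "geneB", "geneC"],
    "group 2": ["geneD", "geneE"],
    "group 3": ["geneF", "geneG", "geneH"],
}
# Hypothetical symmetric scores, for illustration only.
scores = {
    frozenset(("geneA", "geneD")): 90.0,
    frozenset(("geneB", "geneE")): 60.0,
    frozenset(("geneC", "geneD")): 30.0,
    frozenset(("geneA", "geneF")): 6.0,
}

clusters = cluster_groups(groups, scores, threshold=20.0)
```

Other group-level scores (best hit per gene, fraction of genes with a hit above some score) would slot into `group_similarity` without changing the merging step.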
Any ideas on how I can do that? I guess it’s basic clustering, and I would appreciate any hints on packages/software that could help here.
Thank you.
You’ll probably get better responses if you ask this over at BioStar, the bioinformatics stackexchange.
Specifically, many of the answers in this thread might be relevant:
Which is the best software to represent biological pathways in a directed graph (network)?