I have a table like the one below and would like to count the different combinations of factors present (1 indicates presence and 0 indicates absence). For example: the number of times all factors are present, the number of times the first is absent but the rest are present, the number of times the second is absent but the others are present, and likewise for cases where two or three factors are absent and the rest are present.
In shell it is quite simple to check the number of times all are present:
awk '{if (($2 == 1) && ($3 == 1) && ($4 == 1) && ($5 == 1) && ($6 == 1)) print $1}' ALL_Freq_motif_AE_Uper
But the problem is computing all the possible combinations that are present.
The table looks like this:
CEBP HEB TAL1 RUNX1 SPI1
1 1 1 1 1
0 1 1 1 1
1 1 0 0 1
1 1 1 1 0
0 0 0 1 1
Different combinations arise from this table:
1. All are present.
2. The first is absent and all others are present.
3. The last is absent but the others are present.
4. The third and fourth are absent but the others are present.
5. The first three are absent but the others are present.
In a table like this, which has a fixed number of columns and n rows, how can I compute these combinations of presence and absence?
Kindly help.
Thank you
Assuming that data contains your data, this could do the job:
or, using the collections module:
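A minimal sketch of both variants, not necessarily the code this answer originally used. It assumes data is a list of rows of 0/1 integers in the column order CEBP, HEB, TAL1, RUNX1, SPI1 (the example table from the question); the names counts and counter are only illustrative.

```python
from collections import Counter

# Assumption: data holds the example table from the question,
# one list of 0/1 integers per row.
data = [
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
]

# Variant 1: a plain dictionary keyed by the presence/absence pattern.
counts = {}
for row in data:
    key = tuple(row)  # lists are not hashable, tuples are
    counts[key] = counts.get(key, 0) + 1

# Variant 2: collections.Counter does the same bookkeeping in one call.
counter = Counter(map(tuple, data))

# Print each distinct pattern with the number of rows that show it.
for pattern, n in counter.most_common():
    print(pattern, n)
```

Each distinct row becomes one key, so every combination that actually occurs in the table is counted without having to list the possible combinations in advance.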
EDIT: Another version saving resources using a generator expression
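A sketch of what a generator-based version might look like, assuming the input is an iterable of whitespace-separated text lines containing only 0/1 values (no header and no ID column). The function name count_patterns and the sample list are hypothetical.

```python
from collections import Counter

def count_patterns(lines):
    """Count each distinct presence/absence pattern in an iterable of text lines."""
    # The generator expression yields one tuple per non-empty line, so the
    # whole table never has to be held in memory at once.
    patterns = (
        tuple(int(value) for value in line.split())
        for line in lines
        if line.strip()
    )
    return Counter(patterns)

# Hypothetical sample: the question's table plus one repeated row.
sample = [
    "1 1 1 1 1",
    "0 1 1 1 1",
    "1 1 0 0 1",
    "1 1 1 1 0",
    "0 0 0 1 1",
    "1 1 1 1 1",
]
result = count_patterns(sample)
for pattern, n in result.most_common():
    print(pattern, n)
```

An open file object can be passed in place of the list, since iterating over a file yields its lines one at a time; a header line would need to be skipped first, for example with next(f).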
I’m sure this could be improved.