In my theoretical data analytics course, I have to find a co-occurrence matrix for courses that were taken together by students.
I was asked to do it manually since this is not a programming course. The problem is, the question I have has 20 courses and 20 students!
Is there a programming language, such as Octave/MATLAB or Python, that has API functions to do this?
If not, where can I find an algorithm for computing the co-occurrence matrix? I can convert the algorithm into code myself. Googling didn't lead me to one.
An example of the data I have:
IT 421 540 531 582
IT 549 561 528 578
IT 571 532 563 543
IT 421 561 571 519
IT 540 563 578 411
IT 411 421 581 578
IT 421 540 561 411
IT 541 548 528 546
IT 411 422 540 571
........
........
Thank you.
A simple approach is to use spreadsheet software (for example oocalc). Make one row per student and one column per course, put a 1 (or another nonblank mark) in the appropriate cells, and use a sum(range) function to count the number of marks in each column.
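For comparison, here is a minimal Python sketch of the same row-per-student, column-per-course layout. It assumes that each line of the sample is one student and that the leading "IT" is just a prefix to skip; the names `DATA`, `parse` and `incidence` are made up for this illustration.

```python
# Sample rows copied from the question. Assumption: one line per student,
# and the leading "IT" is a prefix shared by every course number.
DATA = """\
IT 421 540 531 582
IT 549 561 528 578
IT 571 532 563 543
IT 421 561 571 519
IT 540 563 578 411
IT 411 421 581 578
IT 421 540 561 411
IT 541 548 528 546
IT 411 422 540 571
"""


def parse(text):
    """Return one set of course numbers per non-empty line."""
    return [set(line.split()[1:]) for line in text.splitlines() if line.strip()]


def incidence(students):
    """Return (sorted course list, 0/1 rows), one row per student."""
    courses = sorted(set().union(*students))
    rows = [[1 if c in s else 0 for c in courses] for s in students]
    return courses, rows


students = parse(DATA)
courses, rows = incidence(students)

# Column sums, like sum(range) in the spreadsheet: students per course.
counts = {c: sum(col) for c, col in zip(courses, zip(*rows))}
for course in courses:
    print(course, counts[course])
```

Each row of `rows` corresponds to one spreadsheet row, and `counts` holds the per-column totals.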
Update: In the spreadsheet approach above, one manually inputs an incidence matrix, which I previously assumed was what you wanted to create. However, you might mean something else by "co-occurrence matrix". In any case, a program in a language like Python, Perl, Awk, or Ruby needs no more than a handful of lines to output an incidence matrix from data in the form shown. For example, the lengthy Perl program below can be adapted to print a matrix instead of a table:
As is, from the data sample you gave it produces output like:
when given the data via standard input.
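If by "co-occurrence matrix" you mean a course-by-course table whose entry (i, j) counts the students who took both course i and course j, one common way to get it is to loop over each student's courses and increment the entry for every pair. This is the same as multiplying the transposed incidence matrix by the incidence matrix. Below is a minimal Python sketch of that idea (not the Perl program referred to above). It assumes one student per input line with a leading "IT" prefix, and the names `DATA` and `cooccurrence` are invented for the example.

```python
from itertools import combinations

# Sample rows copied from the question. Assumption: one line per student,
# and the leading "IT" is a prefix shared by every course number.
DATA = """\
IT 421 540 531 582
IT 549 561 528 578
IT 571 532 563 543
IT 421 561 571 519
IT 540 563 578 411
IT 411 421 581 578
IT 421 540 561 411
IT 541 548 528 546
IT 411 422 540 571
"""


def cooccurrence(text):
    """Return (sorted course list, symmetric matrix of pair counts)."""
    students = [sorted(set(line.split()[1:]))
                for line in text.splitlines() if line.strip()]
    courses = sorted({c for s in students for c in s})
    index = {c: i for i, c in enumerate(courses)}
    n = len(courses)
    matrix = [[0] * n for _ in range(n)]
    for s in students:
        # Diagonal: number of students who took the course at all.
        for c in s:
            matrix[index[c]][index[c]] += 1
        # Off-diagonal: every unordered pair of courses this student took.
        for a, b in combinations(s, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return courses, matrix


courses, matrix = cooccurrence(DATA)
print("     " + " ".join(f"{c:>3}" for c in courses))
for c, row in zip(courses, matrix):
    print(f"{c:>3}: " + " ".join(f"{v:>3}" for v in row))
```

On the nine sample rows, for instance, courses 421 and 540 appear together for two students, so the corresponding entry is 2. With 20 students and 20 courses the same loop applies unchanged; to read from standard input instead, pass `sys.stdin.read()` to `cooccurrence`.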