I am looking for real world applications where topological sorting is performed on large graph sizes.
Some fields where I image you could find such instances would be bioinformatics, dependency resolution, databases, hardware design, data warehousing… but I hope some of you may have encountered or heard of any specific algorithms/projects/applications/datasets that require topsort.
Even if the data/project may not be publicly accessible any hints (and estimates on the order of magnitude of potential graph sizes) might be helpful.
Here are some examples I’ve seen so far for Topological Sorting:
While scheduling task graphs in a distributed system, it is usually
needed to sort the tasks topologically and then assign them to
resources. I am aware of task graphs containing more than 100,000
tasks to be sorted in a topological order. See this in this context.
Once upon a time I was working on a Document Management System. Each
document on this system has some kind of precedence constraint to a
set of other documents, e.g. its content type or field referencing.
Then, the system should be able to generate an order of the documents
with the preserved topological order. As I can remember, there were
around 5,000,000 documents available two years ago !!!
In the field of social networking, there is famous query to know the
largest friendship distance in the network. This problem needs to
traverse the graph by a BFS approach, equal to the cost of a
topological sorting. Consider the members of Facebook and find your
answer.
If you need more real examples, do not hesitate to ask me. I have worked in lots of projects working on on large graphs.
P.S. for large DAG datasets, you may take a look at Stanford Large Network Dataset Collection and Graphics@ Illinois page.