The Archive Base


Editorial Team
Asked: June 4, 2026 (2026-06-04T19:24:37+00:00)

I have a collection of documents in MongoDB where each has one or more categories in a list. Using map reduce, I can get the details of how many documents have each unique combination of categories:

['cat1']               = 523
['cat2']               = 231
['cat3']               = 102
['cat4']               = 72
['cat1','cat2']        = 710
['cat1','cat3']        = 891
['cat1','cat3','cat4'] = 621 ...

where each total is the number of documents with that exact combination of categories.

I’m looking for a sensible way to present this data, and I think a Venn diagram with proportional areas would be a good idea. Using the above example, the area of cat1 would be 523+710+891+621, the area of the overlap between cat1 and cat3 would be 891+621, the area of the overlap between cat1, cat3, and cat4 would be 621, etc.
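
To make that arithmetic concrete, here is a minimal sketch that turns the exact-combination counts into totals for every category and every overlap. The names `exact_counts` and `region_totals` are made up for illustration, and only the rows shown in the listing are included, since the real output is truncated.

```python
from collections import defaultdict
from itertools import combinations

# Exact-combination counts copied from the listing above; the listing is
# truncated ("..."), so only the rows shown are included here.
exact_counts = {
    ('cat1',): 523,
    ('cat2',): 231,
    ('cat3',): 102,
    ('cat4',): 72,
    ('cat1', 'cat2'): 710,
    ('cat1', 'cat3'): 891,
    ('cat1', 'cat3', 'cat4'): 621,
}

def region_totals(counts):
    """Sum, for every sub-combination, the documents that contain all of it.

    A document tagged with exactly (cat1, cat3, cat4) counts towards cat1,
    cat3, cat4, and every overlap formed from those three categories.
    """
    totals = defaultdict(int)
    for combo, n in counts.items():
        members = sorted(combo)
        for k in range(1, len(members) + 1):
            for subset in combinations(members, k):
                totals[subset] += n
    return dict(totals)

totals = region_totals(exact_counts)
print(totals[('cat1',)])                  # 523+710+891+621 = 2745
print(totals[('cat1', 'cat3')])           # 891+621 = 1512
print(totals[('cat1', 'cat3', 'cat4')])   # 621
```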

Does anyone have any tips for how I might go about implementing this? I’d prefer to do it in Python (+NumPy/Matplotlib) or MATLAB.

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 7:24 pm (2026-06-04T19:24:38+00:00)

    The Problem

    We need to represent counts for multiple interconnected categories of objects, and a Venn diagram cannot represent more than a small number of categories and their overlaps.

    A Solution

    Treat each category, and each combination of categories, as a node in a graph. Draw the graph so that the size of each node represents its count and the edges connect each combination to the categories it contains. This approach accommodates many categories with ease and produces a kind of connected bubble chart.

    The Result

    [Figure: network layout]

    The Code

    The proposed solution uses NetworkX to create the data structure and matplotlib to draw it. If data is presented in the right format, this will scale to a large number of categories with multiple connections.

    import networkx as nx
    import matplotlib.pyplot as plt
    
    def load_nodes():
        text = '''  Node    Size
                    1        523
                    2        231
                    3        102
                    4         72
                    1+2      710
                    1+3      891
                    1+3+4    621'''
        # load nodes into list, discard header
        # this may be replaced by some appropriate output 
        # from your program
        data = text.split('\n')[1:]
        data = [ d.split() for d in data ]
        data = [ tuple([ d[0], 
                        dict( size=int(d[1]) ) 
                        ]) for d in data]
        return data
    
    def load_edges():
        text = '''  From   To
                    1+2    1
                    1+2    2
                    1+3    1
                    1+3    3
                    1+3+4    1
                    1+3+4    3
                    1+3+4    4'''
        # load edges into list, discard header
        # this may be replaced by some appropriate output 
        # from your program
        data = text.split('\n')[1:]
        data = [ tuple( d.split() ) for d in data ]
        return data
    
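    if __name__ == '__main__':
        scale_factor = 5
        G = nx.Graph()
        nodes = load_nodes()
        # add the nodes first so each one carries its 'size' attribute
        G.add_nodes_from( nodes )

        edges = load_edges()
        G.add_edges_from( edges )

        # read the sizes in the graph's own node order, so that each
        # size is drawn on the node it belongs to
        node_sizes = [ G.nodes[n]['size']*scale_factor
                      for n in G.nodes ]

        nx.draw_networkx(G,
                         pos=nx.spring_layout(G),
                         node_size = node_sizes)
        plt.axis('off')
        plt.show()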
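
    The two hard-coded tables in load_nodes() and load_edges() could instead be derived from the combination counts. The following is only a sketch: the helper name build_graph_data and the dictionary format (a tuple of category labels mapped to a count) are assumptions, not part of the original program. A category that never occurs on its own is given a size of 0 so that every node still has a size.

```python
def build_graph_data(exact_counts):
    """Turn {combination tuple: count} into node and edge lists.

    Hypothetical helper: nodes are (name, {'size': count}) pairs and
    edges are (combination name, member category) pairs.
    """
    nodes = []
    edges = []
    for combo, count in exact_counts.items():
        name = '+'.join(combo)
        nodes.append((name, {'size': count}))
        if len(combo) > 1:
            # connect each combination to the categories it contains
            edges.extend((name, member) for member in combo)

    # make sure every category referenced by an edge has a node
    known = {name for name, _ in nodes}
    for _, member in edges:
        if member not in known:
            nodes.append((member, {'size': 0}))
            known.add(member)
    return nodes, edges


# Same figures as the tables above, keyed by tuples of category labels.
counts = {
    ('1',): 523,
    ('2',): 231,
    ('3',): 102,
    ('4',): 72,
    ('1', '2'): 710,
    ('1', '3'): 891,
    ('1', '3', '4'): 621,
}
nodes, edges = build_graph_data(counts)
```

    The returned lists have the same shape as the values returned by load_nodes() and load_edges(), so they can be used in their place.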
    

    Other Solutions

    Other solutions might include: bubble charts, Voronoi diagrams, chord diagrams, and hive plots among others. None of the linked examples use Python; they are just given for illustrative purposes.
