I’m running a Flask-based web app that uses MongoDB (with PyMongo as the Python driver). Nearly every view accesses the database, so I want to make the most effective use of memory and CPU resources. I’m unsure what the most efficient way is to instantiate PyMongo’s Connection() object, which is used to access and manipulate the database. Right now, I declare from pymongo import Connection at the top of my file, and then at the beginning of each view function I have:
def sampleViewFunction():
    myCollection = Connection()['myDB']['myCollection']
    ## then use myCollection to manipulate the database
    ## more code...
The other way I could do it is declare at the top of my file:
from pymongo import Connection
myCollection = Connection()['myDB']['myCollection']
And then later on, my code would just read:
def sampleViewFunction():
    ## no declaration of myCollection since it's a global variable
    ## then use myCollection to manipulate the database
    ## more code...
So the only difference is the scope in which myCollection is declared. How do these two methods differ in memory handling and CPU consumption? Since this is a web application, I’m thinking about scenarios where multiple users are on the site simultaneously. I imagine there’s a difference in the lifespan of the connection to the database, which I’m guessing could impact performance.
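To illustrate what I mean by lifespan, here is a small stand-in sketch. It does not talk to MongoDB at all: FakeConnection, count_constructions, and the two view names are hypothetical and exist only to count how often a connection object gets constructed under each approach.

```python
class FakeConnection:
    """Hypothetical stand-in for pymongo's Connection; only counts constructions."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1

    def __getitem__(self, name):
        # Mimic the Connection()['myDB']['myCollection'] chaining.
        return self


def view_per_call():
    # Method 1: a new connection object is built on every request.
    my_collection = FakeConnection()['myDB']['myCollection']
    return my_collection


# Method 2: one connection object is built at import time and shared.
shared_collection = FakeConnection()['myDB']['myCollection']


def view_shared():
    # No construction here; the module-level object is reused.
    return shared_collection


def count_constructions(view, requests):
    """Return how many connection objects `view` builds over `requests` calls."""
    before = FakeConnection.created
    for _ in range(requests):
        view()
    return FakeConnection.created - before
```

With this stand-in, count_constructions(view_per_call, 100) counts 100 constructions, while count_constructions(view_shared, 100) counts none, since the shared object was built once at import. What I don't know is how much each real construction costs.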
You should use the second method. When you create a connection in pymongo, you create a connection pool by default. See the documentation for more details. This is the correct way of doing things. The default max_pool_size is 10, so this gives you a pool of up to 10 connections to your mongod instance(s). If you did it the other way and created a pool per function call you would