I'm looking for a simple example of retrieving 500 items from DynamoDB while minimizing the number of queries. I know there’s a “multiget” function that would let me break this up into chunks of 50 queries, but I'm not sure how to do this.
I’m starting with a list of 500 keys. I’m then thinking of writing a function that takes this list of keys, breaks it up into “chunks,” retrieves the values, stitches them back together, and returns a dict of 500 key-value pairs.
Or is there a better way to do this?
As a corollary, how would I “sort” the items afterwards?
Depending on your schema, there are 2 ways of efficiently retrieving your 500 items.

1. Items are under the same hash_key, using a range_key
   - use the query method with the hash_key
   - the range_keys can be returned sorted A-Z or Z-A

2. Items are on "random" keys
   - use the BatchGetItem method

On the practical side, since you use Python, I highly recommend the Boto library for low-level access or the dynamodb-mapper library for higher-level access (Disclaimer: I am one of the core devs of dynamodb-mapper).
Sadly, neither of these libraries provides an easy way to wrap the batch_get operation. By contrast, there is a generator for scan and for query which ‘pretends’ you get everything in a single query.

In order to get optimal results with the batch query, I recommend this workflow:
- submit the batch with your keys
- resubmit the UnprocessedKeys as many times as needed

Quick example

I assume you have created a table "MyTable" with a single hash_key.

EDIT:
I’ve added a resubmit() function to BatchList in the Boto develop branch. It greatly simplifies the workflow:
- add your requested keys to a BatchList
- call submit()
- call resubmit() as long as it does not return None.

This should be available in the next release.
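
For illustration, here is a minimal, library-independent sketch of the batch workflow: chunk the keys, submit each chunk, resubmit whatever comes back unprocessed, stitch everything into one dict, and sort on the Python side. It is not Boto's API. fetch_batch is a hypothetical callable standing in for whatever performs the actual BatchGetItem call, and fake_fetch is a made-up stand-in that only simulates partial responses. The chunk size of 100 reflects the BatchGetItem limit of 100 keys per request.

```python
def chunked(keys, size):
    """Yield successive slices of `keys`, each holding at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def batch_get_all(keys, fetch_batch, chunk_size=100):
    """Fetch every key and return a single {key: item} dict.

    fetch_batch is a HYPOTHETICAL callable (not part of Boto): it takes a
    list of keys and returns (items_dict, unprocessed_keys).
    """
    results = {}
    for chunk in chunked(list(keys), chunk_size):
        pending = chunk
        while pending:  # resubmit until nothing is left unprocessed
            items, pending = fetch_batch(pending)
            results.update(items)
    return results


def fake_fetch(keys):
    """Made-up stand-in for a real call: serves at most 40 keys per request."""
    served, leftover = keys[:40], keys[40:]
    return {key: "value-%d" % key for key in served}, leftover


result = batch_get_all(range(500), fake_fetch)
# "Sorting afterwards" happens client-side, here by key.
ordered = sorted(result.items())
print(len(result), ordered[0], ordered[-1])
```

In real code the resubmit loop should also wait between attempts and cap the number of retries, since a request that keeps returning the same unprocessed keys would otherwise loop forever.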