The Archive Base

Editorial Team
Asked: May 16, 2026

I am using the App Engine Bulk loader (Python Runtime) to bulk upload entities

I am using the App Engine bulk loader (Python runtime) to bulk upload entities to the datastore. The data that I am uploading is stored in a proprietary format, so I have implemented my own connector (registered in bulkload_config.py) to convert it to the intermediate Python dictionary.

from google.appengine.ext.bulkload import connector_interface

class MyCustomConnector(connector_interface.ConnectorInterface):
   ....
   # Overridden method: yields one neutral dictionary per input record
   def generate_import_record(self, filename, bulkload_state=None):
      ....
      yield my_custom_dict

To convert this neutral Python dictionary to a datastore entity, I use a custom post-import function that I have defined in my YAML.

def feature_post_import(input_dict, entity_instance, bulkload_state):
    ....
    return [all_entities_to_put]

Note: I am not using entity_instance or bulkload_state in my feature_post_import function. I just create new datastore entities based on input_dict and return them.

Now, everything works great. However, bulk loading the data seems to take far too long: for example, 1 GB of data (~1,000,000 entities) takes ~20 hours. How can I improve the performance of the bulk load process? Am I missing something?

Among the parameters I pass to appcfg.py are 10 threads and a batch size of 10 entities per thread.

Related Google App Engine Python group post: http://groups.google.com/group/google-appengine-python/browse_thread/thread/4c8def071a86c840

Update:
To test the performance of the bulk load process, I loaded entities of a 'Test' kind. Even though this entity has only a very simple FloatProperty, it still took the same amount of time to bulk load those entities.

I am still going to try varying the bulk loader parameters rps_limit, bandwidth_limit and http_limit to see if I can get any more throughput.


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 10:36 pm

    There is a parameter called rps_limit that caps the number of entities uploaded per second. Its default value is 20, and this was the major bottleneck.
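
A quick back-of-the-envelope check of what that cap implies, as a minimal sketch. It assumes the upload runs steadily at exactly the capped rate and ignores every other cost; the helper names are made up for illustration.

```python
def seconds_per_thousand(records_per_second):
    """Seconds needed to upload 1000 records at a steady rate."""
    return 1000.0 / records_per_second


def hours_for(total_records, records_per_second):
    """Hours needed to upload total_records at a steady rate."""
    return total_records / float(records_per_second) / 3600.0


DEFAULT_RPS_LIMIT = 20  # default value stated in this answer

print(seconds_per_thousand(DEFAULT_RPS_LIMIT))          # 50.0
print(round(hours_for(1000000, DEFAULT_RPS_LIMIT), 1))  # 13.9
```

So at 20 entities per second, the roughly 1,000,000 entities from the question cannot finish in much less than 14 hours, whatever the thread count or batch size.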

    Also increase bandwidth_limit to something reasonable, so that it does not become the next bottleneck once rps_limit is raised.
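
One rough way to judge what is "reasonable" is to compare the two caps in the same unit. The sketch below is only an estimate: the average record size of about 1000 bytes is an assumption derived from the question (about 1 GB for about 1,000,000 entities), the bandwidth figure of 250000 bytes per second is purely illustrative, and HTTP and encoding overhead are ignored. The function names are hypothetical.

```python
def bandwidth_needed(rps_limit, avg_record_bytes):
    """Payload bytes per second needed to sustain rps_limit records per second."""
    return rps_limit * avg_record_bytes


def binding_limit(rps_limit, bandwidth_limit, avg_record_bytes):
    """Return (name, records_per_second) for whichever cap is lower."""
    by_bandwidth = bandwidth_limit / float(avg_record_bytes)
    if by_bandwidth < rps_limit:
        return ("bandwidth_limit", by_bandwidth)
    return ("rps_limit", float(rps_limit))


AVG_RECORD_BYTES = 1000  # assumption: ~1 GB spread over ~1,000,000 entities

# Payload bandwidth needed to sustain 500 records per second.
print(bandwidth_needed(500, AVG_RECORD_BYTES))  # 500000

# With an illustrative bandwidth cap of 250000 bytes/s, bandwidth binds first.
print(binding_limit(500, 250000, AVG_RECORD_BYTES))  # ('bandwidth_limit', 250.0)
```

Under these assumptions, raising rps_limit alone would stall at about 250 records per second until bandwidth_limit is raised as well.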

    I increased rps_limit to 500 and everything improved: I now get 5.5 – 6 seconds per 1000 entities, a major improvement over the previous 50 seconds per 1000 entities.
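
For reference, here is a sketch of how such an invocation might be assembled. It only builds the argument list and does not run anything. The config file name, data file name, URL, and bandwidth value are hypothetical placeholders; only rps_limit=500, the 10 threads, the batch size of 10, and the 'Test' kind come from this thread.

```python
def build_upload_command(config_file, filename, kind, url,
                         rps_limit=500, bandwidth_limit=500000,
                         num_threads=10, batch_size=10):
    """Build an appcfg.py upload_data argument list (not executed here)."""
    return [
        "appcfg.py", "upload_data",
        "--config_file=%s" % config_file,
        "--filename=%s" % filename,
        "--kind=%s" % kind,
        "--url=%s" % url,
        "--num_threads=%d" % num_threads,
        "--batch_size=%d" % batch_size,
        "--rps_limit=%d" % rps_limit,              # raised from the default of 20
        "--bandwidth_limit=%d" % bandwidth_limit,  # illustrative value only
    ]


cmd = build_upload_command(
    config_file="bulkloader.yaml",   # hypothetical file name
    filename="features.dat",         # hypothetical data file
    kind="Test",
    url="http://your-app-id.appspot.com/_ah/remote_api",  # placeholder URL
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call, or joined and pasted into a shell once the placeholders are replaced with real values.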
