The Archive Base

Editorial Team
Asked: May 21, 2026

I am using an API (let’s pretend it’s Facebook) to gather data between two given dates. Because of API restrictions (like most APIs have), I can only grab so many results at a time, and therefore have to page my way through them.

Here is my question, though. Is it better to:

  1. get fewer results back per call, and make more calls to the API, or
  2. get more results back per call, and make fewer calls to the API?

I am running a 4GB instance of a cloud server.

The data I’m looking at is in XML format and contains about 20k entries. Each entry contains probably another 20 tags. Once completely pulled down, the data ends up being about 10MB. My problem is that while my server is hitting the API and gathering this information, the CPU and memory spike to nearly 100%. I’ve tried retrieving 500, 1000, and 5000 at a time. Is this something where I need to gather 20 at a time, or is there something else I should look at?
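For reference, the paging loop described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual client code: `fetch_page`, its `offset`/`limit` parameters, and the assumption that the total entry count is known up front are all hypothetical, and `fake_fetch_page` merely stands in for the real HTTP request.

```python
import math

def calls_needed(total, page_size):
    """Number of API calls required to cover `total` entries."""
    return math.ceil(total / page_size)

def iter_pages(fetch_page, total, page_size):
    """Yield one page at a time so only a single page is held in memory."""
    for offset in range(0, total, page_size):
        yield fetch_page(offset, min(page_size, total - offset))

def fake_fetch_page(offset, limit):
    # Hypothetical stand-in for the real HTTP request.
    return list(range(offset, offset + limit))

for size in (500, 1000, 5000):
    processed = 0
    for page in iter_pages(fake_fetch_page, 20_000, size):
        processed += len(page)  # handle the page, then let it go out of scope
    print(size, calls_needed(20_000, size), processed)
```

Under these assumptions, the page size mostly changes the number of round trips. Peak memory is bounded by one page only if each page is processed and released before the next one is requested.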

I’m not sure what else to provide; if there is something I can provide, just let me know.

Updates based on answers

  • I host with Storm on Demand, which runs perfectly for us and seems to be great hardware: https://www.stormondemand.com/cloud-server/
  • I use Hpricot to parse the XML (which could probably be optimized; I’m no expert here).
  • I do need all of the data. This service doesn’t offer an export, only an API.

EDIT [to help people stumbling on this later]
I switched from Hpricot to Nokogiri, which is MUCH faster.
Also, I was building an XML file in memory, which is apparently extremely intensive and was very time-consuming. Fixing these two things cut this operation down from about 10 minutes to just over 1 minute.
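One way to avoid building the whole XML file in memory is to write each entry to the output as soon as its page arrives. The sketch below shows the idea in Python using the standard library's `XMLGenerator`. It is not the code from this setup: the `entries`/`entry` element names and the shape of `pages` (an iterable of lists of dicts) are assumptions made for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator

def write_entries(pages, out):
    """Stream entries to `out` as XML, page by page; return the number written."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("entries", {})
    count = 0
    for page in pages:
        for entry in page:
            gen.startElement("entry", {})
            for tag, value in entry.items():
                gen.startElement(tag, {})
                gen.characters(str(value))  # text is escaped by the generator
                gen.endElement(tag)
            gen.endElement("entry")
            count += 1
    gen.endElement("entries")
    gen.endDocument()
    return count

# Demo with an in-memory buffer; a real run would pass an open file instead.
demo_out = io.StringIO()
demo_pages = [[{"id": 1, "name": "a & b"}], [{"id": 2, "name": "c"}]]
print(write_entries(demo_pages, demo_out))
```

Because each entry is written immediately, memory use depends on the size of one page, not on the size of the final document.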

1 Answer

  1. Editorial Team
    Added an answer on May 21, 2026 at 5:06 am

    Here’s a list of things to look at:

    • Optimize your code. Try profiling it and see what you can improve. Most likely you can switch to a better-suited parser; compare the DOM and SAX approaches (a SAX parser streams the document instead of holding all of it in memory).
    • Get better hardware or hosting. The 4GB figure only describes memory; most likely you are on a shared host or VM and are CPU-limited.
    • Offload CPU- or memory-heavy operations to a faster service or application. XML processing, data analysis, and file I/O can be done in C/C++.
    • In a proper cloud environment you should be able to spawn more VMs and adjust your jobs and load accordingly. That will cost more, though, and will require some kind of job manager.
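To illustrate the parser point in the first bullet, here is a minimal Python sketch of streaming parsing with the standard library's `iterparse`, which handles one entry at a time instead of loading the whole document as a DOM tree. The `entry` tag name and the assumption that entries are direct children of the root element are hypothetical. Nokogiri offers comparable streaming interfaces (its SAX parser and `Nokogiri::XML::Reader`) for Ruby code.

```python
import io
import xml.etree.ElementTree as ET

def iter_entries(source, tag="entry"):
    """Yield one dict per entry element without keeping the whole tree in memory."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            root.clear()  # drop entries that have already been processed

# Demo on a tiny document; a real run would pass a file or HTTP response body.
sample = b"<entries><entry><id>1</id><name>a</name></entry></entries>"
for row in iter_entries(io.BytesIO(sample)):
    print(row)
```

Profiling before and after such a change (for example with Python's `cProfile`, or a Ruby profiler in the original setup) would show whether parsing is actually the bottleneck.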

© 2021 The Archive Base. All Rights Reserved