I have a postgre server that is located in the network and I am

Question

Asked: June 7, 2026

I have a PostgreSQL server on my network and I am working with its database.
I need to go over a large number of records (1 million+), and each selection takes time.

This is my current method:

DataSet ds = new psqlWork().getDataSet("SELECT * FROM z_sitemap_links"); 
DataTable dt = ds.Tables[0]; 
Parallel.ForEach(dt.AsEnumerable(), dr => 
{ 
    new Sitemap().runSitemap(dr[1].ToString(), counter); 
    counter++; 
});

As the database grows, I expect this method to become less effective. Could you suggest a better way of doing this? Perhaps pulling the data to process in chunks, although I don’t know how to manage that right now.

1 Answer

Editorial Team · Answer 1 · 2026-06-07T19:04:56+00:00

Points for optimization:

Create named types and use ADO.NET to read into them instead of using DataSet and DataTable; that will reduce some of the memory footprint.
Only pull the records you actually need to work with (you don’t often need to bring in over a million records, but we don’t know your business logic).
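
The first point can be sketched roughly as follows. This is only an illustration under assumptions: the SitemapLink type and the id and url column names are made up here, since the question does not show the table’s schema. The method accepts any ADO.NET IDataReader, so it does not depend on a particular PostgreSQL driver.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical named type for one row; the column names are assumptions,
// because the question does not show the schema of z_sitemap_links.
public sealed record SitemapLink(long Id, string Url);

public static class SitemapLinkReader
{
    // Streams rows from any ADO.NET reader into SitemapLink objects one at a time,
    // so the whole result set never has to be held in a DataSet.
    public static IEnumerable<SitemapLink> ReadLinks(IDataReader reader)
    {
        int idOrdinal = reader.GetOrdinal("id");
        int urlOrdinal = reader.GetOrdinal("url");
        while (reader.Read())
        {
            yield return new SitemapLink(
                Convert.ToInt64(reader.GetValue(idOrdinal)),
                reader.GetString(urlOrdinal));
        }
    }
}
```

With the Npgsql driver, the reader would typically come from NpgsqlCommand.ExecuteReader(); whether the psqlWork wrapper from the question can expose a reader is something only you can check.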

Questions to clarify your original post:

Do you have a specific reason to think this won’t scale in the future?
What does the processing do that benefits from Parallel.ForEach? Provided the underlying system has the capacity for it, you will probably be just fine with the approach you have now. You should also profile the actual performance instead of guessing what is going to happen.
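
For the profiling point, a minimal sketch using Stopwatch could look like this. SitemapProfiler and its method names are hypothetical; the work delegate stands in for whatever runSitemap does per record.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class SitemapProfiler
{
    // Runs the work once per item on the calling thread and returns the elapsed wall-clock time.
    public static TimeSpan TimeSequential<T>(IReadOnlyList<T> items, Action<T> work)
    {
        var watch = Stopwatch.StartNew();
        foreach (var item in items)
        {
            work(item);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Runs the same work through Parallel.ForEach and returns the elapsed wall-clock time.
    public static TimeSpan TimeParallel<T>(IReadOnlyList<T> items, Action<T> work)
    {
        var watch = Stopwatch.StartNew();
        Parallel.ForEach<T>(items, work);
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Comparing the two timings on a realistic sample of records shows whether the parallel loop actually helps, or whether the time goes into the query and the network instead.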

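If you do want to pull the data in chunks, LIMIT and OFFSET in the query will do it:

DataSet ds = new psqlWork().getDataSet(@"
  SELECT * FROM z_sitemap_links
  ORDER BY timestamp ASC /* always order when skipping records so you get the same skips */
  LIMIT 100000 /* with variables for these two you can skip a given number of records */
  OFFSET 100000 /* depending on what you're aiming for */
");
DataTable dt = ds.Tables[0];
Parallel.ForEach(dt.AsEnumerable(), dr =>
{
    // counter++ is not atomic, so parallel iterations can lose updates.
    // Interlocked.Increment (System.Threading) hands each row a unique value.
    // Assumes counter is an int that starts at 0.
    int current = Interlocked.Increment(ref counter) - 1;
    new Sitemap().runSitemap(dr[1].ToString(), current);
});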
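
Stepping through the whole table with LIMIT and OFFSET variables could be sketched like this. ChunkedReader and the fetchChunk delegate are hypothetical; the delegate stands in for whatever runs the query for a given offset and limit.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkedReader
{
    // Calls fetchChunk(offset, limit) repeatedly, advancing the offset by the number
    // of rows returned. Iteration ends when a chunk comes back empty, or right after
    // yielding a chunk that is shorter than chunkSize (the last partial chunk).
    public static IEnumerable<IReadOnlyList<T>> ReadInChunks<T>(
        Func<long, int, IReadOnlyList<T>> fetchChunk, int chunkSize)
    {
        if (chunkSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(chunkSize));
        }

        long offset = 0;
        while (true)
        {
            IReadOnlyList<T> chunk = fetchChunk(offset, chunkSize);
            if (chunk.Count == 0)
            {
                yield break;
            }

            yield return chunk;

            if (chunk.Count < chunkSize)
            {
                yield break;
            }

            offset += chunk.Count;
        }
    }
}
```

Each chunk can then be handed to Parallel.ForEach as before. Keep in mind that the server still has to walk past the skipped rows for a large OFFSET, so later chunks take longer to fetch than early ones.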

And if you can use something like row_number() OVER (ORDER BY col1) AS i, you could drop the counter, because the database would supply the number with each row it returns. My Postgres knowledge doesn’t tell me whether that will be 1..100000 every time with the code above, or whether it will be what you want, but the folks over at Database Administrators would know for sure. With that in place, your code would become:

Parallel.ForEach(recordList, record => 
{ 
    new Sitemap().runSitemap(record.FieldYouNeed, record.RowNumberFromDatabase);
});
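
A slightly fuller sketch of that last loop, under assumptions: NumberedLink, its properties, the url column name and the @limit/@offset placeholders are all made up for illustration. As far as I know, PostgreSQL evaluates window functions before LIMIT and OFFSET are applied, so the numbers should continue across chunks rather than restart at 1, but that is worth verifying against your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical row type: Url stands in for FieldYouNeed and RowNumber for
// RowNumberFromDatabase in the snippet above.
public sealed record NumberedLink(string Url, long RowNumber);

public static class NumberedProcessing
{
    // Illustrative query only; the url column and the @limit/@offset
    // placeholders are assumptions, not taken from the question.
    public const string Query = @"
        SELECT url, row_number() OVER (ORDER BY timestamp) AS i
        FROM z_sitemap_links
        ORDER BY timestamp
        LIMIT @limit OFFSET @offset";

    // Every record carries its own number, so the loop body shares no mutable
    // state and needs no counter or locking.
    public static void ProcessAll(IEnumerable<NumberedLink> records, Action<string, long> process)
    {
        Parallel.ForEach(records, record => process(record.Url, record.RowNumber));
    }
}
```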
