Let’s assume there is a SQL Server 2008 table like the one below that holds 10 million rows.
One of the fields is id; since it is an identity column, its values run from 1 to 10 million.
CREATE TABLE dbo.Stats
(
id INT IDENTITY(1,1) PRIMARY KEY,
field1 INT,
field2 INT,
...
)
Is there an efficient way, using a single SELECT statement, to get a subset of this data that satisfies the following requirements:
- contains a limited number of rows in the result set, e.g. 100, 200, etc.
- is evenly (not randomly) distributed over a certain column, e.g. column id
So, in our example, if we return 100 rows, the result set would look like this:
Row 1 - 100 000
Row 2 - 200 000
Row 3 - 300 000
...
Row 100 - 10 000 000
I want to avoid using a cursor or storing this in a separate table.
Not sure how efficient it’s going to be, but the following query will return every 100000th row (relative to the ordering established by id):
Since it does not rely on actual id values, this will work even if you have “holes” in the sequence of id values.
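A query matching that description could look like the sketch below. It assumes ROW_NUMBER() over a common table expression (both available in SQL Server 2008); the CTE name numbered and the alias rn are made up for illustration, and this is not necessarily the exact query originally posted.

```sql
-- Sketch only: number the rows in id order, then keep every 100000th one.
-- "numbered" and "rn" are illustrative names, not part of the original schema.
WITH numbered AS
(
    SELECT id,
           field1,
           field2,
           ROW_NUMBER() OVER (ORDER BY id) AS rn  -- 1, 2, 3, ... with no gaps
    FROM dbo.Stats
)
SELECT id, field1, field2
FROM numbered
WHERE rn % 100000 = 0   -- rows 100000, 200000, ..., i.e. 100 rows out of 10 million
ORDER BY id;
```

Because the filter is applied to the row number rather than to id itself, gaps in id do not change how many rows come back. The trade-off is that every row has to be numbered first, so the whole clustered index is scanned. If the id values are known to be contiguous, filtering directly with WHERE id % 100000 = 0 avoids the numbering step, although it still has to evaluate the modulo for each row.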