NSManagedObject provides a method called initWithEntity:insertIntoManagedObjectContext:. You can use this…

Question

0

Asked: May 12, 20262026-05-12T11:43:09+00:00 2026-05-12T11:43:09+00:00

Let’s assume three models, standard joins: class Mailbox < ActiveRecord::Base has_many :addresses has_many :domains,

0

Let’s assume three models, standard joins:

class Mailbox < ActiveRecord::Base
  has_many :addresses
  has_many :domains, :through => :addresses
end

class Address < ActiveRecord::Base
  belongs_to :mailbox
  belongs_to :domain
end

class Domain < ActiveRecord::Base
  has_many :addresses
  has_many :mailboxes, :through => :addresses
end

Now obviously, if for any given mailbox you want to know in which domains it has addresses, you have two possible ways to go:

m = Mailbox.first
# either: SELECT DISTINCT domains.id, domains.name FROM "domains" INNER JOIN 
#         "addresses" ON "domains".id = "addresses".domain_id WHERE 
#         (("addresses".mailbox_id = 1))
m.domains.all(:select => 'DISTINCT domains.id, domains.name')
# or: SELECT domains.id, domains.name FROM "domains" INNER JOIN "addresses" ON
#     "domains".id = "addresses".domain_id WHERE (("addresses".mailbox_id = 1))
#      GROUP BY domains.id, domains.name
m.domains.all(:select => 'domains.id, domains.name', 
  :group => 'domains.id, domains.name')

The problem for me is that I don’t know which solution is better. When I don’t specify any other conditions, the PostgreSQL query planner favours solution number two (working as expected), but if I add conditions to the queries, it comes down to “Unique” vs. “Group”:

With “DISTINCT”:

 Unique  (cost=16.56..16.57 rows=1 width=150)
   ->  Sort  (cost=16.56..16.56 rows=1 width=150)
         Sort Key: domains.name, domains.id
         ->  Nested Loop  (cost=0.00..16.55 rows=1 width=150)
               ->  Index Scan using index_addresses_on_mailbox_id on addresses  (cost=0.00..8.27 rows=1 width=4)
                     Index Cond: (mailbox_id = 1)
               ->  Index Scan using domains_pkey on domains  (cost=0.00..8.27 rows=1 width=150)
                     Index Cond: (domains.id = addresses.domain_id)
                     Filter: (domains.active AND domains.selfmgmt)
(9 rows)

With “GROUP BY”:

Group  (cost=16.56..16.57 rows=1 width=150)
   ->  Sort  (cost=16.56..16.56 rows=1 width=150)
         Sort Key: domains.name, domains.id
         ->  Nested Loop  (cost=0.00..16.55 rows=1 width=150)
               ->  Index Scan using index_addresses_on_mailbox_id on addresses  (cost=0.00..8.27 rows=1 width=4)
                     Index Cond: (mailbox_id = 1)
               ->  Index Scan using domains_pkey on domains  (cost=0.00..8.27 rows=1 width=150)
                     Index Cond: (domains.id = addresses.domain_id)
                     Filter: (domains.active AND domains.selfmgmt)
(9 rows)

I’m really unsure how to determine the better way to retrieve these data. My instincts tell me to go with “GROUP BY”, but I was unable to find any documentation specific enough to solve this problem.

Should I use “:group” or “:select => ‘DISTINCT'”? Is that choice the same with other modern RDBMS like e.g. Oracle, DB2 or MySQL (I don’t have access to those, so I can’t perform tests)?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-12T11:43:09+00:00

Editorial Team

2026-05-12T11:43:09+00:00Added an answer on May 12, 2026 at 11:43 am

If you’re using Postgresql < 8.4 (which I guess you are, given the plans) – it’s usually better to use GROUP BY instead of DISTINCT as its plan is simply more efficient.

In 8.4 there is no difference as DISTINCT was “taught” to be able to use group operators as well.

0

Reply
Share
Share

- Report

How to approach applying for a job at a company ...

How to handle personal stress caused by utterly incompetent and ...

What is a programmer’s life like?

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions