I have a table of edge like this:
-------------------------------
| id | arg1 | relation | arg2 |
-------------------------------
| 1 | 1 | 3 | 4 |
-------------------------------
| 2 | 2 | 6 | 5 |
-------------------------------
where arg1, relation and arg2 reference to the ids of objects in another object table:
--------------------
| id | object_name |
--------------------
| 1 | book |
--------------------
| 2 | pen |
--------------------
| 3 | on |
--------------------
| 4 | table |
--------------------
| 5 | bag |
--------------------
| 6 | in |
--------------------
What I want to do is that, considering performance issues (a very big table more than 50 million of entries) display the object_name for each edge entry rather than id such as:
---------------------------
| arg1 | relation | arg2 |
---------------------------
| book | on | table |
---------------------------
| pen | in | bag |
---------------------------
What is the best select query to do this? Also, I am open to suggestions for optimizing the query – adding more index on the tables etc…
EDIT: Based on the comments below:
1) @Craig Ringer: PostgreSQL version: 8.4.13 and only index is id for both tables.
2) @andrefsp: edge is almost x2 times bigger than object.
It won’t get cheaper than this:
And you don’t need more indexes. The one on
object.idis the only one needed for this query.But I seriously doubt that you want to retrieve 50 millions of rows at once, and in no particular order. You still didn’t give the full picture.