I am having an issue finding a fast way of joining the tables looking

Question

Asked: May 27, 2026 (2026-05-27T01:14:40+00:00)

I am having trouble finding a fast way to join two tables that look like this:

mysql> explain geo_ip;
+--------------+------------------+------+-----+---------+-------+
| Field        | Type             | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+-------+
| ip_start     | varchar(32)      | NO   |     | ""      |       |
| ip_end       | varchar(32)      | NO   |     | ""      |       |
| ip_num_start | int(64) unsigned | NO   | PRI | 0       |       |
| ip_num_end   | int(64) unsigned | NO   |     | 0       |       |
| country_code | varchar(3)       | NO   |     | ""      |       |
| country_name | varchar(64)      | NO   |     | ""      |       |
| ip_poly      | geometry         | NO   | MUL | NULL    |       |
+--------------+------------------+------+-----+---------+-------+


mysql> explain entity_ip;
+------------+---------------------+------+-----+---------+-------+
| Field      | Type                | Null | Key | Default | Extra |
+------------+---------------------+------+-----+---------+-------+
| entity_id  | int(64) unsigned    | NO   | PRI | NULL    |       |
| ip_1       | tinyint(3) unsigned | NO   |     | NULL    |       |
| ip_2       | tinyint(3) unsigned | NO   |     | NULL    |       |
| ip_3       | tinyint(3) unsigned | NO   |     | NULL    |       |
| ip_4       | tinyint(3) unsigned | NO   |     | NULL    |       |
| ip_num     | int(64) unsigned    | NO   |     | 0       |       |
| ip_poly    | geometry            | NO   | MUL | NULL    |       |
+------------+---------------------+------+-----+---------+-------+

Please note that I am not interested in looking up rows in geo_ip for only ONE IP address at a time. I need an entity_ip LEFT JOIN geo_ip (or a similar approach).

This is what I have for now (using polygons as advised on http://jcole.us/blog/archives/2007/11/24/on-efficiently-geo-referencing-ips-with-maxmind-geoip-and-mysql-gis/):

mysql> EXPLAIN SELECT li.*, gi.country_code FROM entity_ip AS li
-> LEFT JOIN geo_ip AS gi ON
-> MBRCONTAINS(gi.`ip_poly`, li.`ip_poly`);

+----+-------------+-------+------+---------------+------+---------+------+--------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------+
|  1 | SIMPLE      | li    | ALL  | NULL          | NULL | NULL    | NULL |   2470 |       |
|  1 | SIMPLE      | gi    | ALL  | ip_poly_index | NULL | NULL    | NULL | 155183 |       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------+

mysql> SELECT li.*, gi.country_code FROM entity AS li LEFT JOIN geo_ip AS gi ON MBRCONTAINS(gi.`ip_poly`, li.`ip_poly`) limit  0, 20;
20 rows in set (2.22 sec)

Without polygons:

mysql> explain SELECT li.*, gi.country_code FROM entity_ip AS li LEFT JOIN geo_ip AS gi ON li.`ip_num` >= gi.`ip_num_start` AND li.`ip_num` <= gi.`ip_num_end` LIMIT 0,20;
+----+-------------+-------+------+---------------------------+------+---------+------+--------+-------+
| id | select_type | table | type | possible_keys             | key  | key_len | ref  | rows   | Extra |
+----+-------------+-------+------+---------------------------+------+---------+------+--------+-------+
|  1 | SIMPLE      | li    | ALL  | NULL                      | NULL | NULL    | NULL |   2470 |       |
|  1 | SIMPLE      | gi    | ALL  | PRIMARY,geo_ip,geo_ip_end | NULL | NULL    | NULL | 155183 |       |
+----+-------------+-------+------+---------------------------+------+---------+------+--------+-------+

mysql> SELECT li.*, gi.country_code FROM entity_ip AS li LEFT JOIN geo_ip AS gi ON li.ip_num BETWEEN gi.ip_num_start AND gi.ip_num_end limit  0, 20;
20 rows in set (2.00 sec)

(With a higher number of rows in the search, there is no difference.)

I currently cannot get these queries to run any faster, and 0.1 seconds per IP is far too slow for me.

Is there any way to make it faster?

1 Answer

Editorial Team · Answer 1 · 2026-05-27T01:14:41+00:00

This approach has some scalability issues (should you choose to move to, say, city-specific geoip data), but for the given size of data, it will provide considerable optimization.

The problem you are facing is effectively that MySQL does not optimize range-based queries very well. Ideally you want to do an exact (“=”) look-up on an index rather than a “greater than” comparison, so we’ll need to build an index like that from the data you have available. This way MySQL will have far fewer rows to evaluate while looking for a match.

To do this, I suggest that you create a look-up table that indexes the geolocation table by the first octet of the IP addresses (the 1 in 1.2.3.4). The idea is that for each look-up you have to do, you can ignore all geolocation ranges that do not cover the same first octet as the IP you are looking for.

CREATE TABLE `ip_geolocation_lookup` (
  `first_octet` int(10) unsigned NOT NULL DEFAULT '0',
  `ip_numeric_start` int(10) unsigned NOT NULL DEFAULT '0',
  `ip_numeric_end` int(10) unsigned NOT NULL DEFAULT '0',
  KEY `first_octet` (`first_octet`,`ip_numeric_start`,`ip_numeric_end`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Next, we need to take the data available in your geolocation table and produce data that covers all (first) octets the geolocation row covers: If you have an entry with ip_start = '5.3.0.0' and ip_end = '8.16.0.0', the lookup table will need rows for octets 5, 6, 7, and 8. So…

ip_geolocation
|ip_start       |ip_end          |ip_numeric_start|ip_numeric_end|
|72.255.119.248 |74.3.127.255    |1224701944      |1241743359    |

Should convert to:

ip_geolocation_lookup
|first_octet|ip_numeric_start|ip_numeric_end|
|72         |1224701944      |1241743359    |
|73         |1224701944      |1241743359    |
|74         |1224701944      |1241743359    |
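As a sanity check on the conversion above, here is a minimal Python sketch of the same expansion, done outside MySQL. The function names `first_octet` and `expand_range` are made up for illustration and are not part of the solution itself; the sketch assumes 32-bit IPv4 addresses stored as unsigned integers.

```python
import ipaddress


def first_octet(ip_numeric: int) -> int:
    """Return the first octet of a 32-bit numeric IPv4 address."""
    return (ip_numeric & 0xFF000000) >> 24


def expand_range(ip_numeric_start: int, ip_numeric_end: int):
    """Produce one (first_octet, start, end) row per first octet the range covers."""
    return [
        (octet, ip_numeric_start, ip_numeric_end)
        for octet in range(first_octet(ip_numeric_start),
                           first_octet(ip_numeric_end) + 1)
    ]


# The example range from the ip_geolocation row above.
start = int(ipaddress.IPv4Address("72.255.119.248"))
end = int(ipaddress.IPv4Address("74.3.127.255"))

rows = expand_range(start, end)
for row in rows:
    print(row)
```

With the example range this yields three rows, for first octets 72, 73, and 74, each carrying the unchanged numeric start and end.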

Since someone here requested a native MySQL solution, here’s a stored procedure that will generate that data for you:

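DROP PROCEDURE IF EXISTS recalculate_ip_geolocation_lookup;

# The mysql command-line client needs a temporary delimiter so that the
# semicolons inside the procedure body do not end the CREATE statement early.
DELIMITER $$

CREATE PROCEDURE recalculate_ip_geolocation_lookup()
BEGIN
    DECLARE i INT DEFAULT 0;

    # Start from an empty look-up table.
    DELETE FROM ip_geolocation_lookup;

    # For every possible first octet (0-255), copy each range whose
    # start and end octets enclose that value.
    WHILE i < 256 DO
        INSERT INTO ip_geolocation_lookup (first_octet, ip_numeric_start, ip_numeric_end)
            SELECT i, ip_numeric_start, ip_numeric_end
            FROM ip_geolocation
            WHERE ( ip_numeric_start & 0xFF000000 ) >> 24 <= i
              AND ( ip_numeric_end & 0xFF000000 ) >> 24 >= i;

        SET i = i + 1;
    END WHILE;
END$$

DELIMITER ;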

And then you will need to populate the table by calling that stored procedure:

CALL recalculate_ip_geolocation_lookup();

At this point you may drop the procedure you just created. It is no longer needed unless you want to recalculate the look-up table.

After the look-up table is in place, all you have to do is integrate it into your queries and make sure you’re querying by the first octet. Your query to the look-up table will satisfy two conditions:

1. Find all rows which match the first octet of your IP address.
2. Of that subset, find the row whose range contains your IP address.

Because step two is carried out on a subset of the data, it is considerably faster than running the range tests against the entire table. This is the key to this optimization strategy.
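The two-step look-up can be sketched in Python to make the idea concrete. This is only an illustration of the strategy, not the MySQL implementation: `build_lookup` and `find_country` are hypothetical names, and the country codes "XX" and "YY" are placeholders rather than real geolocation data.

```python
from collections import defaultdict


def build_lookup(ranges):
    """Group (start, end, country_code) ranges by every first octet they cover."""
    buckets = defaultdict(list)
    for start, end, code in ranges:
        for octet in range(start >> 24, (end >> 24) + 1):
            buckets[octet].append((start, end, code))
    return buckets


def find_country(buckets, ip_numeric):
    """Step 1: exact match on the first octet. Step 2: range test on that subset."""
    for start, end, code in buckets.get(ip_numeric >> 24, []):
        if start <= ip_numeric <= end:
            return code
    return None  # no covering range, like a NULL from a LEFT JOIN


# Placeholder data: one range from the example above plus 1.0.0.0 to 1.0.0.255.
buckets = build_lookup([
    (1224701944, 1241743359, "XX"),
    (16777216, 16777471, "YY"),
])

print(find_country(buckets, 73 << 24))   # 73.0.0.0 falls inside the "XX" range
print(find_country(buckets, 16777221))   # 1.0.0.5 falls inside the "YY" range
print(find_country(buckets, 2 << 24))    # 2.0.0.0 is not covered by any range
```

Only the ranges filed under the matching octet are ever compared, which is the same reduction the `first_octet` index column provides in MySQL.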

There are various ways of figuring out what the first octet of an IP address is; I used ( r.ip_numeric & 0xFF000000 ) >> 24 since my source IPs are in numeric form:

SELECT 
    r.*, 
    g.country_code
FROM 
    ip_geolocation g,
    ip_geolocation_lookup l,
    ip_random r
WHERE 
    l.first_octet = ( r.ip_numeric & 0xFF000000 ) >> 24 AND 
    l.ip_numeric_start <= r.ip_numeric AND      
    l.ip_numeric_end >= r.ip_numeric AND 
    g.ip_numeric_start = l.ip_numeric_start;
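For the asker's entity_ip table, the first octet is presumably already stored in the ip_1 column, so the bit arithmetic may not be needed there. The Python sketch below checks that equivalence under the assumption that ip_num is composed from ip_1 through ip_4 in the usual big-endian way; `ip_num_from_octets` is a hypothetical helper, not something from either schema.

```python
def ip_num_from_octets(ip_1: int, ip_2: int, ip_3: int, ip_4: int) -> int:
    """Compose a 32-bit numeric IPv4 address (assumed layout of entity_ip.ip_num)."""
    return (ip_1 << 24) | (ip_2 << 16) | (ip_3 << 8) | ip_4


def first_octet(ip_numeric: int) -> int:
    """Same expression as the query: mask the top byte, then shift it down."""
    return (ip_numeric & 0xFF000000) >> 24


# 72.255.119.248, the start address from the example range.
ip_num = ip_num_from_octets(72, 255, 119, 248)

print(ip_num)               # numeric form of the address
print(first_octet(ip_num))  # equals the ip_1 value passed in
```

If that assumption holds for your data, joining on `l.first_octet = li.ip_1` should be equivalent to the masked-and-shifted expression.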

Now, admittedly I did get a little lazy in the end: you could easily get rid of the ip_geolocation table altogether if you made the ip_geolocation_lookup table also contain the country data. I’m guessing dropping one table from this query would make it a bit faster.

And, finally, here are the two other tables I used in this response for reference, since they differ from your tables. I’m certain they are compatible, though.

# This table contains the original geolocation data

CREATE TABLE `ip_geolocation` (
  `ip_start` varchar(16) NOT NULL DEFAULT '',
  `ip_end` varchar(16) NOT NULL DEFAULT '',
  `ip_numeric_start` int(10) unsigned NOT NULL DEFAULT '0',
  `ip_numeric_end` int(10) unsigned NOT NULL DEFAULT '0',
  `country_code` varchar(3) NOT NULL DEFAULT '',
  `country_name` varchar(64) NOT NULL DEFAULT '',
  PRIMARY KEY (`ip_numeric_start`),
  KEY `country_code` (`country_code`),
  KEY `ip_start` (`ip_start`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


# This table simply holds random IP data that can be used for testing

CREATE TABLE `ip_random` (
  `ip` varchar(16) NOT NULL DEFAULT '',
  `ip_numeric` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`ip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
