I’ve got an index on columns a VARCHAR(255), b INT in an InnoDB table. Given two a,b pairs, can I use the MySQL index to determine if the pairs are the same from a c program (i.e. without using a strcmp and numerical comparison)?
- Where is a MySQL InnoDB index stored in the file system?
- Can it be read and used from a separate program? What is the format?
- How can I use an index to determine if two keys are the same?
Note: An answer to this question should either a) provide a method for accessing a MySQL index in order to accomplish this task or b) explain why the MySQL index cannot practically be accessed/used in this way. A platform-specific answer is fine, and I’m on Red Hat 5.8.
Below is the previous version of this question, which provides more context but seems to distract from the actual question. I understand that there are other ways to accomplish this example within MySQL, and I provide two. This is not a question about optimization, but rather of factoring out a piece of complexity that exists across many different dynamically generated queries.
I could accomplish my query using a subselect with a subgrouping, e.g.
SELECT c, AVG(max_val)
FROM (
SELECT c, MAX(val) AS max_val
FROM table
GROUP BY a, b) AS t
GROUP BY c
But I’ve written a UDF that allows me to do it with a single select, e.g.
SELECT b, MY_UDF(a, b, val)
FROM table
GROUP by c
The key here is that I pass the fields a and b to the UDF, and I manually manage a,b subgroups in each group. Column a is a varchar, so this involves a call to strncmp to check for matches, but it’s reasonably fast.
However, I have an index my_key (a ASC, b ASC). Instead of checking for matches on a and b manually, can I just access and use the MySQL index? That is, can I get the index value in my_key for a given row or a,b pair in c (inside the UDF)? And if so, would the index value be guaranteed to be unique for any value a,b?
I would like to call MY_UDF(a, b, val) and then look up the mysql index value (a,b) in c from the UDF.
If you just want to access an index outside of MySQL, you will have to use the API for one of the MySQL storage engines. The default engine is InnoDB. See overview here: InnoDB Internals. This describes (at a very high level) both the data layout on disk and the APIs to access it. A more detailed description is here: Embedded InnoDB.
However, rather than write your own program that uses InnoDB APIs directly (which is a lot of work), you might use one of the projects that have already done that work:
HandlerSocket: gives NoSQL access to InnoDB tables, runs in a UDF. See a very informative blog post from the developer. The goal of HandlerSocket is to provide a NoSQL interface exposed as a network daemon, but you could use the same technique (and much of the same code) to provide something that would be used by a query withing MySQL.
memcached InnoDB plugin. gives memcached style access to InnoDB tables.
HailDB: gives NoSQL access to InnoDB tables, runs on top of Embedded InnoDB. see conference presentation. EDIT: HailDB probably won’t work running side-by-side with MySQL.
I believe any of these can run side-by-side with MySQL (using the same tables live), and can be used from C, so they do meet your requirements.
If you can use/migrate to MySQL Cluster, see also NDB API, a direct API, and ndbmemcache, a way to access MySQL Cluster using memcache API.
This is hard to answer without knowing why you are trying to do this, because the implications of different approaches are very different.