I use SQLite3 and have a table called blobs that stores content and *hash_value*.
Here is the schema:
CREATE TABLE "blobs" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
"content" blob,
"hash_value" text,
"created_at" datetime NOT NULL,
"updated_at" datetime NOT NULL
);
I then inserted some data that looks like this:
1|--- foo
...
|34dc86f45b3dc92b352fd45f525192c0|2012-04-09 17:02:54.219504|2012-04-09 17:02:54.219504
And I tried the following two queries:
select * from blobs where hash_value = '34dc86f45b3dc92b352fd45f525192c0';
select * from blobs where hash_value LIKE '34dc86f45b3dc92b352fd45f525192c0';
The first does not work, but the second one does. I do not understand why the = operator does not work.
I tried to reduce this to a simple example where my hash is just 'abc', and there = works. The 32-character string is hardly too long either.
EDIT
Ok I actually narrowed it down to this:
- I am using Ruby to generate the hash like this:
    Digest::MD5.hexdigest("foobar")
- This generates a string like this:
    '3858f62230ac3c915f300c664312c63f'
- My test looks somewhat like this:
    b = Blob.new(...); b.save!; Blob.find_by_hash(b.hash)
- And find_by_hash is:
    Blob.find(:all, :conditions => ["hash_value = ?", hash_value])
- It works if I set the hash manually to '3858f62230ac3c915f300c664312c63f' (a hardcoded string).
- But if this string is generated, I get the following error:
Failure/Error: Blob.find_by_hash(b.hash_value)[0].load.should == txt
ArgumentError: wrong number of arguments (0 for 1)
And I cannot query SQLite3 as stated above.
Solution
The solution is:
- Instead of using Digest::MD5.hexdigest("foobar"), use Digest::MD5.base64digest("foobar")
I do not know why sqlite3 has problems with hexdigest, but there is definitely something fishy about this.
The difference between the two is encoding:
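A quick way to check this on your own setup is to print the encoding of both digests. This is only a sketch using Ruby's standard digest library. The reported encodings may depend on the Ruby version, so no output is shown here.

```ruby
require 'digest'

# Compute both digest forms of the same input.
hex = Digest::MD5.hexdigest("foobar")
b64 = Digest::MD5.base64digest("foobar")

# Print each value together with the encoding Ruby has tagged it with.
puts "hexdigest:    #{hex} (#{hex.encoding})"
puts "base64digest: #{b64} (#{b64.encoding})"
```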
I don't think there's a particular reason why hexdigest returns a string with the ASCII-8BIT encoding (which effectively means "this is raw data"), but that's what Ruby seems to do. When the Ruby sqlite3 driver sees a string with the ASCII-8BIT encoding, it binds the value to the query as a blob rather than as text. This in turn affects how sqlite3 does the comparison (although I don't understand exactly how).
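If that explanation is right, another option might be to keep hexdigest and retag the string as UTF-8 before it reaches the driver, so that it is bound as text. The sketch below shows only the retagging step. text_digest is a hypothetical helper name, and I have not checked this against every sqlite3 gem version.

```ruby
require 'digest'

# Hypothetical helper (not part of any library): returns the hex digest
# retagged as UTF-8. force_encoding changes only the encoding label,
# not the bytes, and hex digits are valid UTF-8.
def text_digest(data)
  Digest::MD5.hexdigest(data).dup.force_encoding(Encoding::UTF_8)
end

hash_value = text_digest("foobar")
puts hash_value            # 3858f62230ac3c915f300c664312c63f
puts hash_value.encoding   # UTF-8
```

The value could then be stored in hash_value and used in the "hash_value = ?" condition as before.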
See also this question