My goal is to read the data stored in a bio (assuming that it is for a write operation) and compute its MD5 fingerprint. The data could be anything. I have made several attempts; here is the latest:
void fingerprint(struct bio * bio, unsigned char * result)
{
    char * place;
    char * cpy;
    int i;
    int total;
    struct buffer_head * bh;
    struct scatterlist sg;
    struct hash_desc desc = {
        .tfm = crypto_alloc_hash("md5", 0, CRYPTO_ALG_ASYNC),
        .flags = CRYPTO_TFM_REQ_MAY_SLEEP
    };

    total = 0;
    for (i = 0; i < bio->bi_vcnt; i++) {
        cpy = kmalloc(sizeof(char) * bio->bi_io_vec[i].bv_len, GFP_KERNEL);
        bh = (struct buffer_head *)bio->bi_io_vec[i].bv_page->private;
        place = bh->b_data + bio->bi_io_vec[i].bv_offset;
        DPRINTK("%u", bio->bi_io_vec[i].bv_offset);
        DPRINTK("%u", place);
        memcpy(cpy, place, bio->bi_io_vec[i].bv_len);
        DPRINTK("%x", place);
        DPRINTK("%x", cpy);
        sg_init_one(&sg, (u8 *)cpy, bio->bi_io_vec[i].bv_len);
        crypto_hash_init(&desc);
        crypto_hash_update(&desc, &sg, bio->bi_io_vec[i].bv_len);
        total += bio->bi_io_vec[i].bv_len;
    }
    DPRINTK("%u", bio->bi_vcnt);
    DPRINTK("%u", total);
    crypto_hash_final(&desc, result);
    kfree(cpy);
}
The resulting hashes are very similar, if not exactly the same, from one bio to the next, and they are always even numbers. I print the results both as long longs and in hex. Why is this happening?
You need to test your code. Cut down your code to the MD5 minimum, and try a few test vectors.
“” -> d41d8cd98f00b204e9800998ecf8427e
“The quick brown fox jumps over the lazy dog” -> 9e107d9d372bb6826bd81d3542a419d6
“The quick brown fox jumps over the lazy dog.” -> e4d909c290d0fb1ca068ffaddf22cbd0
(Note the extra full stop in the third example.)
Once you are sure MD5 is working correctly, then gradually reintroduce the other parts of your code and recheck regularly.
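One way to run the test-vector check is against a small userspace reference, so that the expected digests can be reproduced outside the kernel and compared byte by byte with what the module prints. The following is a minimal sketch of MD5 as described in RFC 1321, not the kernel crypto API; the names md5_block, md5 and md5_hex are made up for this illustration. It formats the digest one byte at a time, which avoids the endianness surprises that can come from printing the 16 bytes as long longs.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Per-step left-rotation amounts (RFC 1321). */
static const uint8_t S[64] = {
    7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
    5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
    4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
    6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
};

/* Per-step additive constants, floor(2^32 * |sin(i + 1)|) (RFC 1321). */
static const uint32_t K[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
    0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
    0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
    0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
    0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
    0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
    0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
    0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
    0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
};

static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Mix one 64-byte block into the running state st[0..3]. */
static void md5_block(uint32_t st[4], const uint8_t p[64])
{
    uint32_t a = st[0], b = st[1], c = st[2], d = st[3], m[16];
    int i;

    for (i = 0; i < 16; i++)   /* message words are little-endian */
        m[i] = (uint32_t)p[4 * i] | (uint32_t)p[4 * i + 1] << 8 |
               (uint32_t)p[4 * i + 2] << 16 | (uint32_t)p[4 * i + 3] << 24;
    for (i = 0; i < 64; i++) {
        uint32_t f, tmp;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        tmp = d; d = c; c = b;
        b = b + rotl32(a + f + K[i] + m[g], S[i]);
        a = tmp;
    }
    st[0] += a; st[1] += b; st[2] += c; st[3] += d;
}

/* Hash len bytes at data into the 16-byte digest out. */
void md5(const void *data, size_t len, uint8_t out[16])
{
    uint32_t st[4] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };
    const uint8_t *p = data;
    uint8_t tail[128] = { 0 };
    size_t full = len / 64 * 64, rem = len - full, off;
    size_t tail_len = rem < 56 ? 64 : 128;   /* room for 0x80 + length */
    uint64_t bits = (uint64_t)len * 8;
    int i;

    for (off = 0; off < full; off += 64)
        md5_block(st, p + off);
    memcpy(tail, p + full, rem);
    tail[rem] = 0x80;                         /* padding marker */
    for (i = 0; i < 8; i++)                   /* bit length, little-endian */
        tail[tail_len - 8 + i] = (uint8_t)(bits >> (8 * i));
    md5_block(st, tail);
    if (tail_len == 128)
        md5_block(st, tail + 64);
    for (i = 0; i < 16; i++)                  /* state words, little-endian */
        out[i] = (uint8_t)(st[i / 4] >> (8 * (i % 4)));
}

/* Write the digest as 32 lowercase hex characters plus a terminating NUL. */
void md5_hex(const void *data, size_t len, char hex[33])
{
    uint8_t digest[16];
    int i;

    md5(data, len, digest);
    for (i = 0; i < 16; i++)
        sprintf(hex + 2 * i, "%02x", digest[i]);
}
```

Calling md5_hex on each of the three strings above should reproduce the listed digests. Formatting the kernel result the same way (two hex digits per byte of result, in order) makes a mismatch easy to spot.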