I am trying to figure out on which data the crc32 field in the header of a RAR Recovery Record is based. I am trying to recreate a RAR volume based on a previous RAR volume and the extracted contents. I am up to the point where only 12 bytes differ from the correct/original volume.
The field names used below are taken from the unrar source code (arcread.cpp) or the RAR technote.
A RAR file consists of blocks. They have a header and a body:
[header][body]
The header contains metadata that describes the body. One of these blocks is HEAD_TYPE=0x74 File header (File in archive).
[header:a...FILE_CRC...z][body]
The field FILE_CRC (4 bytes) is calculated on all the data available in the [body], which is a stored or compressed file.
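For reference, a minimal sketch of that calculation for an ordinary file block, assuming FILE_CRC is the standard CRC-32 as implemented by zlib. The helper name `file_crc` is made up for illustration.

```python
import zlib

def file_crc(body: bytes) -> int:
    """Standard CRC-32 of a block body, as an unsigned 32-bit value."""
    return zlib.crc32(body) & 0xFFFFFFFF

# "123456789" is the usual check input for CRC-32.
print(hex(file_crc(b"123456789")))  # 0xcbf43926
```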
The block of a Recovery Record (HEAD_TYPE=0x7a subblock) is very similar to a file block, but it contains three extra fields in the header:
[header:a...FILE_CRC...z, "Protect+", rsc, dsc][body]
rsc: recovery sector count (4 bytes)
dsc: data sector count (8 bytes)
assert dsc*2 + rsc*512 == size([body])
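The size check above can be sketched as follows. This assumes the three extra fields sit back to back in the order shown and are little-endian like the other RAR header fields. Locating them inside the real header is not handled, and `RR_EXTRA` and `expected_body_size` are hypothetical names.

```python
import struct

# "Protect+" (8 bytes), rsc (4 bytes), dsc (8 bytes), little-endian, no padding
RR_EXTRA = struct.Struct("<8sIQ")

def expected_body_size(extra: bytes) -> int:
    """Body size implied by the RR fields: 2 bytes per data sector
    plus 512 bytes per recovery sector."""
    magic, rsc, dsc = RR_EXTRA.unpack(extra[:RR_EXTRA.size])
    if magic != b"Protect+":
        raise ValueError("not a Recovery Record subblock")
    return dsc * 2 + rsc * 512

# Made-up values: 3 recovery sectors protecting 10 data sectors.
extra = RR_EXTRA.pack(b"Protect+", 3, 10)
print(expected_body_size(extra))  # 10*2 + 3*512 = 1556
```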
You would think the FILE_CRC of this block is based on the data in the body, just like the file block, but this isn't the case (another person verified this independently).
So my question is, what data is used to calculate this crc32?
Some things I have tried already:
- starting from "Protect+" and the fields after it, followed by the body
- everything before the start of the RR subblock
- I have brute-forced all possible ranges on a small RAR file.
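A brute-force search over ranges might look roughly like this. It is a sketch, not the script I actually ran: `find_crc_ranges` is a hypothetical helper, and it varies only the range while the CRC-32 seed stays at zlib's default.

```python
import zlib

def find_crc_ranges(data: bytes, target: int):
    """Yield every (start, end) where CRC-32 of data[start:end] equals target.

    The running CRC is extended one byte at a time, so each start offset
    costs one pass over the remaining data (quadratic overall).
    """
    for start in range(len(data)):
        crc = 0
        for end in range(start + 1, len(data) + 1):
            crc = zlib.crc32(data[end - 1:end], crc)
            if crc & 0xFFFFFFFF == target:
                yield start, end

# Toy input: the CRC-32 check string hidden between other bytes.
sample = b"header" + b"123456789" + b"trailer"
for start, end in find_crc_ranges(sample, 0xCBF43926):
    print(start, end, sample[start:end])
```

This is only practical on a small file, since the number of ranges grows with the square of the file size.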
The difference turned out to be the seed. Instead of using the default seed (-0x1 or 0xFFFFFFFF), an F was dropped (-0x10000000).
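A minimal sketch of the two seeds using Python's zlib. It rests on assumptions: that "an F was dropped" means the internal CRC register starts at 0x0FFFFFFF instead of 0xFFFFFFFF, and that -0x10000000 (0xF0000000 as an unsigned 32-bit value, which is ~0x0FFFFFFF) is the start value handed to a zlib-style `crc32`. The names `crc32_seeded`, `DEFAULT_SEED` and `DROPPED_F_SEED` are made up for illustration.

```python
import zlib

DEFAULT_SEED = 0xFFFFFFFF    # usual CRC-32 initial register value
DROPPED_F_SEED = 0x0FFFFFFF  # assumption: the seed with one F missing

def crc32_seeded(data: bytes, seed: int) -> int:
    """CRC-32 of data with an explicit initial register value.

    zlib.crc32 takes a previous finished CRC as its start value and XORs
    it with 0xFFFFFFFF internally, so the seed is inverted before the call.
    """
    start = ~seed & 0xFFFFFFFF   # 0x0 for the default, 0xF0000000 for the variant
    return zlib.crc32(data, start) & 0xFFFFFFFF

body = b"123456789"  # stand-in for the Recovery Record body
print(hex(crc32_seeded(body, DEFAULT_SEED)))    # 0xcbf43926, the standard CRC-32
print(hex(crc32_seeded(body, DROPPED_F_SEED)))  # a different value
```

Masking with 0xFFFFFFFF keeps the start value and the result unsigned, whatever sign convention the caller prefers.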
An email to the author was sent with the following response:
As first thought, the FILE_CRC of the block is based on the data in the body. It looks as if there is a typo somewhere in the RAR code.
XADRARParser.m of TheUnarchiver2.7.1_src has the following commented code:
Almost 3 years later I found out that someone else had already found the solution to this problem earlier that year.