I’m currently using the following two methods in my class to get the job done:
function xseek($h,$pos){
    rewind($h);
    if($pos>0)
        fread($h,$pos);
}
function find($str){
    return $this->startingindex($this->name,$str);
}
function startingindex($a,$b){
    $lim = 1 + filesize($a) - strlen($b)/2;
    $h = fopen($a,"rb");
    rewind($h);
    for($i=0;$i<$lim;$i++){
        $this->xseek($h,$i);
        if($b==strtoupper(bin2hex(fread($h,strlen($b)/2)))){
            fclose($h);
            return $i;
        }
    }
    fclose($h);
    return -1;
}
I realize this is quite inefficient, especially for PHP, but I’m not allowed any other language on my hosting plan.
I ran a couple of tests, and when the hex string is towards the beginning of the file, it runs quickly and returns the offset. When the hex string isn’t found, however, the page hangs for a while. This kills me inside because the last time I tested with PHP and had hanging pages, my webhost shut my site down for 24 hours for using too much CPU time.
Is there a better way to accomplish this (finding a hex string’s offset in a file)? Are there certain aspects of this that could be improved to speed up execution?
I would read the entire contents of the file into one hex string and use strrpos, but I was getting errors about maximum memory being exceeded. Would this be a better method if I chopped the file up and searched large pieces with strrpos?
edit:
To specify, I’m dealing with a settings file for a game. The settings and their values are in a block where there is a 32-bit int before the setting, then the setting, a 32-bit int before the value, and then the value. Both ints represent the lengths of the following strings. For example, if the setting was “test” and the value was “0”, it would look like (in hex): 00000004746573740000000130. Now that you mention it, this does seem like a bad way to go about it. What would you recommend?
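To make that layout concrete, here is a minimal sketch that builds the same bytes with pack(). The helper name encodeSetting() is made up for illustration, and the big-endian byte order is an assumption inferred from the 00000004 prefix in the example above.

```php
<?php
// Hypothetical helper (not part of the game or its tools): encode one
// setting/value pair as two length-prefixed strings.
// Assumption: lengths are unsigned 32-bit big-endian integers.
function encodeSetting(string $name, string $value): string
{
    // 'N' = unsigned 32-bit integer, big-endian byte order.
    return pack('N', strlen($name)) . $name
         . pack('N', strlen($value)) . $value;
}

echo strtoupper(bin2hex(encodeSetting('test', '0'))), "\n";
// prints 00000004746573740000000130
```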
edit 2:
I tried a file that was below the maximum memory I’m allowed and tried strrpos, but it was much slower than the way I’ve been trying.
edit 3: in reply to Charles:
What’s unknown is the length of the settings block and where it starts. What I do know is what the first and last settings USUALLY are. I’ve been using these searching methods to find the location of the first and last setting and determine the length of the settings block. I also know where the parent block starts. The settings block is generally no more than 50 bytes into its parent, so I could start the search for the first setting there and limit how far it will search. The problem is that I also need to find the last setting. The length of the settings block is variable and could be any length. I could read the file the way I assume the game does, by reading the size of the setting, reading the setting, reading the size of the value, reading the value, etc. until I reached a byte with value -1, or FF in hex. Would a combination of limiting the search for the first setting and reading the settings properly make this much more efficient?
You have a lot of garbage code. For example, xseek() does nearly nothing useful, because it rewinds and reads from the beginning of the file every time. Furthermore, why do you need to read something if you are not returning it? Maybe you are looking for fseek()?
If you need to find a hex string in a binary file, it may be better to use something like this: http://pastebin.com/fpDBdsvV (tell me if there are any bugs or problems).
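In case the link is unavailable, here is a rough sketch of one way to do such a search. It is not the pastebin code, just an illustration of the chunked idea: convert the hex string to raw bytes once, read the file in blocks, and keep a small tail between blocks so a match that straddles two blocks is not missed. The function name findHexOffset() and the default chunk size are my own assumptions, and it needs PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: return the offset of the first occurrence of a
// hex-encoded byte pattern in a file, or -1 if it is not found.
function findHexOffset(string $path, string $hex, int $chunkSize = 65536): int
{
    // Reject empty, odd-length or non-hex patterns before converting.
    if ($hex === '' || strlen($hex) % 2 !== 0 || !ctype_xdigit($hex)) {
        return -1;
    }
    $needle = hex2bin($hex);        // compare raw bytes, no bin2hex() per position
    $h = fopen($path, 'rb');
    if ($h === false) {
        return -1;
    }
    $keep = strlen($needle) - 1;    // bytes carried over between chunks
    $buffer = '';
    $bufferStart = 0;               // file offset of the first byte in $buffer
    while (!feof($h)) {
        $chunk = fread($h, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $buffer .= $chunk;
        $pos = strpos($buffer, $needle);
        if ($pos !== false) {
            fclose($h);
            return $bufferStart + $pos;
        }
        // Keep only the tail that could begin a match spanning two chunks.
        if ($keep === 0) {
            $bufferStart += strlen($buffer);
            $buffer = '';
        } elseif (strlen($buffer) > $keep) {
            $bufferStart += strlen($buffer) - $keep;
            $buffer = substr($buffer, -$keep);
        }
    }
    fclose($h);
    return -1;
}
```

Each byte of the file is read once, so a pattern that is not present costs one pass over the file instead of a rewind and re-read for every offset. Memory use stays around one chunk plus the length of the pattern.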
But if you are parsing a game’s settings file, I’d advise you to use fseek(), fread() and unpack() to seek to the place where a setting is, read a portion of bytes, and unpack it into PHP’s variable types.
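As a rough sketch of that approach, the code below walks the block the way you describe the game doing it: read a length, read the string, repeat, and stop at an FF byte. Several things here are assumptions rather than facts about your file: the lengths are taken to be unsigned 32-bit big-endian (from your 00000004 example), the function names readLengthPrefixed() and readSettings() are invented, $start is whatever offset you have determined for the first setting, and the caps on string length and pair count are arbitrary safety limits.

```php
<?php
// Hypothetical helper: read one length-prefixed string from an open handle.
// Assumption: the length is an unsigned 32-bit big-endian integer ('N').
function readLengthPrefixed($h, int $maxLen = 1048576): ?string
{
    $raw = fread($h, 4);
    if ($raw === false || strlen($raw) < 4) {
        return null;                    // truncated file
    }
    $len = unpack('N', $raw)[1];
    if ($len === 0) {
        return '';                      // fread() must not be asked for 0 bytes
    }
    if ($len > $maxLen) {
        return null;                    // implausible length, probably not a setting
    }
    $data = fread($h, $len);
    return ($data !== false && strlen($data) === $len) ? $data : null;
}

// Hypothetical helper: read setting/value pairs starting at $start until
// an FF byte, the end of the file, or malformed data is reached.
function readSettings(string $path, int $start, int $maxPairs = 10000): array
{
    $settings = [];
    $h = fopen($path, 'rb');
    if ($h === false) {
        return $settings;
    }
    fseek($h, $start);
    for ($i = 0; $i < $maxPairs; $i++) {
        // Peek one byte: FF is assumed to mark the end of the block.
        $marker = fread($h, 1);
        if ($marker === false || $marker === '' || $marker === "\xFF") {
            break;
        }
        fseek($h, -1, SEEK_CUR);        // un-read the peeked byte
        $name = readLengthPrefixed($h);
        $value = ($name === null) ? null : readLengthPrefixed($h);
        if ($value === null) {
            break;
        }
        $settings[$name] = $value;
    }
    fclose($h);
    return $settings;
}
```

With this you only need to search for the first setting (within the first 50 or so bytes of the parent block). The end of the block falls out of the parsing, and ftell() just before fclose() would give you its offset if you need the block length.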