The Archive Base

Editorial Team
Asked: May 15, 2026 (2026-05-15T06:33:51+00:00)

I have a 1.2GB file that contains a one line string. What I need

I have a 1.2GB file that contains a single one-line string.
I need to search the entire file to find the position of another string (currently I have a list of strings to search for).
What I do now is open the big file and read it in 4KB blocks: after each block I move the pointer X positions back in the file and read the next 4KB.

My problem is that the longer the search string is, the longer the search takes.

Can you give me some ideas to optimize the script to get better search times?

This is my implementation:

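function busca($inici){ // note: $inici is not used yet
    $limit = 4096; // default number of bytes read per block

    $big_one = fopen('big_one.txt','r');
    $options = fopen('options.txt','r');

    while(($line = fgets($options)) !== false){
        $search = trim($line);
        if($search === '') continue; // skip blank lines in the options file

        // Overlap consecutive blocks by strlen($search) - 1 bytes so that a
        // match straddling a block boundary is still found.
        $retro = strlen($search) - 1;
        // Keep the step between blocks positive even for very long search strings.
        $block = max($limit, 2 * strlen($search));

        $punter = 0; // absolute offset of the current block
        fseek($big_one,$punter);
        while(($ara = fread($big_one,$block)) !== false && $ara !== ''){
            $pos = strpos($ara,$search);
            if($pos !== false){
                $ok_pos = $pos + $punter; // absolute position in the file
                echo "$pos - $punter - $search : $ok_pos <br>";
                break;
            }
            $punter += $block - $retro;
            fseek($big_one,$punter);
        }
    }
    fclose($big_one);
    fclose($options);
}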

Thanks in advance!

1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 6:33 am (2026-05-15T06:33:52+00:00)

    Why not use exec + grep -b? The -b option makes grep print the byte offset of each matching line.

    exec('grep -b "new" ext-all-debug.js', $result);
    // looks for lines containing the substring "new" in the ExtJS debug source file
    var_dump($result);
    

    sample result:

    array(1142) {
        [0]=>  string(97) "3398: * insert new elements. Revisiting the example above, we could utilize templating this time:"
        [1]=>  string(54) "3910:var tpl = new Ext.DomHelper.createTemplate(html);"
        ...
    }
    

    Each item consists of the byte offset of the line from the start of the file and the line itself, separated by a colon.
    So after this you have to find the match inside that line and add its position to the line offset. For example:

    [0]=>  string(97) "3398: * insert new elements. Revisiting the example above, we could utilize templating this time:"
    

    This means the occurrence of “new” is at byte offset 3408 (3398 is the offset of the line and 10 is the zero-based position of “new” within the line).
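The offset arithmetic described above can be sketched in PHP. This is only a minimal illustration, not part of the original answer: `absoluteOffsets` is a hypothetical helper name, and it assumes every entry in `$result` has the `offset:line` shape shown in the sample output.

```php
<?php
// Hypothetical helper: turns `grep -b` output entries ("offset:line") into
// the absolute byte offsets of every occurrence of $needle.
function absoluteOffsets(array $grepLines, string $needle): array
{
    $offsets = [];
    if ($needle === '') {
        return $offsets; // nothing sensible to search for
    }
    foreach ($grepLines as $entry) {
        // Split at the first colon only; the line itself may contain colons.
        $parts = explode(':', $entry, 2);
        if (count($parts) !== 2 || !ctype_digit($parts[0])) {
            continue; // not an "offset:line" entry
        }
        $lineOffset = (int) $parts[0];
        $line = $parts[1];
        $from = 0;
        // Collect every occurrence in the line, not just the first one.
        while (($pos = strpos($line, $needle, $from)) !== false) {
            $offsets[] = $lineOffset + $pos;
            $from = $pos + 1;
        }
    }
    return $offsets;
}

// The two sample entries from the answer above.
$result = [
    '3398: * insert new elements. Revisiting the example above, we could utilize templating this time:',
    '3910:var tpl = new Ext.DomHelper.createTemplate(html);',
];
print_r(absoluteOffsets($result, 'new'));
```

For the first sample entry this gives 3408, matching the calculation above. Note that a file consisting of one single line, as in the question, has only one line offset (0), so the in-line position is the whole answer there; GNU grep can also combine -o with -b to print the offset of each match instead of each line, which may be worth trying in that case.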

© 2021 The Archive Base. All Rights Reserved