The Archive Base

Editorial Team
Asked: May 27, 2026

I have a problem. I want to create a search engine which is based


I have a problem. I want to create a search engine based on IR systems. I have some files; I extract the information I need and store it in structures such as HashMaps, TreeMaps, ArrayLists, etc. Then I want to write this information to files, so I open two FileWriters at the same time and keep adding more and more strings to them.

But this procedure takes too long, and I don't know why. When I have put everything into the FileWriter, I close it with close().

Do you think the problem is the reallocation that happens every time I add new strings to my buffers?

Should I follow another strategy: open the buffer, write, close it, and the next time open it again to write at the end of the previous data? Would that take less time?

P.S.: The code works exactly as I want for a small input file. The problem appears when I use many large input files.

public static void writeWordsandDfInFile(Map<String, Word> tmpMap) throws IOException
{
    Set tmpSet = tmpMap.entrySet();//Transform to Set for quick iteration  and printing
    Iterator tmpIt = tmpSet.iterator();
    String le3h=null;
    int bytesPostingFile;
    int bytesVocabularyFile;
    String str_out = null;
    String prev_str_out = null;
    String str_out2 = null;
    String str_tmp;
    String str_tmp2;
    String Tstrt;
    int prevctr=0;
    int flag=0;
    int i=0;
    int j;
    int k;
    int flag2;
    int flag3;
    int docId;
    //////////////////
    int SIZEDocumentsFileBytes;
    int prevInDocumentsFileBytes = 0;
    int newInDocumentsFileBytes = 0;
    int prwth_kataxwrhsh;
    int ctrPostingFileBytes=0;
    int prwthMonofora=0;



    giveWrdTakeBytePos=new HashMap<String,Integer>();//I give it the word and it returns the position in bytes inside VocabularyFile.txt

    // Create file
    FileWriter fstream = new FileWriter(vocabularyFile.getPath());
    BufferedWriter out = new BufferedWriter(fstream);
    out.write("Le3h   Df   PosInPostingFile.txt\n\n");
    str_tmp=("Le3h   Df   PosInPostingFile.txt\n\n");

      // Create file
    FileWriter fstream2 = new FileWriter(postingFile.getPath());
    BufferedWriter out2 = new BufferedWriter(fstream2);
    out2.write("DocId  Tf  LineInFile       PosInDocumentsFile\n\n");
    str_tmp2=("DocId  Tf  LineInFile       PosInDocumentsFile\n\n");



    PostingFileBytes=new ArrayList<Integer>();//keeps the byte count for each record in the PostingFile



    flag=0;
    i=0;
    while(tmpIt.hasNext())
    {

         Map.Entry m = (Map.Entry) tmpIt.next();
         le3h=(String)m.getKey();

         Set s = tmpMap.get(le3h).getDocList().entrySet();
         Iterator it = s.iterator();
         Map.Entry mm =(Map.Entry)it.next();
         docId=(Integer)mm.getKey();


         Set ss=tmpMap.get(le3h).getDocList().keySet();

         Set stf=tmpMap.get(le3h).getTf().keySet();

         Iterator ssIt = ss.iterator();




         flag2=0;
         prwth_kataxwrhsh=0;
         while(ssIt.hasNext())
         {
            docId=(Integer)ssIt.next();

            out2.write(docId+"  "+tmpMap.get(le3h).getTf(docId));//write each docId and its Tf to the posting file
            if(flag2==0)
            {
                str_out2=(docId+"  "+tmpMap.get(le3h).getTf(docId));
                flag2=1;
            }
            else
            {
                str_out2=(docId+"  "+tmpMap.get(le3h).getTf(docId));
            }



            flag3=0;
            Tstrt=null;
            for(k=0;k<tmpMap.get(le3h).ByteList.get(docId).size();k++)
            {
                out2.write("  "+tmpMap.get(le3h).ByteList.get(docId).get(k));

                if(flag3==0)
                {
                    Tstrt=("  "+tmpMap.get(le3h).ByteList.get(docId).get(k));
                    flag3=1;
                }
                else
                {
                    Tstrt=Tstrt+("  "+tmpMap.get(le3h).ByteList.get(docId).get(k));
                }

            }
            str_out2=str_out2+Tstrt;
            out2.write("  ->"+DocumentsFileBytes.get(docId)+"\n");
            str_out2=str_out2+("  ->"+DocumentsFileBytes.get(docId)+"\n");
            bytesPostingFile=str_out2.toString().length();

        ////////////////////////////////////////////////////////////////////////////////////////////////



            //................................................................................................................................
          SIZEDocumentsFileBytes=PostingFileBytes.size();

          if(prwthMonofora==0)
          {
            prevInDocumentsFileBytes=str_tmp2.toString().length();

            prwthMonofora=1;

            PostingFileBytes.add(prevInDocumentsFileBytes);
            ctrPostingFileBytes=0;//i.e. there is an entry at position 0 of the posting file
            newInDocumentsFileBytes=prevInDocumentsFileBytes + bytesPostingFile;
            //System.out.println("EPOMENH: "+newInDocumentsFileBytes);
          }
          else
          {
              if(prwth_kataxwrhsh==0)//for each word, only the first time, even if it has DF>1
              {
                    //System.out.println("Prohg. Timh:"+prevInDocumentsFileBytes);
                    prevInDocumentsFileBytes=newInDocumentsFileBytes;//from before
                    //System.out.println("BAZW: "+prevInDocumentsFileBytes);
                    PostingFileBytes.add(prevInDocumentsFileBytes);
                    ctrPostingFileBytes++;
                    prwth_kataxwrhsh=1;
              }
              else
              {
                prevInDocumentsFileBytes=newInDocumentsFileBytes;
              }
              newInDocumentsFileBytes=prevInDocumentsFileBytes + bytesPostingFile;
              //System.out.println("EPOMENH: "+newInDocumentsFileBytes);
          }


         }


         //------------------------------------------------------------------------------------------------------------------


         int ptr=ctrPostingFileBytes;

         out.write(le3h+"  "+tmpMap.get(le3h).getDf());//write each word and its Df to VocabularyFile.txt

         out.write("  ->"+PostingFileBytes.get(ptr)+"\n");


           if(flag==0)//the first time
            {
               str_out=(le3h+"  "+tmpMap.get(le3h).getDf()+"  ->"+PostingFileBytes.get(ptr)+"\n");
               giveWrdTakeBytePos.put(le3h, str_tmp.toString().length());
               flag=1;
               prev_str_out=str_tmp+str_out;
            }
            else
            {
                giveWrdTakeBytePos.put(le3h, prev_str_out.toString().length());

                str_out=str_out+(le3h+"  "+tmpMap.get(le3h).getDf()+"  ->"+PostingFileBytes.get(ptr)+"\n");
                prev_str_out=prev_str_out+(le3h+"  "+tmpMap.get(le3h).getDf()+"  ->"+PostingFileBytes.get(ptr)+"\n");
            }

      //................................................................................................................................


    }

    //Close the output stream
    out.close();

    //Close the output stream
    out2.close();

}
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 8:49 am

    From what I can see, you never append to a file; you always create it anew. But from what you wrote above (I have not read the whole code yet), you want to append data to the file.

    new FileWriter("path", true);
    

    Does that help you?
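
To make the append flag concrete, here is a minimal sketch, not the asker's code: the class name `AppendDemo`, the helper `appendLine` and the temporary file are made up for illustration. It only shows that passing `true` as the second `FileWriter` argument keeps the existing content.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class AppendDemo {

    /** Appends one line to the file, creating the file if it does not exist. */
    static void appendLine(Path file, String line) throws IOException {
        // The second constructor argument 'true' switches FileWriter to append mode.
        try (BufferedWriter out = new BufferedWriter(new FileWriter(file.toFile(), true))) {
            out.write(line);
            out.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("vocabulary", ".txt"); // hypothetical file
        appendLine(file, "first");
        appendLine(file, "second");
        List<String> lines = Files.readAllLines(file);
        System.out.println(lines); // both lines are still there
        Files.delete(file);
    }
}
```

Note that reopening the file for every single record adds overhead of its own, so keeping one writer open for the whole run is usually the cheaper option.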

    Another suggestion: drop the FileWriter and use this:

    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;

    public static void foo()
    {
        // ...

        // Two fixed 5 MB in-memory buffers stand in for the two files.
        byte[] fiveMBByteAryOne = new byte[5242880];
        ByteArrayStream bStream = new ByteArrayStream(fiveMBByteAryOne);
        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(bStream));
        byte[] fiveMBByteAryTwo = new byte[5242880];
        ByteArrayStream bStream2 = new ByteArrayStream(fiveMBByteAryTwo);
        BufferedWriter out2 = new BufferedWriter(new OutputStreamWriter(bStream2));

        // ...

    }

    // Minimal OutputStream that stores every byte in a preallocated array.
    private static class ByteArrayStream extends OutputStream {
        int index = 0;
        byte[] container;

        public ByteArrayStream(byte[] container) {
            this.container = container;
        }

        // Throws ArrayIndexOutOfBoundsException once the array is full.
        @Override
        public void write(int b) throws IOException {
            container[index++] = (byte)b;
        }

    }
    

    Then run it again and see how long it takes. If it is as slow as before, the file is not your problem.
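
For this experiment, `java.io.ByteArrayOutputStream` from the standard library can serve as the in-memory sink as well; unlike a fixed 5 MB array it grows on demand. The sketch below is only an illustration: `InMemorySinkDemo`, `writeRecords` and the record format are hypothetical stand-ins for the real indexing method.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class InMemorySinkDemo {

    /** Hypothetical stand-in for the real indexing code: writes n short records. */
    static void writeRecords(Writer out, int n) throws IOException {
        for (int i = 0; i < n; i++) {
            out.write(i + "  ->" + (i * 2) + "\n");
        }
    }

    /** Runs writeRecords against memory only and returns the number of bytes produced. */
    static int runInMemory(int n) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the writer flushes its buffer into the sink.
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            writeRecords(out, n);
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        int bytes = runInMemory(100_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes in " + elapsedMs + " ms without touching the disk");
    }
}
```

If the in-memory run is about as slow as the run against real files, the time is being spent in the surrounding code rather than in file output.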


    After having read through the code, I'm fairly sure that you are a student or a beginner in Java programming. That's fine, but you should have stated it in your question. As it is, people will give you advice rather than direct solutions to your problem.

    There are a lot of things you could improve.
    The first, and from my point of view a very important one: your coding style needs improvement. Really! There are conventions for naming variables (starting with a lowercase letter), methods and so on. Use them. You use far more variables than you need, and you define them all at the beginning of the method.
    You also use Sets and Iterators when you don't need them, e.g.:

    Set s = currentWord.getDocList().entrySet();
    Iterator it = s.iterator();
    Map.Entry mm = (Map.Entry) it.next();
    docId = (Integer) mm.getKey();
    

    You never use that value of docId afterwards, but of course this code still takes time to run.
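
A raw Set, an Iterator and a cast can usually be replaced by a typed for-each loop. The following is a minimal sketch under assumptions: the class `TypedIterationDemo` and the `Map<Integer, Integer>` shape (docId to term frequency) are invented for illustration and may not match the real `Word` class.

```java
import java.util.Map;
import java.util.TreeMap;

class TypedIterationDemo {

    /** Formats "docId  tf" lines; the Map<Integer, Integer> shape is an assumption. */
    static String formatPostings(Map<Integer, Integer> tfByDocId) {
        StringBuilder sb = new StringBuilder();
        // Generics plus for-each replace the raw Set, the Iterator and the casts.
        for (Map.Entry<Integer, Integer> entry : tfByDocId.entrySet()) {
            sb.append(entry.getKey()).append("  ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> tfByDocId = new TreeMap<>();
        tfByDocId.put(7, 3);
        tfByDocId.put(2, 1);
        System.out.print(formatPostings(tfByDocId));
    }
}
```

Iterating over the entries also hands you key and value together, so there is no need to look the key up again inside the loop.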

    Rewrite that method, and this time understand what you are doing: do only what you need, when you need it. The way it is now, I would not allow anyone in my company to use it for a customer.

    Second: when you post code on the internet, be sure to post code that compiles directly. I needed 15 minutes to get that code to compile. There are very few people around who have that much patience.

    Third: for situations where you write less than about 2 MB of text, it is usually useful to construct the whole text with a StringBuilder and write it out in one go at the end. That makes debugging easier.
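
As a sketch of the third point: build the text in a StringBuilder, use its current length as the running offset, and write once at the end. Everything here is an assumption for illustration, including the class `BuildThenWriteDemo`, the helper `buildVocabulary`, the simplified "word  df" line format and the temporary file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

class BuildThenWriteDemo {

    /**
     * Builds the whole text in memory and records, for each word, the offset at
     * which its line starts. The "word  df" line format is a simplified assumption.
     */
    static String buildVocabulary(Map<String, Integer> dfByWord, Map<String, Integer> offsets) {
        StringBuilder sb = new StringBuilder("Word  Df\n\n");
        for (Map.Entry<String, Integer> e : dfByWord.entrySet()) {
            offsets.put(e.getKey(), sb.length()); // characters appended so far
            sb.append(e.getKey()).append("  ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> dfByWord = new LinkedHashMap<>();
        dfByWord.put("apple", 2);
        dfByWord.put("pear", 5);
        Map<String, Integer> offsets = new LinkedHashMap<>();
        String text = buildVocabulary(dfByWord, offsets);

        Path file = Files.createTempFile("vocabulary", ".txt"); // hypothetical target
        Files.write(file, text.getBytes(StandardCharsets.UTF_8)); // one write at the end
        System.out.println(offsets);
        Files.delete(file);
    }
}
```

`sb.length()` counts characters, which equals the byte offset only while every character encodes to a single byte; for other text the offsets would have to be computed from the encoded bytes.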

    Fourth: before you post code on the internet, be sure to have thought about the problem yourself and to have tried to solve it. In this case you could use Dates to do so. Just print something like:

    // at the beginning of a loop
    long startedAt = new Date().getTime();
    // somewhere within the loop:
    System.out.println("in situation X " + (new Date().getTime() - startedAt));
    

    That way you can see what step takes how long and can then start to optimize that area.
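
The same measurement can be accumulated per step, so that one summary is printed at the end instead of one line per iteration. This is a hedged sketch: `StepTimer` is a made-up helper, and it uses `System.nanoTime()` in place of `new Date().getTime()` because it is intended for measuring elapsed time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StepTimer {
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();
    private long last = System.nanoTime();

    /** Adds the time elapsed since the previous call to the named step. */
    void mark(String step) {
        long now = System.nanoTime();
        totalNanos.merge(step, now - last, Long::sum);
        last = now;
    }

    /** Total nanoseconds per step, in the order the steps were first seen. */
    Map<String, Long> totals() {
        return totalNanos;
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append(i);          // stand-in for "build the line"
            timer.mark("build");
            sb.setLength(0);       // stand-in for "write the line"
            timer.mark("write");
        }
        timer.totals().forEach((step, nanos) ->
                System.out.println(step + ": " + nanos / 1_000 + " us"));
    }
}
```

Calling `mark` after each phase of the loop body shows which phase dominates, and that is the one worth optimizing first.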

    Fifth: if there is still a problem after the fourth step, be sure to post a short piece of code that clearly demonstrates it. Don't rely on other users to understand your problem; show it to them. Make it easy for them by using self-explanatory variable, method and class names in the language you are asking in. The same goes for your comments.

    Sixth: the reason you should do all this is that it enables you to solve your problems yourself, and to bring to more experienced people only the problems that are worth their time.

    Good luck
