The Archive Base


Editorial Team
Asked: May 26, 2026

Using Java, I have a class which retrieves a webpage as a byte array.

Using Java, I have a class which retrieves a webpage as a byte array. I then need to strip out some content if it exists. (The application monitors web pages for changes, but it needs to remove the session IDs that PHP embeds in the HTML; otherwise a change would be detected on every visit to the page.)

Some of the resulting byte arrays could be tens of thousands of bytes long. They're not stored like this; a 16-byte MD5 of the page is stored. However, it is the original full-size byte array which needs to be processed.

(UPDATE – the code does not work. See comment from A.H. below)
A test showing my code:

public void testSessionIDGetsRemovedFromData() throws IOException
    {

        byte[] forumContent = "<li class=\"icon-logout\"><a href=\"./ucp.php?mode=logout&amp;sid=3a4043284674572e35881e022c68fcd8\" title=\"Logout [ barry ]\" accesskey=\"x\">Logout [ barry ]</a></li>".getBytes();

        byte[] sidPattern = "&amp;sid=".getBytes();
        int sidIndex = ArrayCleaner.getPatternIndex(forumContent, sidPattern);
        assertEquals(54, sidIndex);

        // start of cleaning code
        ArrayList<Byte> forumContentList = new ArrayList<Byte>();
        forumContentList.addAll(forumContent);
        forumContentList.removeAll(Arrays.asList(sidPattern));

        byte[] forumContentCleaned = new byte[forumContentList.size()];
        for (int i = 0; i < forumContentCleaned.length; i++)
        {
            forumContentCleaned[i] = (byte)forumContentList.get(i);
        }
        //end of cleaning code

        sidIndex = ArrayCleaner.getPatternIndex(forumContentCleaned, sidPattern);
        assertEquals(-1, sidIndex);
    }

This all works fine, but I'm worried about the efficiency of the cleaning section. I had hoped to operate solely on arrays, but ArrayList has nice built-in methods to remove a collection from the list, which is just what I need. So I have had to create an ArrayList of Byte, as I can't have an ArrayList of the primitive byte (can anyone tell me why?), and convert the pattern to remove into another list (I suppose this could be an ArrayList all along) to pass to removeAll(). I then need to create another byte[], cast each element of the ArrayList of Byte to a byte, and add it to the byte[].

Is there a more efficient way of doing all this?
Can it be performed using arrays?

UPDATE
This is the same functionality using strings:

public void testSessionIDGetsRemovedFromDataUsingStrings() throws IOException
    {
        String forumContent = "<li class=\"icon-logout\"><a href=\"./ucp.php?mode=logout&amp;sid=3a4043284674572e35881e022c68fcd8\" title=\"Logout [ barry ]\" accesskey=\"x\">Logout [ barry ]</a></li>";
        String sidPattern = "&amp;sid=";

        int sidIndex = forumContent.indexOf(sidPattern);
        assertEquals(54, sidIndex);

        forumContent = forumContent.replaceAll(sidPattern, "");
        sidIndex = forumContent.indexOf(sidPattern);
        assertEquals(-1, sidIndex);
    }

Is this as efficient as the array/ArrayList method?

Thanks,
Barry


1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 12:59 pm

    This all works fine, I’m worried about the efficiency of the cleaning section…

    Really? Did you inspect the resulting “string”? On my machine the data in forumContentCleaned still contains the &amp;sid=... data.

    That’s because

    forumContentList.removeAll(Arrays.asList(sidPattern));
    

    tries to remove a List<byte[]> from a List<Byte>. This does nothing. And even if you replace the argument of removeAll with a real List<Byte> containing the bytes of "&amp;sid=", you will remove ALL occurrences of each a, each m, each p and so forth. The resulting data will look like this:

    <l cl"con-logout">< href"./uc.h?oelogout34043284674572e35881e022c68fc8" ttle....
    

    Well, strictly speaking, the &amp;sid= part is gone, but I’m quite sure this is not what you wanted.
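
    Both effects can be reproduced in isolation. The following is only a sketch: the class name RemoveAllPitfall, its helper methods and the shortened input string are made up for illustration, and ASCII is assumed for the test data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original test.
final class RemoveAllPitfall {

    // Boxes each primitive byte, because collections can hold only objects.
    static List<Byte> box(byte[] bytes) {
        List<Byte> boxed = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            boxed.add(b);
        }
        return boxed;
    }

    // Converts the boxed bytes back into a String (ASCII assumed).
    static String unbox(List<Byte> boxed) {
        byte[] out = new byte[boxed.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = boxed.get(i);
        }
        return new String(out, StandardCharsets.US_ASCII);
    }

    // Mirrors the question: Arrays.asList on a byte[] gives a one-element List<byte[]>.
    static String removeWrapped(String content, String pattern) {
        List<Byte> list = box(content.getBytes(StandardCharsets.US_ASCII));
        List<byte[]> wrapped = Arrays.asList(pattern.getBytes(StandardCharsets.US_ASCII));
        list.removeAll(wrapped); // no Byte equals a byte[], so nothing is removed
        return unbox(list);
    }

    // Uses a real List<Byte>: every byte that occurs in the pattern is removed everywhere.
    static String removeBoxed(String content, String pattern) {
        List<Byte> list = box(content.getBytes(StandardCharsets.US_ASCII));
        list.removeAll(box(pattern.getBytes(StandardCharsets.US_ASCII)));
        return unbox(list);
    }

    public static void main(String[] args) {
        String content = "mode=logout&amp;sid=abc";
        System.out.println(removeWrapped(content, "&amp;sid=")); // mode=logout&amp;sid=abc
        System.out.println(removeBoxed(content, "&amp;sid="));   // oelogoutbc
    }
}
```

    Neither call treats the pattern as a contiguous sequence, which is what the cleaning step actually needs.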

    Therefore take a step back and think: you are doing string manipulation here, so use a StringBuilder, feed it with new String(forumContent), and do your manipulation there.
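
    A minimal sketch of that round trip from byte[] to StringBuilder and back. The class PageCleaner and the method stripToken are hypothetical names, and the charset is passed in because the page encoding is not stated in the question. Note that new String(byte[]) without a charset uses the platform default; ISO-8859-1 maps every byte to exactly one char, so it round-trips arbitrary bytes unchanged, which matters if the result is hashed afterwards.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, shown only to illustrate the byte[] -> StringBuilder -> byte[] round trip.
final class PageCleaner {

    // Removes every occurrence of token from the page and returns the remaining bytes.
    static byte[] stripToken(byte[] page, String token, Charset charset) {
        if (token.isEmpty()) {
            return page.clone(); // nothing to remove; also avoids an endless loop below
        }
        StringBuilder sb = new StringBuilder(new String(page, charset));
        int pos = 0;
        while ((pos = sb.indexOf(token, pos)) >= 0) {
            sb.delete(pos, pos + token.length());
        }
        return sb.toString().getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] page = "a&amp;sid=b&amp;sid=c".getBytes(StandardCharsets.ISO_8859_1);
        byte[] cleaned = stripToken(page, "&amp;sid=", StandardCharsets.ISO_8859_1);
        System.out.println(new String(cleaned, StandardCharsets.ISO_8859_1)); // abc
    }
}
```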

    Edit

    Looking at the given example input string, I guess that the value of sid should also be removed, not only the key. This code should do it efficiently without regular expressions:

    // Removes every "&amp;sid=<value>" up to, but not including, the closing quote.
    static String removeSecrets(String input){
        StringBuilder sb = new StringBuilder(input);
    
        String sidStart = "&amp;sid=";
        String sidEnd = "\"";
    
        int posStart = 0;
        // After each delete, the search resumes at the position of the removed text.
        while ((posStart = sb.indexOf(sidStart, posStart)) >= 0) {
            int posEnd = sb.indexOf(sidEnd, posStart);
            if (posEnd < 0)     // delete as far as possible - YMMV
                posEnd = sb.length();
            sb.delete(posStart, posEnd);
        }
    
        return sb.toString();
    }
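
    To the question of whether this can be done on arrays alone: it can, if the pattern is searched as a contiguous byte sequence. The sketch below is an alternative to the StringBuilder code above, with hypothetical names (ByteArraySidRemover, indexOf, removeSid) and a naive search that makes no claim about relative speed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical byte-only variant: no String or boxed Byte objects are created.
final class ByteArraySidRemover {

    // Returns the index of the first occurrence of pattern at or after from, or -1.
    static int indexOf(byte[] data, byte[] pattern, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Drops each run that begins with start and extends up to, but not including, end.
    // If end never appears, everything from start to the end of the data is dropped.
    static byte[] removeSid(byte[] data, byte[] start, byte end) {
        if (start.length == 0) {
            return data.clone();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        int copyFrom = 0;
        int hit;
        while ((hit = indexOf(data, start, copyFrom)) >= 0) {
            out.write(data, copyFrom, hit - copyFrom); // keep the bytes before the match
            int stop = hit + start.length;
            while (stop < data.length && data[stop] != end) {
                stop++;
            }
            copyFrom = stop; // the terminator itself is kept
        }
        out.write(data, copyFrom, data.length - copyFrom);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] in = "href=\"x?mode=logout&amp;sid=3a40\" title".getBytes(StandardCharsets.US_ASCII);
        byte[] out = removeSid(in, "&amp;sid=".getBytes(StandardCharsets.US_ASCII), (byte) '"');
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // href="x?mode=logout" title
    }
}
```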
    

    Edit 2

    Here is a small benchmark between StringBuilder and String.replaceAll:

    // Assumes removeSecrets(String) from above is pasted into this class as a static method.
    public class ReplaceAllBenchmark {
        public static void main(String[] args) throws Throwable {
            final int N = 1000000;
            String input = "<li class=\"icon-logout\"><a href=\"./ucp.php?mode=logout&amp;sid=3a4043284674572e35881e022c68fcd8\" title=\"Logout [ barry ]\" accesskey=\"x\">Logout [ barry ]</a>&amp;sid=3a4043284674572e35881e022c68fcd8\"</li>";
    
            stringBuilderBench(input, N);
            regularExpressionBench(input, N);
        }
    
        static void stringBuilderBench(String input, final int N) throws Throwable{
            for(int run=0; run<5; ++run){
                long t1 = System.nanoTime();
                for(int i=0; i<N; ++i)
                    removeSecrets(input);
                long t2 = System.nanoTime();
                System.out.println("sb: "+(t2-t1)+"ns, "+(t2-t1)/N+"ns/call");
                Thread.sleep(1000);
            }
        }
    
        static void regularExpressionBench(String input, final int N) throws Throwable{
            for(int run=0; run<5; ++run){
                long t1 = System.nanoTime();
                for(int i=0; i<N; ++i)
                    removeSecrets2(input);
                long t2 = System.nanoTime();
                System.out.println("regexp: "+(t2-t1)+"ns, "+(t2-t1)/N+"ns/call");
                Thread.sleep(1000);
            }
        }
    
        static String removeSecrets2(String input){
            return input.replaceAll("&amp;sid=[^\"]*\"", "\"");
        }
    }
    

    Results:

    java version "1.6.0_20"
    OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.04.2)
    OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
    
    sb: 538735438ns, 538ns/call
    sb: 457107726ns, 457ns/call
    sb: 443282145ns, 443ns/call
    sb: 453978805ns, 453ns/call
    sb: 458895308ns, 458ns/call
    regexp: 2404818405ns, 2404ns/call
    regexp: 2196834572ns, 2196ns/call
    regexp: 2239056178ns, 2239ns/call
    regexp: 2164337638ns, 2164ns/call
    regexp: 2177091893ns, 2177ns/call
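
    One caveat about the regexp figure: String.replaceAll compiles its regular expression on every call, so that cost is part of each measured call. A variant with a precompiled Pattern could look like the sketch below. The names PrecompiledSidRemover and removeSecrets3 are hypothetical, and this variant was not part of the benchmark above, so no timing is claimed for it.

```java
import java.util.regex.Pattern;

// Hypothetical third variant: same regular expression as removeSecrets2, compiled once.
final class PrecompiledSidRemover {

    private static final Pattern SID = Pattern.compile("&amp;sid=[^\"]*\"");

    // Replaces "&amp;sid=<value>" plus the closing quote with the closing quote alone.
    static String removeSecrets3(String input) {
        return SID.matcher(input).replaceAll("\"");
    }

    public static void main(String[] args) {
        System.out.println(removeSecrets3("<a href=\"x&amp;sid=abc\" t>")); // <a href="x" t>
    }
}
```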
    
© 2021 The Archive Base. All Rights Reserved