Suppose I want to bruteforce webpages: for example, http://www.example.com/index.php?id= <1 – 99999> and search

Question

Asked: May 21, 2026 (2026-05-21T10:55:21+00:00)

Suppose I want to brute-force web pages:

for example, http://www.example.com/index.php?id=<1 – 99999>
and search each page to see whether it contains a certain text.
If a page contains the text, I store it in a string.

I have it more or less working in Python, but it is quite slow (around 1-2 seconds per page, so the whole run would take around 24 hours). Is there a better solution? I am thinking of using C/C++, because I have heard Python is not very efficient. On second thought, though, the bottleneck may not be Python itself but the way I access the HTML (I convert the whole page to text and then search it, and the content is quite long).

So how can I improve the speed of brute-forcing?

1 Answer

Editorial Team · Answer 1 · 2026-05-21T10:55:22+00:00

Most likely your problem has nothing to do with your ability to quickly parse HTML and everything to do with the latency of page retrieval and blocking on sequential tasks.

1-2 seconds is a reasonable amount of time to retrieve a page. Searching the text of a page once it has been downloaded should be orders of magnitude faster than that. However, if you process pages one at a time, you spend almost all of your time blocked waiting for a response from the web server when you could be searching pages you already have. You could instead retrieve multiple pages at once via worker processes and wait only for their output.
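To make the cost of sequential retrieval concrete, here is a rough back-of-the-envelope estimate. It is only a sketch: the helper name `estimated_hours` is made up for illustration, 1.5 seconds is just the midpoint of the 1-2 seconds mentioned in the question, and it assumes every request takes the same time and that workers scale linearly, which a real server may not allow.

```python
def estimated_hours(num_pages, seconds_per_page, num_workers=1):
    """Rough wall-clock time in hours.

    Assumes a constant time per request and perfectly linear scaling
    across workers; both are simplifications.
    """
    return num_pages * seconds_per_page / num_workers / 3600.0


# Compare a sequential run with a few hypothetical worker counts.
for workers in (1, 4, 16):
    hours = estimated_hours(99999, 1.5, workers)
    print("%2d worker(s): about %.1f hours" % (workers, hours))
```

Under these assumptions the total time is set by how many requests are in flight at once, not by how fast each page is searched.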

The following code has been modified from Python’s multiprocessing docs to fit your problem a little more closely.

# Python 3
import urllib.request
from multiprocessing import Process, Queue

def worker(input, output):
  # Run queued (function, args) tasks until the 'STOP' sentinel arrives.
  for func, args in iter(input.get, 'STOP'):
    result = func(*args)
    output.put(result)

def find_on_page(num):
  uri = 'http://www.example.com/index.php?id=%d' % num
  try:
    f = urllib.request.urlopen(uri)
    data = f.read()  # bytes, not str
    f.close()
  except OSError:
    # A failed request must still produce a result,
    # otherwise main() would wait on done_queue forever.
    return None
  index = data.find(b'datahere:') # obviously use your own methods
  if index < 0:
    return None
  else:
    return data[index:index+20]

def main():
  NUM_PROCESSES = 4
  # ids 1..99999, matching the range in the question
  tasks = [(find_on_page, (i,)) for i in range(1, 100000)]
  task_queue = Queue()
  done_queue = Queue()
  for task in tasks:
    task_queue.put(task)
  for i in range(NUM_PROCESSES):
    Process(target=worker, args=(task_queue, done_queue)).start()
  # Collect exactly one result per task.
  for i in range(len(tasks)):
    print(done_queue.get())
  # Tell every worker to exit.
  for i in range(NUM_PROCESSES):
    task_queue.put('STOP')

if __name__ == "__main__":
  main()
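Because the time here goes to waiting on the network rather than to CPU work, a thread pool is a lighter-weight way to get the same overlap. The following is a minimal sketch and not part of the original answer. It assumes Python 3; the URL is the placeholder from the question, `b'datahere:'` is the same stand-in marker used above, and the names `extract`, `fetch`, and `scan` as well as the default of 32 workers are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

BASE_URL = 'http://www.example.com/index.php?id=%d'  # placeholder from the question
MARKER = b'datahere:'  # stand-in marker; replace with the text you are looking for


def extract(data, marker=MARKER, length=20):
    """Return up to `length` bytes starting at the marker, or None if absent."""
    index = data.find(marker)
    if index < 0:
        return None
    return data[index:index + length]


def fetch(num, timeout=10):
    """Download one page as bytes; return empty bytes if the request fails."""
    try:
        with urlopen(BASE_URL % num, timeout=timeout) as response:
            return response.read()
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return b''


def scan(ids, fetcher=fetch, max_workers=32):
    """Fetch all ids concurrently and return (id, snippet) pairs for the hits."""
    ids = list(ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as ids.
        snippets = pool.map(lambda n: extract(fetcher(n)), ids)
        return [(n, s) for n, s in zip(ids, snippets) if s is not None]
```

Calling `scan(range(1, 100000))` would cover the id range from the question. The `fetcher` parameter exists only so the search logic can be tried against canned data without touching the network. A sensible `max_workers` depends on what the target server tolerates.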
