I use Python 2.7, and I have a simple multithreaded MD5 dictionary brute-forcer:
# -*- coding: utf-8 -*-
import md5
import Queue
import threading
import traceback

md5_queue = Queue.Queue()


def Worker(queue):
    while True:
        try:
            item = md5_queue.get_nowait()
        except Queue.Empty:
            break
        try:
            work(item)
        except Exception:
            traceback.print_exc()
        queue.task_done()


def work(param):
    with open('pwds', 'r') as f:
        pwds = [x.strip() for x in f.readlines()]
    for pwd in pwds:
        if md5.new(pwd).hexdigest() == param:
            print '%s:%s' % (pwd, md5.new(pwd).hexdigest())


def main():
    global md5_queue
    md5_lst = []
    threads = 5
    with open('md5', "r") as f:
        md5_lst = [x.strip() for x in f.readlines()]
    for m in md5_lst:
        md5_queue.put(m)  # add md5 hash to queue
    for i in xrange(threads):
        t = threading.Thread(target=Worker, args=(md5_queue,))
        t.start()
    md5_queue.join()


if __name__ == '__main__':
    main()
It runs in 5 threads. Each thread takes one hash from the queue and checks it against the list of passwords. Pretty simple: one thread per hash, with the check done in a single 'for' loop.
I want a little more: each hash thread should spread its password check over several threads of its own. So work() should take a hash from the queue and start a number of new threads to check passwords (1 thread per hash, 10 threads under it checking passwords). For example: 20 threads that each take a hash, and 20 threads brute-forcing that hash inside each of them. Something like that.
How can I do this?
P.S. Sorry for my explanation; ask if you did not understand what I want.
P.P.S. It's not about brute-forcing MD5, it's about multithreading.
Thanks.
The default implementation of Python (called CPython) uses a Global Interpreter Lock (GIL) that effectively allows only one thread to execute Python bytecode at a time. For I/O-bound multithreaded applications, this is not usually a problem, but for CPU-bound applications like yours, it means you're not going to see much of a multicore speedup, no matter how many threads you start.
I'd suggest using a different Python implementation that doesn't have a GIL, such as Jython, or rewriting your code in a different language that doesn't have a GIL. Writing it in natively compiled code is a good idea, but most scripting languages that have an MD5 function implement it in native code anyway, so I honestly wouldn't expect much of a difference in hashing speed between a natively compiled language and a scripting language.
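As for the structure you asked about (one thread per hash, each starting its own group of password-checking threads), a minimal sketch might look like the following. The names check_chunk, crack_hash, hash_worker and brute are made up for illustration, and it takes the hashes and passwords as plain lists instead of reading your 'md5' and 'pwds' files. It uses hashlib rather than the old md5 module so that it also runs on Python 3. Keep in mind that under CPython the GIL means this is unlikely to be faster than your single-loop version; it only shows the layout.

```python
from __future__ import print_function
import hashlib
import threading

try:
    import Queue as queue  # Python 2.7
except ImportError:
    import queue  # Python 3


def check_chunk(target, chunk, found, stop):
    """Check one slice of the password list against a single hash."""
    for pwd in chunk:
        if stop.is_set():  # another sub-thread already found the password
            return
        data = pwd if isinstance(pwd, bytes) else pwd.encode('utf-8')
        if hashlib.md5(data).hexdigest() == target:
            found.append(pwd)
            stop.set()
            return


def crack_hash(target, pwds, n_sub):
    """Split the password list over n_sub sub-threads for one hash."""
    found = []
    stop = threading.Event()
    # pwds[i::n_sub] gives every n_sub-th password, starting at index i.
    subs = [threading.Thread(target=check_chunk,
                             args=(target, pwds[i::n_sub], found, stop))
            for i in range(n_sub)]
    for t in subs:
        t.start()
    for t in subs:
        t.join()
    return found[0] if found else None


def hash_worker(hash_queue, pwds, n_sub, results):
    """Outer thread: take hashes from the queue until it is empty."""
    while True:
        try:
            target = hash_queue.get_nowait()
        except queue.Empty:
            break
        try:
            results[target] = crack_hash(target, pwds, n_sub)
        finally:
            hash_queue.task_done()


def brute(hashes, pwds, n_hash_threads=2, n_sub=3):
    """Run n_hash_threads outer threads, each using n_sub sub-threads."""
    hash_queue = queue.Queue()
    for h in hashes:
        hash_queue.put(h)
    results = {}
    workers = [threading.Thread(target=hash_worker,
                                args=(hash_queue, pwds, n_sub, results))
               for _ in range(n_hash_threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return results
```

You would call it with something like brute(md5_lst, pwds, n_hash_threads=20, n_sub=20), after reading the password file once in main() rather than once per hash as work() does now. Each hash maps to the matching password, or to None if nothing in the list matched.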