I’ve 2 folders. The first (called A) contains same images named in the form: subject_incrementalNumber.jpg (where incrementalNumber goes from 0 to X).
Then I process each image contained in folder A and extract some pieces from it, then save each piece in folder B with the name: subject(the same of the original image contained in folder A)_incrementalNumber(the same of folder A)_anotherIncrementalNumber(that distinguish one piece from another).
Finally, I delete the processed image from folder A.
A
subjectA_0.jpg
subjectA_1.jpg
subjectA_2.jpg
...
subjectB_0.jpg
B
subjectA_0_0.jpg
subjectA_0_1.jpg
subjectA_1_0.jpg
subjectA_2_0.jpg
...
Everytime I download a new image of one subject and save it in folder A, I have to calculate a new pathname for this image (I have to found the min incrementalNumber available for the specific subject). The problem is that when I process an image I delete it from folder A and I store only the pieces in folder B, so I have to find the min number available in both folders.
Now I use the following function to create the pathname
output_name = chooseName( subject, folderA, folderB )
# Create incremental file
# If the name already exist, try with incremental number (0, 1, etc.)
def chooseName( owner, dest_files, faces_files ):
# found the min number available in both folders
v1 = seekVersion_downloaded( owner, dest_files )
v2 = seekVersion_faces( owner, faces_files )
# select the max from those 2
version = max( v1, v2 )
# create name
base = dest_files + os.sep + owner + "_"
fname = base + str(version) + ".jpg"
return fname
# Seek the min number available in folderA
def seekVersion_folderA( owner, dest_files ):
def f(x):
if fnmatch.fnmatch(x, owner + '_*.jpg'): return x
res = filter( f, dest_files )
def g(x): return int(x[x.find("_")+1:-len(".jpg")])
numbers = map( g, res )
if len( numbers ) == 0: return 0
else: return int(max(numbers))+1
# Seek the min number available in folderB
def seekVersion_folderB( owner, faces_files ):
def f(x):
if fnmatch.fnmatch(x, owner + '_*_*.jpg'): return x
res = filter( f, faces_files )
def g(x): return int(x[x.find("_")+1:x.rfind("_")])
numbers = map( g, res )
if len( numbers ) == 0: return 0
else: return int(max(numbers))+1
It works, but this process take about 10seconds for each image, and since I have a lot of images this is too inefficient.
There is any workaround to make it faster?
I’ve found another solution: use the hash of the file as unique file name