Using Python, I’m trying to rename a series of .txt files in a directory

Question

Asked: May 28, 2026 (2026-05-28T04:37:50+00:00)

Using Python, I’m trying to rename a series of .txt files in a directory according to a specific phrase in each given text file. Put differently and more specifically, I have a few hundred text files with arbitrary names but within each file is a unique phrase (something like No. 85-2156). I would like to replace the arbitrary file name with that given phrase for every text file. The phrase is not always on the same line (though it doesn’t deviate that much) but it always is in the same format and with the No. prefix.

I’ve looked at the os module and I understand how its functions could be useful, but I don’t understand how to combine those functions with intratext manipulation functions like linecache or general line reading functions.

I’ve thought through many ways of accomplishing this task, but it seems like the easiest and most efficient way would be to create a loop that finds the unique phrase in a file, assigns it to a variable, and uses that variable to rename the file before moving to the next file.

This seems like it should be easy, so much so that I feel silly writing this question. I’ve spent the last few hours reading documentation and searching through StackOverflow, but it doesn’t seem like anyone has quite had this issue before, or at least they haven’t asked about their problem.

Can anyone point me in the right direction?

EDIT 1: When I create the regex pattern using this website, it creates bulky but seemingly workable code:

import re

txt='No. 09-1159'

re1='(No)'  # Word 1
re2='(\\.)' # Any Single Character 1
re3='( )'   # White Space 1
re4='(\\d)' # Any Single Digit 1
re5='(\\d)' # Any Single Digit 2
re6='(-)'   # Any Single Character 2
re7='(\\d)' # Any Single Digit 3
re8='(\\d)' # Any Single Digit 4
re9='(\\d)' # Any Single Digit 5
re10='(\\d)'    # Any Single Digit 6

rg = re.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10,re.IGNORECASE|re.DOTALL)
m = rg.search(txt)
name = m.group(0)
print(name)
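
The generated pattern is equivalent to a much shorter one. As a minimal sketch (assuming Python 3 and the sample string used above; the names `generated` and `compact` are illustrative and not part of the original code), the ten pieces collapse into a single raw-string pattern with counted repetitions:

```python
import re

txt = 'No. 09-1159'

# The generator's pattern: ten one-token groups concatenated together.
generated = re.compile(
    r'(No)(\.)( )(\d)(\d)(-)(\d)(\d)(\d)(\d)',
    re.IGNORECASE | re.DOTALL,
)

# Hand-written equivalent: \d{2} and \d{4} replace the repeated (\d) groups.
compact = re.compile(r'No\. \d{2}-\d{4}', re.IGNORECASE)

generated_match = generated.search(txt).group(0)
compact_match = compact.search(txt).group(0)
print(generated_match, compact_match)
```

Both patterns match the same text here; the shorter form is simply easier to read and adjust.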

When I manipulate that to fit the glob.glob structure, and make it like this:

import glob
import os
import re

re1='(No)'  # Word 1
re2='(\\.)' # Any Single Character 1
re3='( )'   # White Space 1
re4='(\\d)' # Any Single Digit 1
re5='(\\d)' # Any Single Digit 2
re6='(-)'   # Any Single Character 2
re7='(\\d)' # Any Single Digit 3
re8='(\\d)' # Any Single Digit 4
re9='(\\d)' # Any Single Digit 5
re10='(\\d)'    # Any Single Digit 6

rg = re.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10,re.IGNORECASE|re.DOTALL)

for fname in glob.glob(r"\file\structure\here\*.txt"):  # raw string: keeps "\f" from being read as a form feed
    with open(fname) as f:
        contents = f.read()
    tname = rg.search(contents)
    print(tname)

Then this prints out the byte location of the pattern, signifying that the regex pattern is correct. However, when I add the nname = tname.group(0) line after the original tname = rg.search(contents) and change the print line to reflect the change, it gives me the following error: AttributeError: 'NoneType' object has no attribute 'group'. When I tried copying and pasting @joaquin’s code line for line, it came up with the same error. I was going to post this as a comment on the @spatz answer, but I wanted to include so much code that this seemed to be a better way to express the 'new' problem. Thank you all for the help so far.
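
The AttributeError described above is what Python raises when `re.search` finds no match: it returns `None`, and `None` has no `group` method. A minimal sketch of guarding against that, assuming Python 3 (the helper name `find_case_number` and the sample strings are made up for illustration):

```python
import re

PATTERN = re.compile(r'No\. (\d\d-\d\d\d\d)')

def find_case_number(text):
    """Return the number after 'No. ', or None if the text has no match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1)

print(find_case_number('see No. 09-1159 for details'))
print(find_case_number('this file has no such phrase'))
```

A caller can then test the result against `None` and skip files that do not contain the phrase.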

Edit 2: This is for the @joaquin answer below:

import glob
import os
import re

for fname in glob.glob("/directory/structure/here/*.txt"):
    with open(fname) as f:
        contents = f.read()
    tname = re.search(r'No\. (\d\d\-\d\d\d\d)', contents)
    nname = tname.group(1)
    print(nname)

Last Edit: I got it to work using mostly the code as written. What was happening is that there were some files that didn’t have that regex expression so I assumed Python would skip them. Silly me. So I spent three days learning to write two lines of code (I know the lesson is more than that). I also used the error catching method recommended here. I wish I could check all of you as the answer, but I bothered @Joaquin the most so I gave it to him. This was a great learning experience. Thank you all for being so generous with your time. The final code is below.

import os
import re

# Raw string; 2+4 digits to match the "No. 85-2156" format.
pat3 = r"No\. (\d\d-\d\d\d\d)"
ext = '.txt'
mydir = '/directory/files/here'


for arch in os.listdir(mydir):
    archpath = os.path.join(mydir, arch)
    with open(archpath) as f:
        txt = f.read()
    s = re.search(pat3, txt)
    if s is None:
        continue  # no phrase in this file, leave it alone
    name = s.group(1)
    # Include the extension so the existence check tests the real target name.
    newpath = os.path.join(mydir, name + ext)
    if not os.path.exists(newpath):
        os.rename(archpath, newpath)
    else:
        print('{} already exists, passing'.format(newpath))

1 Answer

Editorial Team · Answer 1 · 2026-05-28T04:37:50+00:00

There is no checking or protection for failures (checking whether archpath is a file, whether newpath already exists, whether the search is successful, etc.), but this should work:

import os
import re

# Raw string so the backslashes reach the regex engine unchanged.
pat = r"No\. (\d\d\-\d\d\d\d)"
mydir = 'mydir'
for arch in os.listdir(mydir):
    archpath = os.path.join(mydir, arch)
    with open(archpath) as f:
        txt = f.read()
    s = re.search(pat, txt)
    name = s.group(1)  # the number captured by the parentheses
    newpath = os.path.join(mydir, name)
    os.rename(archpath, newpath)
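
The checks mentioned above could be added along these lines. This is a sketch rather than the answer's original code: it assumes Python 3, and the function name `rename_by_phrase`, its `ext` parameter, and its return value are hypothetical choices made for illustration.

```python
import os
import re

PATTERN = re.compile(r'No\. (\d\d-\d\d\d\d)')

def rename_by_phrase(mydir, ext='.txt'):
    """Rename each ext file in mydir after the number found in its text.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for arch in sorted(os.listdir(mydir)):
        archpath = os.path.join(mydir, arch)
        # Skip subdirectories and files with other extensions.
        if not os.path.isfile(archpath) or not arch.endswith(ext):
            continue
        with open(archpath) as f:
            match = PATTERN.search(f.read())
        if match is None:
            continue  # no phrase found, leave the file alone
        new_name = match.group(1) + ext
        newpath = os.path.join(mydir, new_name)
        if os.path.exists(newpath):
            continue  # never overwrite an existing file
        os.rename(archpath, newpath)
        renamed.append((arch, new_name))
    return renamed
```

Calling `rename_by_phrase('mydir')` would rename the matching files and report which ones changed; files without the phrase, or whose target name is already taken, are left untouched.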

Edit: I tested the regex to show how it works:

>>> import re
>>> pat = r"No\. (\d\d\-\d\d\d\d)"
>>> txt='nothing here or whatever No. 09-1159 you want, does not matter'
>>> s = re.search(pat, txt)
>>> s.group(1)
'09-1159'
>>>

The regex is very simple:

\. -> a dot
\d -> a decimal digit
\- -> a dash

So, it says: search for the string "No. " followed by 2+4 decimal digits separated by a dash.
The parentheses are to create a group that I can recover with s.group(1) and that contains the code number.
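
To see what the parentheses change, compare `group(0)` (the whole match) with `group(1)` (only the parenthesized part). A small sketch, assuming Python 3 and reusing the sample text from the session above (the variable names `whole` and `number` are illustrative):

```python
import re

pat = r"No\. (\d\d\-\d\d\d\d)"
txt = 'nothing here or whatever No. 09-1159 you want, does not matter'

s = re.search(pat, txt)
whole = s.group(0)   # everything the pattern matched, prefix included
number = s.group(1)  # only the text inside the parentheses
print(whole)
print(number)
```

Using `group(1)` keeps the "No. " prefix out of the new file name.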

And that is what you get, before and after:

[image: before and after]

Text of files one.txt, two.txt and three.txt is always the same, only the number changes:

this is the first
file with a number
nothing here or whatever No. 09-1159 you want, does not matter
the number is
