I have a bunch of messages in a Gmail mailbox with slightly mangled headers. I’d like to process and update them automatically using Python and Gmail’s IMAP interface: download each message, modify the headers locally, delete the original on the server, then add the fixed message back. The problem is that while the message does seem to be properly deleted, after adding it back the old, bad headers are still present. Complicating matters, if I manually delete the message in Gmail’s web interface and then add the message with the same command in Python, the new, good headers appear as desired.
>>> import imaplib
>>> import email
>>> mail = imaplib.IMAP4_SSL('imap.gmail.com')
>>> mail.login('user@gmail.com', 'password')
>>> mail.select('label')
('OK', ['2'])
>>> mail.search(None, 'from', 'bad_string')
('OK', ['2'])
>>> ret,data = mail.fetch('2', '(RFC822)')
>>> msg = email.message_from_string(data[0][1])
>>> msg['from']
'"Doe, John" <john.doe@bad_string.com>'
>>> new = msg['from'].replace('bad_string', 'good_string')
>>> msg.replace_header('From', new)
>>> msg['from']
'"Doe, John" <john.doe@good_string.com>'
>>> mail.store('2', '+FLAGS', '\\Deleted')
('OK', ['2 (FLAGS (\\Seen \\Deleted))'])
>>> mail.expunge()
('OK', ['2'])
>>> mail.search(None, 'from', 'bad_string')
('OK', [''])
>>> mail.select('label')
('OK', ['1'])
At this point, it seems like Python sees the email as deleted. Checking in Gmail’s web interface seems to show it as gone, too. There is only one email in the label instead of the two at the beginning, and the search returns empty.
>>> mail.append('label', None, '"20-Jul-2012 22:30:00 -0400"', str(msg))
('OK', ['[APPENDUID 24 13] (Success)'])
>>> mail.search(None, 'from', 'bad_string')
('OK', ['2'])
>>> mail.search(None, 'from', 'good_string')
('OK', [''])
But the message is back with its original bad string. However, if instead of programmatically marking it as deleted and expunging, I delete the message and empty the trash in Gmail’s web interface and then append (still in the same Python session as above, so running this right after the above output)…
>>> mail.append('label', None, '"20-Jul-2012 22:30:00 -0400"', str(msg))
('OK', ['[APPENDUID 24 14] (Success)'])
>>> mail.search(None, 'from', 'bad_string')
('OK', [''])
>>> mail.search(None, 'from', 'good_string')
('OK', ['2'])
The IMAP settings in gmail are as follows:
- When I mark a message in IMAP as deleted: Auto-Expunge off – Wait for the client to update the server.
- When a message is marked as deleted and expunged from the last visible IMAP folder: Immediately delete the message forever
So of course the answer uncloaks shortly after posting the question… The significant detail is the wording “When a message is marked as deleted and expunged from the last visible IMAP folder”. By default, All Mail is selected as a visible IMAP folder, so the message is always still visible in at least one folder and expunging it from a label never counts as removing it from the last one. I believe there are two solutions:
Unselect All Mail as a visible IMAP folder. Then the “When a message is marked…” option behaves as I would have initially expected when deleting and expunging from a label. If “archive the message” is selected, it simply removes the label and the message is still visible in All Mail. If “move the message to Trash” is selected, the message is moved to the trash label. If “immediately delete the message forever” is selected, the message is completely deleted.
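To check from Python whether All Mail is currently exposed over IMAP, one option is to look at the output of `mail.list()`. The sketch below is only an illustration: the helper names `parse_list_line` and `find_special_folder` are made up for this example, and it assumes the server tags its folders with the special-use flags `\All` and `\Trash` and that a hidden folder simply does not show up in the listing.

```python
import re

# A typical LIST response line looks like:
#   (\HasNoChildren \All) "/" "[Gmail]/All Mail"
LIST_LINE = re.compile(r'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.*)')


def parse_list_line(line):
    """Split one LIST response line into (flags, folder_name), or None."""
    if isinstance(line, bytes):
        line = line.decode('ascii', 'replace')
    if not isinstance(line, str):
        return None  # e.g. None for an empty listing, or a literal tuple
    match = LIST_LINE.match(line)
    if match is None:
        return None
    name = match.group('name').strip().strip('"')
    return match.group('flags').split(), name


def find_special_folder(list_lines, flag):
    """Return the name of the first folder carrying `flag`, else None."""
    for line in list_lines:
        parsed = parse_list_line(line)
        if parsed and flag in parsed[0]:
            return parsed[1]
    return None
```

With the session above this would be used as `typ, lines = mail.list()` followed by `find_special_folder(lines, '\\All')`. If that returns a folder name, All Mail is still visible and an expunge from a label should not be expected to delete the message.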
With all the default folders selected as visible IMAP folders and the default IMAP settings, instead of performing any operations on the message in its current label(s), you can copy the message to the “[Gmail]/Trash” label, then select the trash label and delete/expunge everything in there. This permanently deletes the message.
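The second solution might look roughly like this with imaplib. Treat it as a sketch rather than a tested recipe: `purge_via_trash` is a name invented here, `conn` is assumed to be a logged-in `imaplib.IMAP4_SSL` object like `mail` above, and the trash folder name is an assumption that may differ on some accounts (for example if it is localized). Note that it empties the whole trash, not just the one message.

```python
def purge_via_trash(conn, label, msg_num, trash='[Gmail]/Trash'):
    """Copy one message from `label` to the trash folder, then empty the trash.

    Returns the number of messages that were flagged and expunged.
    """
    conn.select(label)
    typ, data = conn.copy(msg_num, trash)
    if typ != 'OK':
        raise RuntimeError('COPY failed: %r' % (data,))

    # Everything in the trash gets deleted, including unrelated messages.
    conn.select(trash)
    typ, data = conn.search(None, 'ALL')
    nums = data[0].split() if data and data[0] else []
    for num in nums:
        conn.store(num, '+FLAGS', '\\Deleted')
    conn.expunge()

    # Go back to the original label so a following append/search works there.
    conn.select(label)
    return len(nums)
```

In the session above, `purge_via_trash(mail, 'label', '2')` would take the place of the `store`/`expunge` pair, followed by the same `mail.append(...)` call as before.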