I have an email client in Django. It currently supports Gmail accounts using imaplib.
My problem is: I want to obtain the attachment names without having to download the full email. Currently, in order to obtain the attachment names, or the email body, I need to download the whole email using the fetch function with the parameter (RFC822).
I know I can obtain specific fields only using HEADER.FIELDS, for the subject, from, cc for example. But is there a way to obtain the attachment names or the email body without downloading the whole email?
What I mean specifically is: let’s say I have a 30 MB email that has one line of text in the body and two 15 MB attachments. I want to obtain the attachment names and that line of text without downloading the full 30 MB message.
Thank you
Assuming you’re asking what I think you’re asking, here’s what to do:
First, fetch the BODYSTRUCTURE. Assuming Gmail’s IMAP server supports this, you’ll get back a nested, parenthesized list describing each MIME part of the message.
Then fetch the (BODY ENVELOPE) if the structure has one. RFC 3501, section 7.4.2, explains how to deal with these.
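A minimal sketch of that first step with imaplib, not tested against Gmail. The helper names open_inbox and fetch_bodystructure, the host name imap.gmail.com, and the credentials are placeholders I've assumed for illustration; only the fetch, login and select calls come from imaplib itself.

```python
import imaplib


def open_inbox(user, password, host="imap.gmail.com"):
    """Log in and select INBOX read-only (host and credentials are assumptions)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select("INBOX", readonly=True)
    return conn


def fetch_bodystructure(conn, msg_num):
    """Return the raw BODYSTRUCTURE response for one message as text.

    `conn` is assumed to be a logged-in imaplib connection with a
    mailbox already selected.
    """
    typ, data = conn.fetch(str(msg_num), "(BODYSTRUCTURE)")
    if typ != "OK":
        raise RuntimeError(f"FETCH BODYSTRUCTURE failed: {typ} {data!r}")
    # imaplib hands back a list of bytes items, or (header, literal)
    # tuples when the server sends part of the response as a literal.
    chunks = []
    for item in data:
        if isinstance(item, tuple):
            chunks.extend(item)
        elif item:
            chunks.append(item)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling fetch_bodystructure(open_inbox(user, password), 1) would then return the structure for message 1 without transferring the body or the attachments.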
Once you’ve determined that (BODY[1]) and (BODY[2]) are the plain-text and HTML versions of the main content, and (BODY[3]) is the first real attachment, you download the plain-text body by fetching (BODY[1]), and you’ve got the name of the attachment from the structure.
Sorry there’s no code here for the hard part, interpreting the structure. I don’t think either imaplib or any of the stdlib MIME- and mail-related modules will do that for you, but I haven’t actually checked, so I’d look there first, and, if not, go to PyPI to see if anyone else has already written the code.
Well, actually, first I’d just fetch BODYSTRUCTURE, (BODY ENVELOPE) and (BODY[3]) for a specific message to make sure Gmail has complete support before writing a whole mess of code…
PS, if worst comes to worst, if your use case is as simple and rigid as you described, you can just always fetch BODYSTRUCTURE and (BODY[1]), fall back to RFC822 if that fails, and get the attachment names by running a hacky regexp on the structure instead of a real parse. I wouldn’t write this for anything but a one-shot script or a quick-and-dirty prototype to learn about Gmail, but for those cases, I probably would.
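The quick-and-dirty route from the PS could be sketched roughly as below. Everything named here (SAMPLE, attachment_names, fetch_text_and_names) is hypothetical, and the sample structure is made up for illustration, not a real Gmail response. The regexp assumes names arrive as plain quoted strings, so names sent as literals or in RFC 2047 / RFC 2231 encoded form are not handled.

```python
import re

# Hypothetical BODYSTRUCTURE text: one text part plus two attachments.
SAMPLE = (
    '1 (BODYSTRUCTURE ('
    '("TEXT" "PLAIN" ("CHARSET" "UTF-8") NIL NIL "7BIT" 42 1 NIL NIL NIL)'
    '("APPLICATION" "PDF" ("NAME" "report.pdf") NIL NIL "BASE64" 20971520 NIL '
    '("ATTACHMENT" ("FILENAME" "report.pdf")) NIL)'
    '("IMAGE" "PNG" ("NAME" "chart.png") NIL NIL "BASE64" 20971520 NIL '
    '("ATTACHMENT" ("FILENAME" "chart.png")) NIL)'
    ' "MIXED" ("BOUNDARY" "xyz") NIL NIL))'
)

# Matches "NAME" "x" and "FILENAME" "x" pairs; a hack, not a real parse.
_NAME_RE = re.compile(r'"(?:FILE)?NAME"\s+"([^"]*)"', re.IGNORECASE)


def attachment_names(structure):
    """Pull attachment names out of a raw BODYSTRUCTURE string."""
    # NAME and FILENAME usually repeat the same value; dict.fromkeys
    # drops the duplicates while keeping first-seen order.
    return list(dict.fromkeys(_NAME_RE.findall(structure)))


def _flatten(data):
    """Join imaplib's mix of bytes and (header, literal) tuples into bytes."""
    parts = []
    for item in data:
        parts.extend(item if isinstance(item, tuple) else [item or b""])
    return b"".join(parts)


def fetch_text_and_names(conn, msg_num):
    """Return (payload, names, used_fallback) for one message.

    Assumes part 1 is the plain-text body, as in the rigid case above.
    The payload is returned as sent, so it may still be transfer-encoded.
    """
    num = str(msg_num)
    typ, data = conn.fetch(num, "(BODYSTRUCTURE)")
    names = []
    if typ == "OK":
        names = attachment_names(_flatten(data).decode("utf-8", "replace"))

    typ, data = conn.fetch(num, "(BODY[1])")
    if typ == "OK" and data and isinstance(data[0], tuple):
        return data[0][1], names, False

    # Fallback: download the whole message; payload is then the full
    # raw email rather than just the text part.
    typ, data = conn.fetch(num, "(RFC822)")
    if typ != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"FETCH failed for message {num}: {typ}")
    return data[0][1], names, True
```

On the made-up sample, attachment_names(SAMPLE) gives ['report.pdf', 'chart.png']. Here conn is assumed to be a logged-in imaplib connection with a mailbox selected.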