There is this particular kind of Amazon message that seems to throw Indy’s MessagePart parser.
The message is structured (strongly abridged version, of course) as such:
Content-Type: multipart/mixed;
boundary="----=_Part_853547_18414509.1354745829993"
<some irrelevant header stuff>
------=_Part_853547_18414509.1354745829993
Content-Type: multipart/alternative;
boundary="----=_Part_853548_20128671.1354745829993"
------=_Part_853548_20128671.1354745829993
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<the message in plain text>
------=_Part_853548_20128671.1354745829993
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<the message in HTML>
------=_Part_853548_20128671.1354745829993--
------=_Part_853547_18414509.1354745829993--
Now, when I perform a
imap.UIDRetrieve(UID,Msg)
then
Msg.ContentType = "multipart/mixed"
and the individual Msg.MessageParts have this as content-type:
Msg.MessageParts[0].ContentType = "multipart/alternative; boundary="----=_Part_853548_20128671.1354745829993""
Msg.MessageParts[1].ContentType = "text/plain"
No trace of the text/html part.
Would anyone have any idea what is going on here?
(I am using the very latest Indy build.)
When I run that email data as-is directly through TIdMessage using the current Indy 10 SVN snapshot, it parses just fine. Three MessageParts entries are generated (multipart/alternative, text/plain, and text/html), as expected.

In your recent post to the Embarcadero forum on this same subject, you left out a key piece of information: you are using TIdIMAP4 to retrieve the emails that are failing for you. That is important because the portions of the email that you consider irrelevant must contain data that hits a piece of Indy code with a known design limitation, one that affects TIdIMAP4 and that will not be fixed in Indy 10 (but is already marked as required for Indy 11).

Internally, TIdIMAP4.UIDRetrieve() passes the raw downloaded email as-is to TIdMessage.LoadFromStream(). The core parser inside of TIdMessage expects the input data to be escaped using SMTP-style dot transparency, which IMAP does not actually use. TIdMessage currently has no way of knowing the source of the input email data, and consequently how its escaped data is formatted. It is the responsibility of higher-level protocol transports to parse and decode escaped data as needed and then pass the unescaped data to TIdMessage for further parsing. But that is not actually the case right now; that separation of logic is not in place in Indy 10 as much as it should be. TIdMessage uses direct access to the source data, which works fine in TIdSMTP and TIdPOP3 but not in TIdIMAP4.

In the IdMessageClient.pas unit, there is a helper class named TIdIOHandlerStreamMsg that has a public EscapeLines property that was specifically designed to help address this issue in situations that need it (namely, passing un-escaped data to the TIdMessage.LoadFrom...() methods). Instead of calling TIdMessage.LoadFromStream() directly, the current workaround is to instantiate a TIdMessageClient, assign a TIdIOHandlerStreamMsg to its IOHandler property, set the handler's EscapeLines property to True, and pass it the source TStream to parse and the destination TIdMessage to output to. This allows Indy to fake the escaping that the core parser is expecting to unescape.

However, TIdIMAP4 does not utilize this workaround yet. Now that I think about this, it should be easy to add, so I will look into it. In the meantime, you can use TIdIMAP4.RetrieveNoDecodeToStream() or TIdIMAP4.UIDRetrieveNoDecodeToStream(), which will indirectly escape the data when writing to the destination TStream (another known design limitation that needs fixing), and then you can pass that TStream to TIdMessage.LoadFromStream() to parse normally.

Update: I have just checked in an update to the AppendMsg() and InternalRetrieve() methods of TIdIMAP4 so that they no longer rely on SMTP-style dot transparency. This has been on Indy's TODO list for several years now, so it is nice to finally address it.
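
For anyone stuck on an Indy 10 build that predates that check-in, the two workarounds described above might look roughly like the sketch below. The unit name and the two helper routines (LoadUnescapedMessage and RetrieveViaRawStream) are hypothetical names made up for illustration and are not part of Indy. The sketch also assumes the TIdIOHandlerStreamMsg constructor, its FreeStreams property, and TIdMessageClient.ProcessMessage() behave as they do in Indy 10 snapshots of that period, so check them against your own copy of IdMessageClient.pas before relying on it.

```delphi
unit ImapLoadWorkaround; // hypothetical unit name, for illustration only

interface

uses
  Classes, IdMessage, IdMessageClient, IdIMAP4;

// Workaround 1: parse a stream that holds un-escaped email data.
procedure LoadUnescapedMessage(ASource: TStream; AMsg: TIdMessage);

// Workaround 2: download the raw email to a stream, then parse that stream.
function RetrieveViaRawStream(AIMAP: TIdIMAP4; const AUID: string;
  AMsg: TIdMessage): Boolean;

implementation

procedure LoadUnescapedMessage(ASource: TStream; AMsg: TIdMessage);
var
  Client: TIdMessageClient;
  IO: TIdIOHandlerStreamMsg;
begin
  Client := TIdMessageClient.Create(nil);
  try
    // The handler is owned by Client, so freeing Client frees it too.
    IO := TIdIOHandlerStreamMsg.Create(Client, ASource);
    IO.FreeStreams := False; // the caller keeps ownership of ASource
    IO.EscapeLines := True;  // fake the dot transparency the parser expects
    Client.IOHandler := IO;
    IO.Open;
    Client.ProcessMessage(AMsg, False); // False = parse the body, not just headers
  finally
    Client.Free;
  end;
end;

function RetrieveViaRawStream(AIMAP: TIdIMAP4; const AUID: string;
  AMsg: TIdMessage): Boolean;
var
  Raw: TMemoryStream;
begin
  Raw := TMemoryStream.Create;
  try
    // On the affected builds, this call escapes the data as it writes it
    // to the stream, which is what LoadFromStream() expects to undo.
    Result := AIMAP.UIDRetrieveNoDecodeToStream(AUID, Raw);
    if Result then
    begin
      Raw.Position := 0; // rewind before parsing
      AMsg.LoadFromStream(Raw);
    end;
  finally
    Raw.Free;
  end;
end;

end.
```

With a connected TIdIMAP4 and a selected mailbox, RetrieveViaRawStream(imap, UID, Msg) would stand in for imap.UIDRetrieve(UID, Msg). Once you are on a build that includes the fix, plain UIDRetrieve() should be enough and neither helper should be needed.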