The email library preserves the folding whitespace in a header value when the header line is folded.
from email import message_from_string
from email.policy import default

message = message_from_string("Message-id:\r\n\t<abcdef@example.com>\r\n\r\n")
assert message['message-id'] == '<abcdef@example.com>', \
    f"Expected '<abcdef@example.com>' but got {message['message-id']!r}"
The failure message says
Traceback (most recent call last):
  File "/Users/myself/work/email-cpython-fork/repronnnn.py", line 5, in <module>
    assert message['message-id'] == '<abcdef@example.com>', \
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Expected '<abcdef@example.com>' but got '\r\n\t<abcdef@example.com>'
To quote RFC 5322 section 3.2.2, 'any CRLF that appears in FWS is semantically "invisible"' (where "FWS" means "folding whitespace"). Later on, the section concludes, 'Runs of FWS, comment, or CFWS that occur between lexical tokens in a structured header field are semantically interpreted as a single space character' (where "CFWS" means runs of parenthesized comments and/or FWS).
(The Message-Id header is a structured field. Elsewhere in the RFC, the Subject: header, which is unstructured, is used in an example to illustrate this mechanism, though I'd still have to verify whether there is a gap in the RFC when it comes to specifying the semantics of non-CRLF FWS in unstructured header values.)
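As a rough illustration of the unfolding rule quoted above, here is a minimal sketch that removes the CRLF of each fold from a raw header value. It is not how the email library is implemented and not a proposed patch. The names `unfold` and `unfold_structured` are hypothetical helpers introduced only for this example, and stripping the remaining leading whitespace is an assumption that suits a structured field such as Message-Id.

```python
import re

# A CRLF immediately followed by a space or tab marks a fold
# (RFC 5322 section 2.2.3). The lookahead keeps the whitespace itself.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(value: str) -> str:
    """Remove the CRLF of every fold, keeping the whitespace after it."""
    return _FOLD.sub("", value)


def unfold_structured(value: str) -> str:
    """Hypothetical helper: unfold, then drop leading and trailing
    spaces/tabs, which carry no meaning around a msg-id token."""
    return unfold(value).strip(" \t")


raw = "\r\n\t<abcdef@example.com>"   # the value reported in the traceback
print(repr(unfold(raw)))             # '\t<abcdef@example.com>'
print(repr(unfold_structured(raw)))  # '<abcdef@example.com>'
```

Applied to the value from the failing assertion, the second helper yields the string the reproducer expects.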
CPython versions tested on:
3.9, 3.11, CPython main branch
Operating systems tested on:
macOS
Hi, I believe this issue also causes problems for software that validates DKIM (RFC 6376) signatures with the Python email library. The "simple" header canonicalization algorithm specified in RFC 6376 is sensitive to whitespace. Thus, if the Python email library is used to extract headers from an email in order to verify the DKIM signature, the verification will fail because of the whitespace added by the Python library.
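To make the whitespace sensitivity concrete, here is a small sketch of the two header canonicalizations from RFC 6376 section 3.4, written from the RFC text rather than taken from any DKIM library. The function names are hypothetical, and `rebuilt` is an assumed example of a header that was re-serialized with different whitespace. It does not reproduce the exact bytes the email library emits.

```python
import re


def canon_simple(raw_header: bytes) -> bytes:
    """'simple' header canonicalization: the field is left unchanged."""
    return raw_header


def canon_relaxed(raw_header: bytes) -> bytes:
    """'relaxed' header canonicalization (sketch of RFC 6376 3.4.2)."""
    # Work on the field without its terminating CRLF, re-added at the end.
    if raw_header.endswith(b"\r\n"):
        raw_header = raw_header[:-2]
    name, _, value = raw_header.partition(b":")
    name = name.strip(b" \t").lower()              # lowercase the field name
    value = re.sub(rb"\r\n(?=[ \t])", b"", value)  # unfold continuation lines
    value = re.sub(rb"[ \t]+", b" ", value)        # collapse runs of WSP
    value = value.strip(b" \t")                    # drop WSP around the value
    return name + b":" + value + b"\r\n"


original = b"Message-id:\r\n\t<abcdef@example.com>\r\n"  # as sent and signed
rebuilt = b"Message-id: <abcdef@example.com>\r\n"        # assumed re-serialization

print(canon_simple(original) == canon_simple(rebuilt))    # False
print(canon_relaxed(original) == canon_relaxed(rebuilt))  # True
```

Under "simple", any whitespace difference between the signed bytes and the bytes handed to the verifier changes the hash input, whereas "relaxed" normalizes both forms to the same bytes.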