Class Charset

Lib/email/charset.py:162–398

Map character sets to their email properties. This class provides information about the requirements imposed on email for a specific character set. It also provides convenience routines for converting between character sets, given the availability of the applicable codecs. Given a character set, it will do its best to provide information on how to use that character set in an email in an RFC-compliant way.
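As a rough illustration of the properties summarized above, the sketch below builds a few `Charset` instances and reads back their attributes. The helper `describe` and the `ENCODING_NAMES` table are hypothetical names introduced only for this example; `QP`, `BASE64`, and `SHORTEST` are the module-level constants the docstring refers to as `charset.QP`, `charset.BASE64`, and `charset.SHORTEST`.

```python
from email.charset import BASE64, QP, SHORTEST, Charset

# Hypothetical lookup table: readable labels for the module's encoding constants.
ENCODING_NAMES = {
    None: "none",
    QP: "quoted-printable",
    BASE64: "base64",
    SHORTEST: "shortest of QP/base64",
}


def describe(name):
    """Hypothetical helper: collect the properties Charset reports for `name`."""
    cs = Charset(name)
    return {
        "input_charset": cs.input_charset,
        "header_encoding": ENCODING_NAMES[cs.header_encoding],
        "body_encoding": ENCODING_NAMES[cs.body_encoding],
        "output_charset": cs.output_charset,
    }


# Print the reported properties for an alias, a Unicode charset,
# a charset that is converted on output, and the default.
for name in ("latin_1", "utf-8", "euc-jp", "us-ascii"):
    print(name, describe(name))
```

The header and body encodings are reported separately because, as the docstring notes, the two may differ for the same character set.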

Source from the content-addressed store, hash-verified

class Charset:
    """Map character sets to their email properties.

    This class provides information about the requirements imposed on email
    for a specific character set. It also provides convenience routines for
    converting between character sets, given the availability of the
    applicable codecs. Given a character set, it will do its best to provide
    information on how to use that character set in an email in an
    RFC-compliant way.

    Certain character sets must be encoded with quoted-printable or base64
    when used in email headers or bodies. Certain character sets must be
    converted outright, and are not allowed in email. Instances of this
    module expose the following information about a character set:

    input_charset: The initial character set specified. Common aliases
                   are converted to their 'official' email names (e.g. latin_1
                   is converted to iso-8859-1). Defaults to 7-bit us-ascii.

    header_encoding: If the character set must be encoded before it can be
                     used in an email header, this attribute will be set to
                     charset.QP (for quoted-printable), charset.BASE64 (for
                     base64 encoding), or charset.SHORTEST for the shortest of
                     QP or BASE64 encoding. Otherwise, it will be None.

    body_encoding: Same as header_encoding, but describes the encoding for the
                   mail message's body, which indeed may be different than the
                   header encoding. charset.SHORTEST is not allowed for
                   body_encoding.

    output_charset: Some character sets must be converted before they can be
                    used in email headers or bodies. If the input_charset is
                    one of them, this attribute will contain the name of the
                    charset output will be converted to. Otherwise, it will
                    be None.

    input_codec: The name of the Python codec used to convert the
                 input_charset to Unicode. If no conversion codec is
                 necessary, this attribute will be None.

    output_codec: The name of the Python codec used to convert Unicode
                  to the output_charset. If no conversion codec is necessary,
                  this attribute will have the same value as the input_codec.
    """
    def __init__(self, input_charset=DEFAULT_CHARSET):
        # RFC 2046, $4.1.2 says charsets are not case sensitive. We coerce to
        # unicode because its .lower() is locale insensitive. If the argument
        # is already a unicode, we leave it at that, but ensure that the
        # charset is ASCII, as the standard (RFC XXX) requires.
        try:
            if isinstance(input_charset, str):
                input_charset.encode('ascii')
            else:
                input_charset = str(input_charset, 'ascii')
        except UnicodeError:
            raise errors.CharsetError(input_charset)
        input_charset = input_charset.lower()
        # Set the input charset after filtering through the aliases
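The excerpt stops before the alias lookup, but the visible part of `__init__` already fixes three behaviors: a `str` name must be pure ASCII, a non-`str` name is decoded as ASCII, and the result is lowercased. The sketch below exercises those paths from the outside. The helper `normalized_name` is a hypothetical name introduced for this example, and the alias result for `latin_1` relies on the mapping stated in the docstring.

```python
from email import errors
from email.charset import Charset


def normalized_name(raw):
    """Hypothetical helper: return the canonical charset name, or None if rejected."""
    try:
        return Charset(raw).input_charset
    except errors.CharsetError:
        # Raised when the name cannot be represented as ASCII.
        return None


print(normalized_name("UTF-8"))       # str input, lowercased
print(normalized_name(b"latin_1"))    # bytes input, decoded as ASCII, then aliased
print(normalized_name("caf\u00e9"))   # non-ASCII str is rejected
print(normalized_name(b"\xff"))       # non-ASCII bytes are rejected
```

Catching `errors.CharsetError` rather than `UnicodeError` matches what the constructor raises: the `UnicodeError` from the ASCII check is translated before it reaches the caller.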

Callers 15

formataddr · Function · 0.90

test_getset_charset · Method · 0.90

test_set_payload_with_charset · Method · 0.90

test_set_payload_with_8bit_data_and_charset · Method · 0.90

test_set_payload_with_non_ascii_and_charset_body_encoding_none · Method · 0.90

test_set_payload_with_8bit_data_and_charset_body_encoding_none · Method · 0.90

test_long_nonstring · Method · 0.90

test_charset · Method · 0.90

test_accepts_any_charset_like_object · Method · 0.90

test_charset_richcomparisons · Method · 0.90

test_get_body_encoding_with_bogus_charset · Method · 0.90

test_get_body_encoding_with_uppercase_charset · Method · 0.90

Calls

no outgoing calls

Tested by 15

test_getset_charset · Method · 0.72

test_set_payload_with_charset · Method · 0.72

test_set_payload_with_8bit_data_and_charset · Method · 0.72

test_set_payload_with_non_ascii_and_charset_body_encoding_none · Method · 0.72

test_set_payload_with_8bit_data_and_charset_body_encoding_none · Method · 0.72

test_long_nonstring · Method · 0.72

test_charset · Method · 0.72

test_accepts_any_charset_like_object · Method · 0.72

test_charset_richcomparisons · Method · 0.72

test_get_body_encoding_with_bogus_charset · Method · 0.72

test_get_body_encoding_with_uppercase_charset · Method · 0.72

test_charsets_case_insensitive · Method · 0.72
