hub / github.com/python/cpython / tokenize

Function tokenize

Lib/tokenize.py:472–498

The tokenize() generator requires one argument, readline, which must be a callable object that provides the same interface as the readline() method of built-in file objects. Each call to the function should return one line of input as bytes. Alternatively, readline can be a callable that terminates by raising StopIteration.

tokenize(readline)
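As a minimal sketch of the readline contract described above, the snippet below feeds tokenize() the readline method of an in-memory io.BytesIO object, which returns one line of bytes per call. The sample source string and the variable names are illustrative assumptions, not taken from Lib/tokenize.py.

```python
import io
import tokenize

# Illustrative input; any bytes containing Python source would do.
source = b"x = 1\n"

# BytesIO.readline returns one line of bytes per call and b"" at end of input,
# which matches the interface tokenize() expects.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

for tok in tokens:
    # Each item is a 5-tuple: type, string, start (row, col), end (row, col), line.
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)
```

The input has to be bytes here. For source that is already decoded to str, generate_tokens() in the same module takes a readline that returns strings instead.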

Source from the content-addressed store, hash-verified

        raise

def tokenize(readline):
    """
    The tokenize() generator requires one argument, readline, which
    must be a callable object which provides the same interface as the
    readline() method of built-in file objects. Each call to the function
    should return one line of input as bytes. Alternatively, readline
    can be a callable function terminating with StopIteration:
        readline = open(myfile, 'rb').__next__ # Example of alternate readline

    The generator produces 5-tuples with these members: the token type; the
    token string; a 2-tuple (srow, scol) of ints specifying the row and
    column where the token begins in the source; a 2-tuple (erow, ecol) of
    ints specifying the row and column where the token ends in the source;
    and the line on which the token was found. The line passed is the
    physical line.

    The first token sequence will always be an ENCODING token
    which tells you which encoding was used to decode the bytes stream.
    """
    encoding, consumed = detect_encoding(readline)
    rl_gen = _itertools.chain(consumed, iter(readline, b""))
    if encoding is not None:
        if encoding == "utf-8-sig":
            # BOM will already have been stripped.
            encoding = "utf-8"
        yield TokenInfo(ENCODING, encoding, (0, 0), (0, 0), '')
    yield from _generate_tokens_from_c_tokenizer(rl_gen.__next__, encoding, extra_tokens=True)

def generate_tokens(readline):
    """Tokenize a source reading Python code as unicode strings.
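The docstring mentions two behaviours that are easy to check from the caller's side: a readline that ends by raising StopIteration, and the leading ENCODING token, which the body reports as "utf-8" even when a BOM caused "utf-8-sig" to be detected. The sketch below exercises both through the public tokenize API. The input line and variable names are hypothetical, chosen only for illustration.

```python
import codecs
import tokenize

# One source line prefixed with a UTF-8 byte order mark (illustrative input).
lines = [codecs.BOM_UTF8 + b"answer = 42\n"]

# A list iterator's __next__ raises StopIteration when exhausted, which is
# the alternate readline form the docstring describes.
readline = iter(lines).__next__

bom_tokens = list(tokenize.tokenize(readline))
first = bom_tokens[0]

# The first token is the synthetic ENCODING token at position (0, 0).
print(tokenize.tok_name[first.type], first.string, first.start, first.end)
```

Because the BOM is stripped before tokenizing, the first real token should start at column 0 of row 1 rather than being offset by the three BOM bytes.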

Callers 1

_main · Function · 0.70

Calls 4

TokenInfo · Class · 0.85

_generate_tokens_from_c_tokenizer · Function · 0.85

chain · Method · 0.80

detect_encoding · Function · 0.70
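Two of the calls listed here work together: detect_encoding() has to read lines in order to find a BOM or coding cookie, and it returns those lines so they are not lost, after which chain() puts them back in front of the remaining input. The following is a rough sketch of that pattern using only the public tokenize.detect_encoding and itertools.chain. The sample bytes and variable names are assumptions made for illustration.

```python
import io
import itertools
import tokenize

# Illustrative two-line source with a PEP 263 coding cookie on the first line.
buf = io.BytesIO(b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n")

# detect_encoding returns the encoding name plus the raw lines it consumed.
encoding, consumed = tokenize.detect_encoding(buf.readline)

# iter(callable, sentinel) keeps calling readline until it returns b"".
remaining = iter(buf.readline, b"")

# Chaining restores the full line sequence, consumed lines first.
all_lines = list(itertools.chain(consumed, remaining))

print(encoding, len(consumed), len(all_lines))
```

Without the chain step, a tokenizer reading from buf.readline alone would start at the second line, since the first was already read during detection.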

Tested by

no test coverage detected
