hub / github.com/pandas-dev/pandas / TextParser

Function TextParser

pandas/io/parsers/readers.py:2026–2076

Converts lists of lists/tuples into DataFrames with proper type inference and optional (e.g. string to datetime) conversion. Also enables iterating lazily over chunks of large files.

(*args, **kwds)
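
As a minimal usage sketch: a list of lists can be handed to TextParser and read into a DataFrame, with string cells inferred as numbers. The sample rows and column names are made up for illustration, and the import path pandas.io.parsers is assumed to expose TextParser as in current pandas releases.

```python
from pandas.io.parsers import TextParser

# Hypothetical sample data: the first row holds the column labels,
# every cell is a plain string.
rows = [
    ["name", "qty", "price"],
    ["apple", "3", "1.5"],
    ["pear", "10", "0.25"],
]

# header=0 tells the parser to take column labels from the first row.
# The returned TextFileReader is used as a context manager here.
with TextParser(rows, header=0) as parser:
    df = parser.read()

# Numeric-looking strings are converted during type inference.
print(df.dtypes)
print(df)
```

The whole input is materialised at once by read(); for large inputs the chunked form is usually the better fit.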

Source from the content-addressed store, hash-verified

def TextParser(*args, **kwds) -> TextFileReader:
    """
    Converts lists of lists/tuples into DataFrames with proper type inference
    and optional (e.g. string to datetime) conversion. Also enables iterating
    lazily over chunks of large files

    Parameters
    ----------
    data : file-like object or list
    delimiter : separator character to use
    dialect : str or csv.Dialect instance, optional
        Ignored if delimiter is longer than 1 character
    names : sequence, default
    header : int, default 0
        Row to use to parse column labels. Defaults to the first row. Prior
        rows will be discarded
    index_col : int or list, optional
        Column or columns to use as the (possibly hierarchical) index
    has_index_names: bool, default False
        True if the cols defined in index_col have an index name and are
        not in the header.
    na_values : scalar, str, list-like, or dict, optional
        Additional strings to recognize as NA/NaN.
    keep_default_na : bool, default True
    thousands : str, optional
        Thousands separator
    comment : str, optional
        Comment out remainder of line
    parse_dates : bool, default False
    date_format : str or dict of column -> format, default ``None``

        .. versionadded:: 2.0.0
    skiprows : list of integers
        Row numbers to skip
    skipfooter : int
        Number of line at bottom of file to skip
    converters : dict, optional
        Dict of functions for converting values in certain columns. Keys can
        either be integers or column labels, values are functions that take one
        input argument, the cell (not column) content, and return the
        transformed content.
    encoding : str, optional
        Encoding to use for UTF when reading/writing (ex. 'utf-8')
    float_precision : str, optional
        Specifies which converter the C engine should use for floating-point
        values. The options are `None` or `high` for the ordinary converter,
        `legacy` for the original lower precision pandas converter, and
        `round_trip` for the round-trip converter.
    """
    kwds["engine"] = "python"
    return TextFileReader(*args, **kwds)
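
The function body is a thin wrapper: it overwrites the engine keyword with "python" and forwards everything else to TextFileReader. The sketch below illustrates that behaviour under two assumptions: that TextFileReader is importable from pandas.io.parsers, and that the reader keeps the selected engine in an `engine` attribute, as it does in recent pandas releases. The sample rows are hypothetical.

```python
from pandas.io.parsers import TextFileReader, TextParser

# Hypothetical sample data.
rows = [["a", "b"], ["1", "2"], ["3", "4"]]

# Even when a caller asks for the C engine, TextParser replaces the
# keyword before constructing the reader.
with TextParser(rows, engine="c") as parser:
    is_reader = isinstance(parser, TextFileReader)
    engine_used = parser.engine  # assumed attribute name
    df = parser.read()

print(is_reader, engine_used)
print(df)
```

Forcing the Python engine is what lets the function accept an in-memory list of rows rather than only a path or file-like object.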

Callers 6

_data_to_frame · Function · 0.90

_parse_sheet · Method · 0.90

test_read_data_list · Function · 0.90

test_reader_list · Function · 0.90

test_reader_list_skiprows · Function · 0.90
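
Several of the callers above are tests that exercise list input, chunking and row skipping. The following sketch is written in that spirit but is not taken from those tests: the data is invented, and it assumes that `get_chunk()` with no argument falls back to the `chunksize` passed at construction.

```python
from pandas.io.parsers import TextParser

# Hypothetical sample data: a header row followed by five data rows.
rows = [
    ["key", "x", "y"],
    ["r0", "1", "10"],
    ["r1", "2", "20"],
    ["r2", "3", "30"],
    ["r3", "4", "40"],
    ["r4", "5", "50"],
]

# skiprows=[1] drops the row at position 1 (the "r0" row); the header
# stays at position 0. index_col=0 makes the "key" column the index.
with TextParser(rows, header=0, index_col=0, skiprows=[1], chunksize=2) as parser:
    first = parser.get_chunk()   # next two remaining rows
    second = parser.get_chunk()  # the two rows after that

print(first)
print(second)
```

Pulling chunks explicitly keeps memory use bounded by the chunk size; iterating over the reader with a for loop is the more common form when the number of chunks is not known in advance.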

Calls 1

TextFileReader · Class · 0.85

Tested by 3

test_read_data_list · Function · 0.72

test_reader_list · Function · 0.72

test_reader_list_skiprows · Function · 0.72