
Function TextParser

pandas/io/parsers/readers.py:2026–2076

Converts lists of lists/tuples into DataFrames with proper type inference and optional (e.g. string to datetime) conversion. Also enables iterating lazily over chunks of large files.

(*args, **kwds)

Source

def TextParser(*args, **kwds) -> TextFileReader:
    """
    Converts lists of lists/tuples into DataFrames with proper type inference
    and optional (e.g. string to datetime) conversion. Also enables iterating
    lazily over chunks of large files

    Parameters
    ----------
    data : file-like object or list
    delimiter : separator character to use
    dialect : str or csv.Dialect instance, optional
        Ignored if delimiter is longer than 1 character
    names : sequence, default
    header : int, default 0
        Row to use to parse column labels. Defaults to the first row. Prior
        rows will be discarded
    index_col : int or list, optional
        Column or columns to use as the (possibly hierarchical) index
    has_index_names: bool, default False
        True if the cols defined in index_col have an index name and are
        not in the header.
    na_values : scalar, str, list-like, or dict, optional
        Additional strings to recognize as NA/NaN.
    keep_default_na : bool, default True
    thousands : str, optional
        Thousands separator
    comment : str, optional
        Comment out remainder of line
    parse_dates : bool, default False
    date_format : str or dict of column -> format, default ``None``

        .. versionadded:: 2.0.0
    skiprows : list of integers
        Row numbers to skip
    skipfooter : int
        Number of line at bottom of file to skip
    converters : dict, optional
        Dict of functions for converting values in certain columns. Keys can
        either be integers or column labels, values are functions that take one
        input argument, the cell (not column) content, and return the
        transformed content.
    encoding : str, optional
        Encoding to use for UTF when reading/writing (ex. 'utf-8')
    float_precision : str, optional
        Specifies which converter the C engine should use for floating-point
        values. The options are `None` or `high` for the ordinary converter,
        `legacy` for the original lower precision pandas converter, and
        `round_trip` for the round-trip converter.
    """
    kwds["engine"] = "python"
    return TextFileReader(*args, **kwds)
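
Usage sketch

The docstring describes two behaviours: building a DataFrame from a list of lists with type inference, and iterating lazily in chunks. The sketch below tries both on made-up sample rows (the `rows` data and column names are illustrative, not taken from pandas). It assumes `TextParser` is importable from `pandas.io.parsers` and that the returned `TextFileReader` works as a context manager, as the listed tests suggest. Because the function sets `kwds["engine"] = "python"`, any `engine` keyword a caller passes is overridden.

```python
from pandas.io.parsers import TextParser

# Hypothetical sample data: the first row serves as the header (header row 0),
# and every cell is a string, as if it came from csv.reader.
rows = [
    ["name", "count", "price"],
    ["apple", "3", "1.5"],
    ["pear", "10", "0.25"],
    ["plum", "7", "2.0"],
]

# Read everything at once; numeric-looking strings are inferred as numbers.
with TextParser(rows) as reader:
    frame = reader.read()

print(frame.dtypes)

# Read lazily: with chunksize=2 the reader yields DataFrames of at most 2 rows.
with TextParser(rows, chunksize=2) as reader:
    chunks = list(reader)

print([len(chunk) for chunk in chunks])
```

Keyword arguments such as `index_col`, `na_values`, or `converters` from the docstring above can be passed the same way, since they are forwarded unchanged to `TextFileReader`.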

Callers 6

_data_to_frame · Function · 0.90
_data_to_frame · Function · 0.90
_parse_sheet · Method · 0.90
test_read_data_list · Function · 0.90
test_reader_list · Function · 0.90

Calls 1

TextFileReader · Class · 0.85

Tested by 3

test_read_data_list · Function · 0.72
test_reader_list · Function · 0.72