def TextParser(*args, **kwds) -> TextFileReader:
    """
    Converts lists of lists/tuples into DataFrames with proper type inference
    and optional (e.g. string to datetime) conversion. Also enables iterating
    lazily over chunks of large files.

    Parameters
    ----------
    data : file-like object or list
    delimiter : str
        Separator character to use.
    dialect : str or csv.Dialect instance, optional
        Ignored if delimiter is longer than 1 character.
    names : sequence, default
    header : int, default 0
        Row to use to parse column labels. Defaults to the first row. Prior
        rows will be discarded.
    index_col : int or list, optional
        Column or columns to use as the (possibly hierarchical) index.
    has_index_names : bool, default False
        True if the columns defined in index_col have an index name and are
        not in the header.
    na_values : scalar, str, list-like, or dict, optional
        Additional strings to recognize as NA/NaN.
    keep_default_na : bool, default True
    thousands : str, optional
        Thousands separator.
    comment : str, optional
        Comment out remainder of line.
    parse_dates : bool, default False
    date_format : str or dict of column -> format, default ``None``

        .. versionadded:: 2.0.0
    skiprows : list of integers
        Row numbers to skip.
    skipfooter : int
        Number of lines at the bottom of the file to skip.
    converters : dict, optional
        Dict of functions for converting values in certain columns. Keys can
        either be integers or column labels, values are functions that take one
        input argument, the cell (not column) content, and return the
        transformed content.
    encoding : str, optional
        Encoding to use for UTF when reading/writing (e.g. 'utf-8').
    float_precision : str, optional
        Specifies which converter the C engine should use for floating-point
        values. The options are ``None`` or ``'high'`` for the ordinary
        converter, ``'legacy'`` for the original lower precision pandas
        converter, and ``'round_trip'`` for the round-trip converter.
    """
    # TextParser always parses with the Python engine, overriding any
    # ``engine`` value passed in by the caller.
    kwds["engine"] = "python"
    return TextFileReader(*args, **kwds)


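The docstring above describes two behaviours: type inference on lists of rows, and lazy iteration in chunks. A minimal usage sketch of both follows. It assumes `TextParser` is importable from `pandas.io.parsers`. The sample rows and the variable names `rows`, `df`, and `chunks` are hypothetical and used only for illustration.

```python
from pandas.io.parsers import TextParser

# Hypothetical sample data: the first row is the header, and every cell is
# a string, as it would be after csv.reader or an HTML/Excel scrape.
rows = [
    ["name", "count", "price"],
    ["a", "1", "1.5"],
    ["b", "2", "2.5"],
    ["c", "3", "3.5"],
]

# Read everything at once; numeric-looking strings are inferred as numbers.
with TextParser(rows, header=0) as parser:
    df = parser.read()

# Iterate lazily instead: each item is a DataFrame of at most `chunksize` rows.
with TextParser(rows, header=0, chunksize=2) as reader:
    chunks = list(reader)

print(df.dtypes)
print([len(chunk) for chunk in chunks])
```

Using the reader as a context manager closes any underlying handles once parsing is done. For an in-memory list this makes little difference, but it keeps the pattern the same for file-like inputs.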
def _clean_na_values(na_values, keep_default_na: bool = True, floatify: bool = True):