hub / github.com/pandas-dev/pandas / arrow_table_to_pandas

Function arrow_table_to_pandas

pandas/io/_util.py:78–132

(
    table: pyarrow.Table,
    dtype_backend: DtypeBackend | Literal["numpy"] | lib.NoDefault = lib.no_default,
    null_to_int64: bool = False,
    to_pandas_kwargs: dict | None = None,
    dtype: DtypeArg | None = None,
    names: Sequence[Hashable] | None = None,
)

Source

def arrow_table_to_pandas(
    table: pyarrow.Table,
    dtype_backend: DtypeBackend | Literal["numpy"] | lib.NoDefault = lib.no_default,
    null_to_int64: bool = False,
    to_pandas_kwargs: dict | None = None,
    dtype: DtypeArg | None = None,
    names: Sequence[Hashable] | None = None,
) -> pd.DataFrame:
    pa = import_optional_dependency("pyarrow")

    to_pandas_kwargs = {} if to_pandas_kwargs is None else to_pandas_kwargs

    types_mapper: type[pd.ArrowDtype] | None | Callable
    if dtype_backend == "numpy_nullable":
        mapping = _arrow_dtype_mapping()
        if null_to_int64:
            # Modify the default mapping to also map null to Int64
            # (to match other engines - only for CSV parser)
            mapping[pa.null()] = pd.Int64Dtype()
        types_mapper = mapping.get
    elif dtype_backend == "pyarrow":
        types_mapper = pd.ArrowDtype
    elif using_string_dtype():
        if pa_version_under19p0:
            types_mapper = _arrow_string_types_mapper()
        elif dtype is not None:
            # GH#56136 Avoid lossy conversion to float64
            # We'll convert to numpy below if
            types_mapper = {
                pa.int8(): pd.Int8Dtype(),
                pa.int16(): pd.Int16Dtype(),
                pa.int32(): pd.Int32Dtype(),
                pa.int64(): pd.Int64Dtype(),
            }.get
        else:
            types_mapper = None
    elif dtype_backend is lib.no_default or dtype_backend == "numpy":
        if dtype is not None:
            # GH#56136 Avoid lossy conversion to float64
            # We'll convert to numpy below if
            types_mapper = {
                pa.int8(): pd.Int8Dtype(),
                pa.int16(): pd.Int16Dtype(),
                pa.int32(): pd.Int32Dtype(),
                pa.int64(): pd.Int64Dtype(),
            }.get
        else:
            types_mapper = None
    else:
        raise NotImplementedError

    df = table.to_pandas(types_mapper=types_mapper, **to_pandas_kwargs)
    df = _post_convert_dtypes(df, dtype_backend, dtype, names)
    df = _normalize_timezone_dtypes(df)
    return df
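
The `GH#56136` comments in the listing refer to a lossy conversion to float64. As a rough illustration of the underlying problem (not the code path pandas itself takes), an integer column that contains a null cannot stay in a plain NumPy integer dtype, and float64 cannot represent every int64 value exactly, whereas the nullable `Int64` dtype can. The value `2**53 + 1` and the variable names are chosen here purely for illustration.

```python
import pandas as pd

# 2**53 + 1 is the smallest positive integer that float64 cannot store exactly.
big = 2**53 + 1

# Storing the value next to a missing entry as float64 rounds it.
as_float = pd.Series([big, None], dtype="float64")
float_round_trip = int(as_float.iloc[0])

# The nullable Int64 dtype keeps the exact integer and tracks the
# missing entry in a separate mask.
as_nullable = pd.Series([big, None], dtype="Int64")
nullable_value = int(as_nullable.iloc[0])

print(big, float_round_trip, nullable_value)
```

This is why the integer-only mapper in the listing is applied when a `dtype` is requested: the integers survive the Arrow-to-pandas step intact, and any later cast to the requested dtype starts from exact values.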

Callers 10

read_feather · Function · 0.90

read_orc · Function · 0.90

read · Method · 0.90

read_table · Method · 0.90

read_query · Method · 0.90

read · Method · 0.90

_read_pyarrow · Method · 0.90

test_arrow_table_to_pandas_normalize_timezones · Function · 0.90

test_arrow_table_to_pandas_normalize_timezones_columns · Function · 0.90

test_arrow_table_to_pandas_normalize_timezones_multiindex · Function · 0.90

Calls 6

import_optional_dependency · Function · 0.90

using_string_dtype · Function · 0.90

_arrow_dtype_mapping · Function · 0.85

_arrow_string_types_mapper · Function · 0.85

_post_convert_dtypes · Function · 0.85

_normalize_timezone_dtypes · Function · 0.85

Tested by 3

test_arrow_table_to_pandas_normalize_timezones · Function · 0.72

test_arrow_table_to_pandas_normalize_timezones_columns · Function · 0.72

test_arrow_table_to_pandas_normalize_timezones_multiindex · Function · 0.72