
Class SparseArray

pandas/core/arrays/sparse/array.py:296–1887

An ExtensionArray for storing sparse data. SparseArray efficiently stores data with a high frequency of a specific fill value (e.g., zeros), saving memory by only retaining non-fill elements and their indices. This class is particularly useful for large datasets where most values are redundant.
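
As a rough illustration of "only retaining non-fill elements and their indices", the sketch below builds a small array and inspects what is stored. The sample data is made up for the example, and `sp_index.to_int_index()` belongs to pandas' internal sparse index objects, so treat that call as an assumption about the installed version and not a stable public contract.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: mostly zeros, three non-zero entries.
dense = np.array([0, 0, 1, 0, 0, 2, 0, 0, 0, 3])

# Tell SparseArray that 0 is the value to leave out of storage.
arr = pd.arrays.SparseArray(dense, fill_value=0)

# Only the non-fill values are kept...
stored_values = arr.sp_values
# ...together with the positions they occupy in the dense layout.
stored_positions = arr.sp_index.to_int_index().indices

print(stored_values)     # the retained (non-fill) values
print(stored_positions)  # their positions in the original array
print(arr.npoints)       # number of stored points
print(arr.density)       # fraction of entries that are stored

# Converting back to a dense NumPy array restores the fill values.
round_trip = np.asarray(arr)
```

The memory benefit grows with the share of entries equal to `fill_value`; with a dense-looking input (few repeats of the fill value), a SparseArray can cost more than the plain array because positions are stored alongside values.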

Source from the content-addressed store, hash-verified

@set_module("pandas.arrays")
class SparseArray(OpsMixin, PandasObject, ExtensionArray):
    """
    An ExtensionArray for storing sparse data.

    SparseArray efficiently stores data with a high frequency of a
    specific fill value (e.g., zeros), saving memory by only retaining
    non-fill elements and their indices. This class is particularly
    useful for large datasets where most values are redundant.

    Parameters
    ----------
    data : array-like or scalar
        A dense array of values to store in the SparseArray. This may contain
        `fill_value`.
    sparse_index : SparseIndex, optional
        Index indicating the locations of sparse elements.
    fill_value : scalar, optional
        Elements in data that are ``fill_value`` are not stored in the
        SparseArray. For memory savings, this should be the most common value
        in `data`. By default, `fill_value` depends on the dtype of `data`:

        =========== ==========
        data.dtype  na_value
        =========== ==========
        float       ``np.nan``
        int         ``0``
        bool        False
        datetime64  ``pd.NaT``
        timedelta64 ``pd.NaT``
        =========== ==========

        The fill value is potentially specified in three ways. In order of
        precedence, these are

        1. The `fill_value` argument
        2. ``dtype.fill_value`` if `fill_value` is None and `dtype` is
           a ``SparseDtype``
        3. ``data.dtype.fill_value`` if `fill_value` is None and `dtype`
           is not a ``SparseDtype`` and `data` is a ``SparseArray``.

    kind : str
        Can be 'integer' or 'block', default is 'integer'.
        The type of storage for sparse locations.

        * 'block': Stores a `block` and `block_length` for each
          contiguous *span* of sparse values. This is best when
          sparse data tends to be clumped together, with large
          regions of ``fill-value`` values between sparse values.
        * 'integer': uses an integer to store the location of
          each sparse value.

    dtype : np.dtype or SparseDtype, optional
        The dtype to use for the SparseArray. For numpy dtypes, this
        determines the dtype of ``self.sp_values``. For SparseDtype,
        this determines ``self.sp_values`` and ``self.fill_value``.
    copy : bool, default False
        Whether to explicitly copy the incoming `data` array.

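
The default fill values and the three-step precedence described in the docstring can be checked with a short sketch. The input lists and variable names are invented for illustration; the behaviour shown is what the docstring describes, and it may differ in detail across pandas versions.

```python
import numpy as np
import pandas as pd

# Defaults: fill_value is inferred from the dtype of the data.
float_arr = pd.arrays.SparseArray([1.0, np.nan, np.nan])
int_arr = pd.arrays.SparseArray([0, 0, 1])
bool_arr = pd.arrays.SparseArray([False, True, False])

# 1. An explicit fill_value argument takes precedence over the
#    fill_value carried by a SparseDtype.
explicit = pd.arrays.SparseArray(
    [0, 0, 1], fill_value=1, dtype=pd.SparseDtype("int64", 0)
)

# 2. With fill_value left as None, the SparseDtype's fill_value is used.
from_dtype = pd.arrays.SparseArray([0, 0, 1, 2], dtype=pd.SparseDtype("int64", 2))

# 3. With neither given, an existing SparseArray passes on its own
#    fill_value when used as the input data.
base = pd.arrays.SparseArray([5, 5, 1], fill_value=5)
from_data = pd.arrays.SparseArray(base)

print(float_arr.fill_value, int_arr.fill_value, bool_arr.fill_value)
print(explicit.fill_value, from_dtype.fill_value, from_data.fill_value)
```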

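To make the difference between the two `kind` options concrete, the sketch below stores the same clumped data both ways and looks at the resulting index. The data is hypothetical, and the attributes `blocs`, `blengths`, and `indices` live on pandas' internal index objects (`BlockIndex` and `IntIndex`), so they are an assumption about internals and not a documented public API.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two contiguous runs of non-zero values.
data = np.array([0, 0, 7, 7, 7, 0, 0, 0, 9, 9])

block_arr = pd.arrays.SparseArray(data, fill_value=0, kind="block")
int_arr = pd.arrays.SparseArray(data, fill_value=0, kind="integer")

# 'block' records one (start, length) pair per contiguous span.
block_starts = block_arr.sp_index.blocs
block_lengths = block_arr.sp_index.blengths

# 'integer' records one position per stored value.
positions = int_arr.sp_index.indices

print(block_arr.kind, block_starts, block_lengths)
print(int_arr.kind, positions)
```

With clumped data like this, the block index needs two pairs where the integer index needs five positions; for scattered single values the integer layout is the simpler fit.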
Callers 15

setup · Method · 0.90
time_sparse_array · Method · 0.90
setup · Method · 0.90
make_block_array · Method · 0.90
setup · Method · 0.90
setup · Method · 0.90
setup · Method · 0.90
setup · Method · 0.90
_get_dummies_1d · Function · 0.90
create_block · Function · 0.90
test_astype · Method · 0.90
test_astype_bool · Method · 0.90

Calls

no outgoing calls

Tested by 15

create_block · Function · 0.72
test_astype · Method · 0.72
test_astype_bool · Method · 0.72
test_astype_all · Method · 0.72
test_get_attributes · Method · 0.72
test_to_coo · Method · 0.72
test_to_dense · Method · 0.72
test_density · Method · 0.72
test_series_from_coo · Method · 0.72