Function from_dummies

pandas/core/reshape/encoding.py:372–589

Create a categorical ``DataFrame`` from a ``DataFrame`` of dummy variables. Inverts the operation performed by :func:`~pandas.get_dummies`.

(
    data: DataFrame,
    sep: None | str = None,
    default_category: None | Hashable | dict[str, Hashable] = None,
)
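
As a minimal sketch of how the three parameters in the signature above interact, the following decodes a small dummy-coded frame. The column names and the default categories "d" and "e" are made up for illustration; it assumes a pandas version that ships `from_dummies` (1.5 or later).

```python
import pandas as pd

# Hypothetical dummy-coded frame: two variables ("col1", "col2"),
# with "_" separating each prefix from its category name.
dummies = pd.DataFrame(
    {
        "col1_a": [1, 0, 0],
        "col1_b": [0, 1, 0],
        "col2_a": [0, 1, 0],
        "col2_b": [1, 0, 0],
        "col2_c": [0, 0, 0],
    }
)

# The last row is all zeros for both variables, so the per-prefix
# default_category dict supplies the implied category there.
decoded = pd.from_dummies(
    dummies,
    sep="_",
    default_category={"col1": "d", "col2": "e"},
)

print(decoded)
# col1 decodes to a, b, d and col2 decodes to b, a, e
```

Passing a single hashable instead of a dict would apply the same default category to every prefix.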

Source from the content-addressed store, hash-verified

@set_module("pandas")
def from_dummies(
    data: DataFrame,
    sep: None | str = None,
    default_category: None | Hashable | dict[str, Hashable] = None,
) -> DataFrame:
    """
    Create a categorical ``DataFrame`` from a ``DataFrame`` of dummy variables.

    Inverts the operation performed by :func:`~pandas.get_dummies`.

    Parameters
    ----------
    data : DataFrame
        Data which contains dummy-coded variables in form of integer columns of
        1's and 0's.
    sep : str, default None
        Separator used in the column names of the dummy categories they are
        character indicating the separation of the categorical names from the prefixes.
        For example, if your column names are 'prefix_A' and 'prefix_B',
        you can strip the underscore by specifying sep='_'.
    default_category : None, Hashable or dict of Hashables, default None
        The default category is the implied category when a value has none of the
        listed categories specified with a one, i.e. if all dummies in a row are
        zero. Can be a single value for all variables or a dict directly mapping
        the default categories to a prefix of a variable. The default category
        will be coerced to the dtype of ``data.columns`` if such coercion is
        lossless, and will raise otherwise.

    Returns
    -------
    DataFrame
        Categorical data decoded from the dummy input-data.

    Raises
    ------
    ValueError
        * When the input ``DataFrame`` ``data`` contains NA values.
        * When the input ``DataFrame`` ``data`` contains column names with separators
          that do not match the separator specified with ``sep``.
        * When a ``dict`` passed to ``default_category`` does not include an implied
          category for each prefix.
        * When a value in ``data`` has more than one category assigned to it.
        * When ``default_category=None`` and a value in ``data`` has no category
          assigned to it.
    TypeError
        * When the input ``data`` is not of type ``DataFrame``.
        * When the input ``DataFrame`` ``data`` contains non-dummy data.
        * When the passed ``sep`` is of a wrong data type.
        * When the passed ``default_category`` is of a wrong data type.

    See Also
    --------
    :func:`~pandas.get_dummies` : Convert ``Series`` or ``DataFrame`` to dummy codes.
    :class:`~pandas.Categorical` : Represent a categorical variable in classic.

    Notes
    -----
    The columns of the passed dummy data should only include 1's and 0's,
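
The Raises section of the docstring above can be probed with a short sketch. The helper `try_decode` and the sample frames are hypothetical and exist only for this illustration; the exception types it observes are the ones the docstring lists, assuming a pandas version that ships `from_dummies` (1.5 or later).

```python
import pandas as pd


def try_decode(data, **kwargs):
    """Return the name of the exception from_dummies raises, or None on success."""
    try:
        pd.from_dummies(data, **kwargs)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return None


# Row 0 has two categories set at once.
multi = pd.DataFrame({"a": [1, 1], "b": [1, 0]})
# Row 1 has no category set.
unassigned = pd.DataFrame({"a": [1, 0], "b": [0, 0]})
# Column "a" contains a missing value.
with_na = pd.DataFrame({"a": [1, None], "b": [0, 1]})

results = {
    "multi_assignment": try_decode(multi),
    "unassigned_no_default": try_decode(unassigned),
    "unassigned_with_default": try_decode(unassigned, default_category="c"),
    "contains_na": try_decode(with_na),
    "not_a_dataframe": try_decode([1, 0]),
}

for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

Only the all-zero row with an explicit `default_category` decodes successfully; the other cases fail with the error types documented in the docstring.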

Callers 15

test_error_wrong_data_type · Function · 0.90
test_error_no_prefix_contains_unassigned · Function · 0.90
test_error_no_prefix_wrong_default_category_type · Function · 0.90
test_error_no_prefix_multi_assignment · Function · 0.90
test_error_no_prefix_contains_nan · Function · 0.90
test_error_contains_non_dummies · Function · 0.90
test_error_with_prefix_multiple_separators · Function · 0.90
test_error_with_prefix_sep_wrong_type · Function · 0.90
test_error_with_prefix_contains_unassigned · Function · 0.90
test_error_with_prefix_default_category_wrong_type · Function · 0.90
test_error_with_prefix_default_category_dict_not_complete · Function · 0.90
test_error_with_prefix_contains_nan · Function · 0.90

Calls 15

concat · Function · 0.90
DataFrame · Class · 0.90
split · Method · 0.80
get_indexer_for · Method · 0.80
any · Method · 0.45
isna · Method · 0.45
idxmax · Method · 0.45
astype · Method · 0.45
append · Method · 0.45
items · Method · 0.45
copy · Method · 0.45
sum · Method · 0.45

Tested by 15

test_error_wrong_data_type · Function · 0.72
test_error_no_prefix_contains_unassigned · Function · 0.72
test_error_no_prefix_wrong_default_category_type · Function · 0.72
test_error_no_prefix_multi_assignment · Function · 0.72
test_error_no_prefix_contains_nan · Function · 0.72
test_error_contains_non_dummies · Function · 0.72
test_error_with_prefix_multiple_separators · Function · 0.72
test_error_with_prefix_sep_wrong_type · Function · 0.72
test_error_with_prefix_contains_unassigned · Function · 0.72
test_error_with_prefix_default_category_wrong_type · Function · 0.72
test_error_with_prefix_default_category_dict_not_complete · Function · 0.72
test_error_with_prefix_contains_nan · Function · 0.72