Create a categorical ``DataFrame`` from a ``DataFrame`` of dummy variables. Inverts the operation performed by :func:`~pandas.get_dummies`.
(
data: DataFrame,
sep: None | str = None,
default_category: None | Hashable | dict[str, Hashable] = None,
)
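A minimal sketch of how this signature can be exercised, assuming pandas 1.5 or later (where `from_dummies` is available). The frame names and the `animal` column are illustrative and do not come from the pandas source. It shows a round trip through `get_dummies`, the use of `default_category` for all-zero rows, and the `ValueError` raised for a row with more than one category.

```python
import pandas as pd

# Encode a single categorical column, then decode it again.
original = pd.DataFrame({"animal": ["cat", "dog", "cat", "bird"]})
dummies = pd.get_dummies(original, prefix_sep="_", dtype=int)
decoded = pd.from_dummies(dummies, sep="_")

# A row of all zeros has no explicit category; default_category supplies one.
partial = pd.DataFrame({"animal_cat": [1, 0, 0], "animal_dog": [0, 1, 0]})
with_default = pd.from_dummies(partial, sep="_", default_category="other")

# A row with more than one 1 under the same prefix is rejected.
ambiguous = pd.DataFrame({"animal_cat": [1], "animal_dog": [1]})
try:
    pd.from_dummies(ambiguous, sep="_")
    multi_assignment_rejected = False
except ValueError:
    multi_assignment_rejected = True
```

Passing `sep="_"` is what lets the column names be split into the prefix `animal` and the category labels; a dict such as `{"animal": "other"}` could be passed to `default_category` instead of a single value when different prefixes need different defaults.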
| 370 | |
| 371 | @set_module("pandas") |
| 372 | def from_dummies( |
| 373 | data: DataFrame, |
| 374 | sep: None | str = None, |
| 375 | default_category: None | Hashable | dict[str, Hashable] = None, |
| 376 | ) -> DataFrame: |
| 377 | """ |
| 378 | Create a categorical ``DataFrame`` from a ``DataFrame`` of dummy variables. |
| 379 | |
| 380 | Inverts the operation performed by :func:`~pandas.get_dummies`. |
| 381 | |
| 382 | Parameters |
| 383 | ---------- |
| 384 | data : DataFrame |
| 385 | Data which contains dummy-coded variables in the form of integer columns of |
| 386 | 1's and 0's. |
| 387 | sep : str, default None |
| 388 | Separator used in the column names of the dummy categories, i.e. the |
| 389 | character indicating the separation of the categorical names from the prefixes. |
| 390 | For example, if your column names are 'prefix_A' and 'prefix_B', |
| 391 | you can strip the underscore by specifying sep='_'. |
| 392 | default_category : None, Hashable or dict of Hashables, default None |
| 393 | The default category is the implied category when a value has none of the |
| 394 | listed categories specified with a one, i.e. if all dummies in a row are |
| 395 | zero. Can be a single value for all variables or a dict mapping the prefix |
| 396 | of each variable to its default category. The default category |
| 397 | will be coerced to the dtype of ``data.columns`` if such coercion is |
| 398 | lossless, and will raise otherwise. |
| 399 | |
| 400 | Returns |
| 401 | ------- |
| 402 | DataFrame |
| 403 | Categorical data decoded from the dummy input data. |
| 404 | |
| 405 | Raises |
| 406 | ------ |
| 407 | ValueError |
| 408 | * When the input ``DataFrame`` ``data`` contains NA values. |
| 409 | * When the input ``DataFrame`` ``data`` contains column names with separators |
| 410 | that do not match the separator specified with ``sep``. |
| 411 | * When a ``dict`` passed to ``default_category`` does not include an implied |
| 412 | category for each prefix. |
| 413 | * When a value in ``data`` has more than one category assigned to it. |
| 414 | * When ``default_category=None`` and a value in ``data`` has no category |
| 415 | assigned to it. |
| 416 | TypeError |
| 417 | * When the input ``data`` is not of type ``DataFrame``. |
| 418 | * When the input ``DataFrame`` ``data`` contains non-dummy data. |
| 419 | * When the passed ``sep`` is of a wrong data type. |
| 420 | * When the passed ``default_category`` is of a wrong data type. |
| 421 | |
| 422 | See Also |
| 423 | -------- |
| 424 | :func:`~pandas.get_dummies` : Convert ``Series`` or ``DataFrame`` to dummy codes. |
| 425 | :class:`~pandas.Categorical` : Represent a categorical variable in classic R / S-plus fashion. |
| 426 | |
| 427 | Notes |
| 428 | ----- |
| 429 | The columns of the passed dummy data should only include 1's and 0's, |