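As a rough illustration of how `stubnames`, `sep`, and `suffix` combine to pick out wide-format columns, the sketch below matches column names against a pattern built from the three pieces. It uses only the standard library and is not the pandas implementation: the helper name `matching_columns` is hypothetical, and anchoring the pattern at both ends of the column name is an assumption made for the example.

```python
import re


def matching_columns(columns, stub, sep="", suffix=r"\d+"):
    """Return the columns that look like ``stub + sep + suffix``.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Escape stub and sep so they match literally; suffix stays a regex.
    # Anchoring with ^ and $ is an assumption made for this sketch.
    pattern = re.compile(rf"^{re.escape(stub)}{re.escape(sep)}{suffix}$")
    return [col for col in columns if pattern.match(col)]


columns = ["id", "A-one", "A-two", "A-rating", "B-one", "B-two"]

# A broad non-digit suffix also picks up the unrelated "A-rating" column.
broad = matching_columns(columns, "A", sep="-", suffix=r"\D+")

# Listing the wanted suffixes explicitly leaves "A-rating" out.
narrow = matching_columns(columns, "A", sep="-", suffix=r"(?:one|two)")

print(broad)   # ['A-one', 'A-two', 'A-rating']
print(narrow)  # ['A-one', 'A-two']
```

Wrapping the alternatives in a group keeps the alternation from splitting the whole pattern, which matters whenever a suffix regex lists several literal choices.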
@set_module("pandas")
def wide_to_long(
    df: DataFrame, stubnames, i, j, sep: str = "", suffix: str = r"\d+"
) -> DataFrame:
    r"""
    Unpivot a DataFrame from wide to long format.

    Less flexible but more user-friendly than melt.

    With stubnames ['A', 'B'], this function expects to find one or more
    groups of columns with format
    A-suffix1, A-suffix2,..., B-suffix1, B-suffix2,...
    You specify what you want to call this suffix in the resulting long format
    with `j` (for example `j='year'`).

    Each row of these wide variables is assumed to be uniquely identified by
    `i` (can be a single column name or a list of column names).

    All remaining variables in the data frame are left intact.

    Parameters
    ----------
    df : DataFrame
        The wide-format DataFrame.
    stubnames : str or list-like
        The stub name(s). The wide format variables are assumed to
        start with the stub names.
    i : str or list-like
        Column(s) to use as id variable(s).
    j : str
        The name of the sub-observation variable. What you wish to name your
        suffix in the long format.
    sep : str, default ""
        A character indicating the separation of the variable names
        in the wide format, to be stripped from the names in the long format.
        For example, if your column names are A-suffix1, A-suffix2, you
        can strip the hyphen by specifying `sep='-'`.
    suffix : str, default '\\d+'
        A regular expression capturing the wanted suffixes. '\\d+' captures
        numeric suffixes. Suffixes with no numbers can be specified with the
        negated character class '\\D+'. You can also further disambiguate
        suffixes. For example, if your wide variables are of the form A-one,
        B-two,.., and you have an unrelated column A-rating, you can ignore the
        last one by specifying `suffix='(?:one|two)'`. When all suffixes are
        numeric, they are cast to int64/float64.

    Returns
    -------
    DataFrame
        A DataFrame that contains each stub name as a variable, with new index
        (i, j).

    See Also
    --------
    melt : Unpivot a DataFrame from wide to long format, optionally leaving
        identifiers set.
    pivot : Return reshaped DataFrame organized by given index / column
        values.
    DataFrame.pivot : Pivot without aggregation that can handle
        non-numeric data.