def factorize_from_iterable(values) -> tuple[np.ndarray, Index]:
    """
    Factorize an input `values` into `categories` and `codes`. Preserves
    categorical dtype in `categories`.

    Parameters
    ----------
    values : list-like

    Returns
    -------
    codes : ndarray
    categories : Index
        If `values` has a categorical dtype, then `categories` is
        a CategoricalIndex keeping the categories and order of `values`.
    """
    from pandas import CategoricalIndex

    if not is_list_like(values):
        raise TypeError("Input must be list-like")

    categories: Index

    vdtype = getattr(values, "dtype", None)
    if isinstance(vdtype, CategoricalDtype):
        values = extract_array(values)
        # The Categorical we want to build has the same categories
        # as values, but its codes are by definition [0, ..., n_categories - 1]
        cat_codes = np.arange(len(values.categories), dtype=values.codes.dtype)
        cat = Categorical.from_codes(cat_codes, dtype=values.dtype, validate=False)

        categories = CategoricalIndex(cat)
        codes = values.codes
    else:
        # The value of ordered is irrelevant since we don't use cat as such,
        # but only the resulting categories, the order of which is independent
        # from ordered. Set ordered to False as default. See GH #15457
        cat = Categorical(values, ordered=False)
        categories = cat.categories
        codes = cat.codes
    return codes, categories


def factorize_from_iterables(iterables) -> tuple[list[np.ndarray], list[Index]]:
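The two branches of `factorize_from_iterable` can be illustrated with public pandas calls only. The following is a minimal sketch, not the library's own code path: the sample data and every variable name are made up for illustration, and the `validate=False` argument used in the source is left out here on the assumption that it only skips a check on the codes.

```python
import numpy as np
import pandas as pd

# Guard: scalars and plain strings are not list-like, so they would be rejected.
string_is_list_like = pd.api.types.is_list_like("abc")    # False
list_is_list_like = pd.api.types.is_list_like(["a", "b"])  # True

# Non-categorical branch: build an unordered Categorical and read off its parts.
plain = ["b", "a", "b", "c"]
cat = pd.Categorical(plain, ordered=False)
plain_codes = cat.codes            # position of each value within the categories
plain_categories = cat.categories  # plain Index; categories are inferred sorted

# Categorical branch: the codes come straight from the input, and the
# categories are rebuilt as a CategoricalIndex that keeps the input dtype.
values = pd.Categorical(["b", "a", "b"], categories=["c", "b", "a"], ordered=True)
cat_codes = np.arange(len(values.categories), dtype=values.codes.dtype)
rebuilt = pd.Categorical.from_codes(cat_codes, dtype=values.dtype)
categorical_categories = pd.CategoricalIndex(rebuilt)
categorical_codes = values.codes

print(plain_codes.tolist(), list(plain_categories))              # [1, 0, 1, 2] ['a', 'b', 'c']
print(categorical_codes.tolist(), list(categorical_categories))  # [1, 2, 1] ['c', 'b', 'a']
```

In the categorical case the category order `["c", "b", "a"]` and the `ordered=True` flag both survive, which is the "preserves categorical dtype" behaviour the docstring describes. In the plain case the categories come back as an ordinary `Index`.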