Create new MultiIndex from current that removes unused levels. Unused level(s) means levels that are not expressed in the labels. The resulting MultiIndex will have the same outward appearance, meaning the same .values and ordering. It will also be .equals() to the original.
(self)
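To make "unused levels" concrete, here is a minimal plain-Python sketch of the idea for a single level. It is not the pandas implementation: `drop_unused` is a hypothetical helper, lists stand in for the level and its integer codes, and the sketch keeps surviving values in their original level order as a simplifying assumption (it makes no claim about how pandas orders the new levels). The counting step mirrors the shift-by-one trick noted in the source comments, where the missing-value code -1 is moved to slot 0.

```python
# Illustrative sketch only: plain-Python stand-in for one MultiIndex level.
# `drop_unused` is a hypothetical helper, not part of pandas.


def drop_unused(level, codes):
    """Return (new_level, new_codes) keeping only level values that codes use.

    A code of -1 marks a missing value and is passed through unchanged.
    """
    # Count how often each code occurs; shift by one so that -1 lands in slot 0.
    counts = [0] * (len(level) + 1)
    for code in codes:
        counts[code + 1] += 1

    # Old positions that are actually used, in their original level order.
    used = [pos for pos in range(len(level)) if counts[pos + 1] > 0]

    # Map each old position to its position in the compacted level.
    remap = {old: new for new, old in enumerate(used)}
    new_level = [level[pos] for pos in used]
    new_codes = [remap[code] if code != -1 else -1 for code in codes]
    return new_level, new_codes


# First level of mi[2:] from the docstring example: values [0, 1], codes [1, 1].
level, codes = [0, 1], [1, 1]
new_level, new_codes = drop_unused(level, codes)
print(new_level, new_codes)  # [1] [0, 0]
```

The values the codes point at are unchanged (`1` and `1`), which is why the compacted index still compares equal to the original; only the stored level and the integer codes shrink.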
        )

    def remove_unused_levels(self) -> MultiIndex:
        """
        Create new MultiIndex from current that removes unused levels.

        Unused level(s) means levels that are not expressed in the
        labels. The resulting MultiIndex will have the same outward
        appearance, meaning the same .values and ordering. It will
        also be .equals() to the original.

        The `remove_unused_levels` method is useful in cases where you have a
        MultiIndex with hierarchical levels, but some of these levels are no
        longer needed due to filtering or subsetting operations. By removing
        the unused levels, the resulting MultiIndex becomes more compact and
        efficient, which can improve performance in subsequent operations.

        Returns
        -------
        MultiIndex
            A new MultiIndex with unused levels removed.

        See Also
        --------
        MultiIndex.droplevel : Remove specified levels from a MultiIndex.
        MultiIndex.reorder_levels : Rearrange levels of a MultiIndex.
        MultiIndex.set_levels : Set new levels on a MultiIndex.

        Examples
        --------
        >>> mi = pd.MultiIndex.from_product([range(2), list("ab")])
        >>> mi
        MultiIndex([(0, 'a'),
                    (0, 'b'),
                    (1, 'a'),
                    (1, 'b')],
                   )

        >>> mi[2:]
        MultiIndex([(1, 'a'),
                    (1, 'b')],
                   )

        The 0 from the first level is not represented
        and can be removed

        >>> mi2 = mi[2:].remove_unused_levels()
        >>> mi2.levels
        FrozenList([[1], ['a', 'b']])
        """
        new_levels = []
        new_codes = []

        changed = False
        for lev, level_codes in zip(self.levels, self.codes, strict=True):
            # Since few levels are typically unused, bincount() is more
            # efficient than unique() - however it only accepts non-negative
            # values (and drops order), so the codes are shifted by 1 to map
            # the NA code -1 to 0, and the result is shifted back:
            uniques = np.where(np.bincount(level_codes + 1) > 0)[0] - 1
            has_na = int(len(uniques) and (uniques[0] == -1))