Quantile-based discretization function. Discretize variable into equal-sized buckets based on rank or based on sample quantiles. For example, 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
qcut(
    x,
    q,
    labels=None,
    retbins: bool = False,
    precision: int = 3,
    duplicates: str = "raise",
)
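As a rough illustration of how these parameters interact, the sketch below calls `pd.qcut` on made-up data. The sample values and the variable names (`values`, `skewed`, and so on) are assumptions chosen for the example, not part of the pandas source.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the integers 0..999, so bucket sizes are easy to check.
values = pd.Series(np.arange(1000))

# q given as an int: ten quantile-based buckets (deciles).
deciles = pd.qcut(values, 10)
bucket_sizes = deciles.value_counts()

# labels=False returns integer bucket indicators instead of intervals;
# retbins=True additionally returns the computed bin edges.
codes, edges = pd.qcut(values, 4, labels=False, retbins=True)

# Heavily repeated values can produce duplicate quantile edges.
# The default duplicates="raise" raises ValueError in that case;
# duplicates="drop" keeps only the unique edges, giving fewer buckets.
skewed = pd.Series([0, 0, 0, 0, 1, 2, 3, 4])
try:
    pd.qcut(skewed, 4)
    raised = False
except ValueError:
    raised = True
dropped = pd.qcut(skewed, 4, duplicates="drop")

print(bucket_sizes.tolist())
print(codes.iloc[0], codes.iloc[-1], len(edges))
print(raised, len(dropped.cat.categories))
```

With evenly spread data each decile holds the same number of points, which is the "equal-sized buckets" behaviour described above. The skewed series shows why `duplicates="drop"` can return fewer buckets than requested.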

@set_module("pandas")
def qcut(
    x,
    q,
    labels=None,
    retbins: bool = False,
    precision: int = 3,
    duplicates: str = "raise",
):
    """
    Quantile-based discretization function.

    Discretize variable into equal-sized buckets based on rank or based
    on sample quantiles. For example, 1000 values for 10 quantiles would
    produce a Categorical object indicating quantile membership for each data point.

    Parameters
    ----------
    x : 1d ndarray or Series
        Input NumPy array or pandas Series object to be discretized.
    q : int or list-like of float
        Number of quantiles. 10 for deciles, 4 for quartiles, etc. Alternatively,
        an array of quantiles, e.g. [0, .25, .5, .75, 1.] for quartiles.
    labels : array or False, default None
        Used as labels for the resulting bins. Must be of the same length as
        the resulting bins. If False, return only integer indicators of the
        bins. If True, raises an error.
    retbins : bool, optional
        Whether to return the bin edges along with the result. Can be useful
        if q is given as an integer.
    precision : int, optional
        The precision at which to store and display the bin labels.
    duplicates : {'raise', 'drop'}, default 'raise'
        If bin edges are not unique, raise ValueError or drop non-uniques.

    Returns
    -------
    out : Categorical or Series or array of integers if labels is False
        The return type (Categorical or Series) depends on the input: a Series
        of type category if input is a Series else Categorical. Bins are
        represented as categories when categorical data is returned.
    bins : ndarray of floats
        Returned only if `retbins` is True.

    See Also
    --------
    cut : Bin values into discrete intervals.
    Series.quantile : Return value at the given quantile.

    Notes
    -----
    Out of bounds values will be NA in the resulting Categorical object.

    Examples
    --------
    >>> pd.qcut(range(5), 4)
    ... # doctest: +ELLIPSIS
    [(-0.001, 1.0], (-0.001, 1.0], (1.0, 2.0], (2.0, 3.0], (3.0, 4.0]]
    Categories (4, interval[float64, right]): [(-0.001, 1.0] < (1.0, 2.0] ...