hub / github.com/pandas-dev/pandas / qcut

Function qcut

pandas/core/reshape/tile.py:296–393  ·  view source on GitHub ↗

Quantile-based discretization function. Discretize variable into equal-sized buckets based on rank or based on sample quantiles. For example, 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
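
The "1000 values for 10 quantiles" case from the summary can be sketched as follows. This is a minimal illustration, assuming 1000 distinct integers as input (the variable names are made up for the example); with ties in the data the buckets would not necessarily be equal in size.

```python
import numpy as np
import pandas as pd

# 1000 distinct values split into 10 quantile-based buckets (deciles).
values = np.arange(1000)
deciles = pd.qcut(values, 10)

# An ndarray input yields a Categorical; count the members of each bucket.
counts = pd.Series(deciles).value_counts()
print(type(deciles).__name__)
print(sorted(counts.tolist()))
```

Because the input has no repeated values, every decile ends up with the same number of points.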

(
    x,
    q,
    labels=None,
    retbins: bool = False,
    precision: int = 3,
    duplicates: str = "raise",
)
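
The signature above can be exercised with a short sketch like the one below. The data and variable names are illustrative assumptions, not taken from the pandas test suite; only the `qcut` parameters shown in the signature are used.

```python
import pandas as pd

scores = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])

# Custom labels: one label per resulting bin, so four labels for q=4.
quartiles = pd.qcut(scores, 4, labels=["q1", "q2", "q3", "q4"])

# labels=False returns integer bin indicators; retbins=True also returns the edges.
codes, edges = pd.qcut(scores, 4, labels=False, retbins=True)

# Heavily repeated values produce duplicate bin edges. The default
# duplicates="raise" raises ValueError; duplicates="drop" merges the repeats.
skewed = pd.Series([0, 0, 0, 0, 1, 2])
try:
    pd.qcut(skewed, 4)
    raised = False
except ValueError:
    raised = True
merged = pd.qcut(skewed, 4, duplicates="drop")

print(quartiles.tolist())
print(codes.tolist(), edges)
print(raised, len(merged.cat.categories))
```

Note that `duplicates="drop"` can leave fewer bins than the requested `q`, so any custom `labels` list would need to match the reduced bin count.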

Source from the content-addressed store, hash-verified

@set_module("pandas")
def qcut(
    x,
    q,
    labels=None,
    retbins: bool = False,
    precision: int = 3,
    duplicates: str = "raise",
):
    """
    Quantile-based discretization function.

    Discretize variable into equal-sized buckets based on rank or based
    on sample quantiles. For example 1000 values for 10 quantiles would
    produce a Categorical object indicating quantile membership for each data point.

    Parameters
    ----------
    x : 1d ndarray or Series
        Input Numpy array or pandas Series object to be discretized.
    q : int or list-like of float
        Number of quantiles. 10 for deciles, 4 for quartiles, etc. Alternately
        array of quantiles, e.g. [0, .25, .5, .75, 1.] for quartiles.
    labels : array or False, default None
        Used as labels for the resulting bins. Must be of the same length as
        the resulting bins. If False, return only integer indicators of the
        bins. If True, raises an error.
    retbins : bool, optional
        Whether to return the (bins, labels) or not. Can be useful if bins
        is given as a scalar.
    precision : int, optional
        The precision at which to store and display the bins labels.
    duplicates : {default 'raise', 'drop'}, optional
        If bin edges are not unique, raise ValueError or drop non-uniques.

    Returns
    -------
    out : Categorical or Series or array of integers if labels is False
        The return type (Categorical or Series) depends on the input: a Series
        of type category if input is a Series else Categorical. Bins are
        represented as categories when categorical data is returned.
    bins : ndarray of floats
        Returned only if `retbins` is True.

    See Also
    --------
    cut : Bin values into discrete intervals.
    Series.quantile : Return value at the given quantile.

    Notes
    -----
    Out of bounds values will be NA in the resulting Categorical object

    Examples
    --------
    >>> pd.qcut(range(5), 4)
    ... # doctest: +ELLIPSIS
    [(-0.001, 1.0], (-0.001, 1.0], (1.0, 2.0], (2.0, 3.0], (3.0, 4.0]]
    Categories (4, interval[float64, right]): [(-0.001, 1.0] < (1.0, 2.0] ...
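
The Notes entry in the docstring says out-of-bounds values become NA. One way to see this, sketched here under the assumption that `q` is given as a list of quantiles that does not span 0 to 1 (the input data is made up for illustration):

```python
import pandas as pd

data = pd.Series(range(5))

# Only the 25th and 75th percentiles are used as edges, so values below
# the first edge or above the last edge fall outside every bin.
middle = pd.qcut(data, [0.25, 0.75])

print(middle.isna().tolist())
```

Values between the two quantile edges land in the single remaining bin, while the smallest and largest values are marked as missing.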

Callers 15

test_qcut · Function · 0.90
test_qcut_bounds · Function · 0.90
test_qcut_all_bins_same · Function · 0.90
test_qcut_include_lowest · Function · 0.90
test_qcut_nas · Function · 0.90
test_qcut_index · Function · 0.90
test_qcut_binning_issues · Function · 0.90

Calls 10

Index · Class · 0.90
_preprocess_for_cut · Function · 0.85
_coerce_to_type · Function · 0.85
_bins_to_cuts · Function · 0.85
_postprocess_for_cut · Function · 0.85
_call_with_func · Method · 0.80
putmask · Method · 0.45
quantile · Method · 0.45
dropna · Method · 0.45
to_series · Method · 0.45

Tested by 15

test_qcut · Function · 0.72
test_qcut_bounds · Function · 0.72
test_qcut_all_bins_same · Function · 0.72
test_qcut_include_lowest · Function · 0.72
test_qcut_nas · Function · 0.72
test_qcut_index · Function · 0.72
test_qcut_binning_issues · Function · 0.72