Modeling Quantiles
One standard Data Mining setting is defined by a set of n observations on a variable of interest Y and a set of p explanatory variables, or features, x = (x1,...,xp), with the objective of finding a ‘dependence’ of Y on x. Such dependencies can either be of direct interest in themselves or be used in the future to predict Y given an observed x. This typically leads to a model for a conditional central tendency of Y|x, usually the mean E(Y|x). For example, under appropriate model assumptions, Data Mining based on a least squares loss function (as in linear least squares or most regression tree approaches) is a maximum likelihood approach to estimating the conditional mean. This chapter considers situations in which the value of interest is not the conditional mean of a continuous variable, but rather a different property of the conditional distribution P(Y|x), in particular a specific quantile of this distribution. Consider for instance the 0.9 quantile of P(Y|x), which is the function c(x) such that P(Y
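The contrast between modeling the mean and modeling a quantile can be illustrated with a small sketch. It is not taken from the chapter. It assumes synthetic, covariate-free data, so c(x) reduces to a single constant c, and it uses the quantile ("pinball") loss, a standard choice that this excerpt does not itself name. The function `pinball_loss` and all variable names are hypothetical. Under those assumptions, the constant that minimizes squared loss should sit near the sample mean, while the constant that minimizes the pinball loss at level 0.9 should sit near the empirical 0.9 quantile, with roughly 90% of observations at or below it.

```python
import numpy as np


def pinball_loss(y, c, tau):
    """Average quantile (pinball) loss of a constant prediction c at level tau."""
    diff = y - c
    # Under-predictions (diff > 0) cost tau per unit,
    # over-predictions (diff < 0) cost (1 - tau) per unit.
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Synthetic, right-skewed data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=10_000)

# Evaluate both losses on a grid of candidate constants c.
candidates = np.linspace(0.0, 6.0, 601)
sq_losses = [float(np.mean((y - c) ** 2)) for c in candidates]
pb_losses = [pinball_loss(y, c, 0.9) for c in candidates]

best_sq = float(candidates[int(np.argmin(sq_losses))])  # squared-loss minimizer
best_pb = float(candidates[int(np.argmin(pb_losses))])  # pinball-loss minimizer
frac_below = float(np.mean(y <= best_pb))               # share of y at or below best_pb

print("squared-loss minimizer:", best_sq, "sample mean:", float(y.mean()))
print("pinball minimizer:", best_pb, "empirical 0.9 quantile:", float(np.quantile(y, 0.9)))
print("fraction of observations at or below pinball minimizer:", frac_below)
```

With features x, the same idea carries over by letting c depend on x and minimizing the chosen loss over a model class. The grid search here is only for transparency and is not meant as an efficient estimator.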
CITATION: Perlich, Claudia. Modeling Quantiles. Edited by Wang, John. Hershey: IGI Global, 2008. Encyclopedia of Data Warehousing and Mining, Second Edition. Available at: https://library.au.int/frmodeling-quantiles