Distributions
Another way to plot the distribution of values for a numerical variable is to use a distribution plot. This is essentially a histogram “in disguise”: instead of drawing the bars corresponding to each bin, this kind of plot draws a curve that passes through the tips of each bar. You can create a distribution plot using the Plot.Create.Distribution method.
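As a minimal sketch of how such a call might look, the example below generates some normally distributed samples and plots them. The parameter names binCount and smooth are the ones described on this page; the Render and SaveAsSVG calls, the VectSharp.SVG namespace, and the output file name are assumptions based on how other VectSharp examples save a plot, so check them against the version of the library you are using. The code uses top-level statements (C# 9 or later).

```csharp
using System;
using VectSharp;
using VectSharp.Plots;
using VectSharp.SVG;

// Draw 500 samples from a standard normal distribution (Box-Muller transform).
Random rng = new Random(42);
double[] data = new double[500];

for (int i = 0; i < data.Length; i++)
{
    double u1 = 1.0 - rng.NextDouble(); // In (0, 1], which avoids Math.Log(0).
    double u2 = rng.NextDouble();
    data[i] = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
}

// Use 20 bins and a smooth spline; every other optional parameter keeps its default.
Plot plot = Plot.Create.Distribution(data, binCount: 20, smooth: true);

// Assumption: the plot is rendered to a Page and saved as in other VectSharp examples.
Page page = plot.Render();
page.SaveAsSVG("Distribution.svg");
```

Leaving out binCount and smooth should give the default behaviour described below: an automatically chosen number of bins and straight line segments.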
This method has a single overload. Its first parameter, of type IReadOnlyList<double>, contains the samples whose distribution should be plotted. It also has a number of optional parameters that determine the appearance of some elements of the plot. Many of these are shared with other kinds of plots and are described in the page about scatter plots; the ones specific to this method are:
- bool vertical: if this is true (the default), the base of the distribution lies on the X axis. If this is false, the base of the distribution lies on the Y axis.
- int binCount: this parameter determines the number of bins used to plot the distribution (corresponding to the number of bars in a histogram). If this is ≤ 1 (the default is -1), the number of bins to use is determined automatically using the Freedman-Diaconis rule.
- bool smooth: if this is false (the default), the distribution is plotted using straight line segments that pass through each point. If this is true, a smooth spline that passes through each point is used instead.
- PlotElementPresentationAttributes dataPresentationAttributes: this determines the appearance (stroke, fill, etc.) of the distribution.
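For reference, the Freedman-Diaconis rule sets the bin width to 2·IQR / n^(1/3), where IQR is the interquartile range of the samples and n is the number of samples; the bin count is then the data range divided by that width. The sketch below is a standalone illustration of the rule, not the library's code: the class and method names are hypothetical, and the quantile convention (linear interpolation) and the rounding (ceiling) are assumptions that may differ from what the library does internally.

```csharp
using System;
using System.Linq;

public static class BinCountSketch
{
    // Linear-interpolation quantile of an already sorted array (one common convention among several).
    private static double Quantile(double[] sorted, double p)
    {
        double position = p * (sorted.Length - 1);
        int lower = (int)Math.Floor(position);
        int upper = Math.Min(lower + 1, sorted.Length - 1);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    // Bin count suggested by the Freedman-Diaconis rule: width = 2 * IQR / n^(1/3).
    public static int FreedmanDiaconisBinCount(double[] samples)
    {
        if (samples.Length < 2) return 1;

        double[] sorted = samples.OrderBy(x => x).ToArray();
        double iqr = Quantile(sorted, 0.75) - Quantile(sorted, 0.25);
        double range = sorted[sorted.Length - 1] - sorted[0];

        // Degenerate data (e.g. all values equal): fall back to a single bin.
        if (iqr <= 0 || range <= 0) return 1;

        double binWidth = 2.0 * iqr / Math.Pow(sorted.Length, 1.0 / 3.0);
        return Math.Max(1, (int)Math.Ceiling(range / binWidth));
    }
}
```

Because the width depends on the interquartile range rather than the full range, a few extreme outliers widen the range without widening the bins, so they increase the number of bins.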
Note that the plot produced by this method is NOT a kernel density estimation.
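To make the distinction concrete, the sketch below computes the kind of points a histogram-based curve passes through: the samples are counted into equal-width bins and each bin contributes one point. A kernel density estimate would instead sum a smoothing kernel centred on every individual sample, with no binning step. The class and method names are hypothetical, and both the binning scheme and the use of bin centres as the "tips" of the bars are assumptions made for illustration; the library may place the points differently.

```csharp
using System;
using System.Linq;

public static class HistogramCurveSketch
{
    // Returns one (x, y) point per bin: x is the bin centre, y is the number of samples in that bin.
    // binCount is expected to be at least 1.
    public static (double X, double Y)[] CurvePoints(double[] samples, int binCount)
    {
        double min = samples.Min();
        double max = samples.Max();
        double binWidth = (max - min) / binCount;
        int[] counts = new int[binCount];

        foreach (double sample in samples)
        {
            // The largest sample would land at index binCount, so it is clamped into the last bin.
            int index = binWidth > 0 ? Math.Min((int)((sample - min) / binWidth), binCount - 1) : 0;
            counts[index]++;
        }

        return counts.Select((count, i) => (min + (i + 0.5) * binWidth, (double)count)).ToArray();
    }
}
```

With smooth set to false these points would be joined by straight segments; with smooth set to true a spline would pass through the same points.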
Plotting multiple distributions
If you wish to compare multiple distributions, it may be useful to plot all of them at the same time. This can be achieved with the Plot.Create.StackedDistribution method.
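A minimal sketch of such a call, assuming two sample sets of different sizes, might look like the following. As before, the Render and SaveAsSVG calls and the output file name are assumptions based on other VectSharp examples, and NormalSamples is a hypothetical helper defined only for this example. The code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using VectSharp;
using VectSharp.Plots;
using VectSharp.SVG;

Random rng = new Random(42);

// Hypothetical helper: draws normally distributed samples with the Box-Muller transform.
double[] NormalSamples(double mean, double standardDeviation, int count)
{
    double[] samples = new double[count];
    for (int i = 0; i < count; i++)
    {
        double u1 = 1.0 - rng.NextDouble(); // In (0, 1], which avoids Math.Log(0).
        double u2 = rng.NextDouble();
        samples[i] = mean + standardDeviation * Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
    return samples;
}

// One inner list per distribution; the two sets deliberately differ in size.
List<IReadOnlyList<double>> data = new List<IReadOnlyList<double>>
{
    NormalSamples(0, 1, 500),
    NormalSamples(2, 0.5, 200)
};

Plot plot = Plot.Create.StackedDistribution(data, smooth: true);

// Assumption: the plot is rendered to a Page and saved as in other VectSharp examples.
Page page = plot.Render();
page.SaveAsSVG("StackedDistribution.svg");
```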
The first parameter for this method is an IReadOnlyList<IReadOnlyList<double>> object; each element of this list should itself be a list containing the values for one of the distributions. This method shares a few parameters with the Distribution method: bool vertical, int binCount (if this is ≤ 1, the number of bins is determined independently for each distribution), and bool smooth. Here, dataPresentationAttributes is an IReadOnlyList<PlotElementPresentationAttributes>, where each element determines the appearance of one distribution (if there are more distributions than elements in this list, the values are wrapped).
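The wrapping behaviour can be pictured as cycling through the list with a modulo operation. The helper below is a hypothetical illustration that assumes "wrapped" means exactly this; it is not taken from the library.

```csharp
using System.Collections.Generic;

public static class WrapSketch
{
    // Returns the element used for the distribution at the given index, cycling back
    // to the start of the list when there are more distributions than elements.
    public static T Wrapped<T>(IReadOnlyList<T> items, int index)
    {
        return items[index % items.Count];
    }
}
```

Under this assumption, with two sets of attributes and three distributions, the third distribution would reuse the first set.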
Furthermore, this method has a parameter called normalisationMode, which determines what kind of scaling is applied to the distributions:
- If this is Plot.NormalisationMode.None, the raw bin counts are used. This will cause distributions built with more samples to be taller.
- If this is Plot.NormalisationMode.Maximum, the curves are normalised so that the maxima of all curves have the same height.
- If this is Plot.NormalisationMode.Area (the default), the curves are normalised so that they all have the same area.
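The three modes can be sketched as simple rescalings of the bin counts of one distribution. The code below is an illustration only: the Mode enum is a hypothetical stand-in for Plot.NormalisationMode so that the sketch compiles on its own, and the target values (a maximum of 1 and an area of 1) are assumptions, since this page only states that the maxima or the areas are made equal across curves.

```csharp
using System;
using System.Linq;

public static class NormalisationSketch
{
    // Hypothetical stand-in for Plot.NormalisationMode.
    public enum Mode { None, Maximum, Area }

    // Rescales the bin counts of a single distribution according to the chosen mode.
    public static double[] Normalise(double[] binCounts, double binWidth, Mode mode)
    {
        switch (mode)
        {
            case Mode.Maximum:
            {
                // Tallest bin becomes 1 (assumed target height).
                double max = binCounts.Max();
                return binCounts.Select(c => max > 0 ? c / max : 0.0).ToArray();
            }
            case Mode.Area:
            {
                // Total area of the bars becomes 1 (assumed target area).
                double area = binCounts.Sum() * binWidth;
                return binCounts.Select(c => area > 0 ? c / area : 0.0).ToArray();
            }
            default:
                // Mode.None: raw counts, so larger sample sets give taller curves.
                return (double[])binCounts.Clone();
        }
    }
}
```

For example, counts of 1, 3 and 4 with a bin width of 2 become 0.25, 0.75 and 1 under Maximum, and 0.0625, 0.1875 and 0.25 under Area.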