Kernel Density Plots: Has the histogram had its day?
Simple statistical concepts such as the mean, median, standard deviation, and percentiles are useful for summarising data, but only under certain circumstances. When basic assumptions are not met, any conclusions based on simple summary statistics are likely to be inaccurate. Worse, the numbers give no hint that anything is wrong and can often look perfectly reasonable.

Let's consider a sample of 64 reaction time observations (in milliseconds):

Mean = 387 ms
Median = 340 ms

These look OK until you view the distribution, which is not unimodal.

Despite being a staple of data visualisation, histograms can be a poor method for determining the shape of a given distribution because they are strongly affected by the number of bins used. For example, visualising the same data with only four bins can make the same observations appear normally distributed. Similarly, a box-plot can also hide data irregulari...
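To make the bin-count problem concrete, here is a minimal sketch in plain Python. It uses a made-up, two-cluster sample of 64 values (not the reaction time data described above), and every function name in it is hypothetical. The kernel density estimate uses a Gaussian kernel with Silverman's rule-of-thumb bandwidth, which is an assumption for illustration rather than a recommendation.

```python
import math
import statistics

# Synthetic, made-up reaction times (ms): two clusters, 64 values in total.
# This is NOT the sample from the article; it only mimics a two-peaked shape.
fast = [240 + 3 * i for i in range(40)]
slow = [460 + 5 * i for i in range(24)]
sample = fast + slow


def histogram_counts(data, bins):
    """Count observations in `bins` equal-width bins spanning min..max."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value belongs to the last bin, not a bin of its own.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth (assumed here; other choices are possible)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(statistics.stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * len(data) ** (-1 / 5)


def gaussian_kde(data, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in data)
        for g in grid
    ]


def count_modes(density):
    """Count interior local maxima of a density evaluated on a grid."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] >= density[i + 1]
    )


print("mean:", statistics.mean(sample), "median:", statistics.median(sample))

# The same data, binned coarsely and finely.
for bins in (4, 16):
    print(f"{bins} bins:")
    for count in histogram_counts(sample, bins):
        print("  " + "#" * count)

# Kernel density estimate on a 1 ms grid, extended 4 bandwidths past the data.
h = silverman_bandwidth(sample)
step = 1.0
start = min(sample) - 4 * h
n_points = int((max(sample) - min(sample) + 8 * h) / step) + 1
grid = [start + step * i for i in range(n_points)]
density = gaussian_kde(sample, grid, h)
print("bandwidth:", round(h, 1), "modes found:", count_modes(density))
```

On this synthetic sample, the mean and median alone say nothing about the two clusters, and the picture given by the bin counts changes with the number of bins. The density estimate has no bins to choose, although its bandwidth plays a similar role and deserves the same scrutiny.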

