# Automatically Selecting Histogram Bins

Choosing the bin width for a histogram can be surprisingly tricky. With too few bins, it is hard to pick out the underlying distribution of the data. With too many bins, either the bins degenerate into thin sticks that are unpleasant to look at, or the noise in the data is not sufficiently averaged out; both make the underlying distribution hard to see. Here we present several methods for selecting uniform-width bins for a histogram.

Fixed number of bins: always use the same number of bins, regardless of the data.

Sturges: the number of bins grows with the base-2 logarithm of the number of data points.

Scott: the bin width is proportional to the standard deviation of the values divided by the cube root of the size of the data.

Freedman–Diaconis: the bin width is proportional to the interquartile range of the data divided by the cube root of the size of the data.

Wand: the bin width is chosen to approximately minimize the mean integrated squared error (MISE) between the histogram and the underlying density.
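
As a rough illustration, the Sturges, Scott, and Freedman–Diaconis rules can be sketched in a few lines of Python with NumPy. The text above only says the widths are "proportional to" certain quantities, so the constants used here (3.49 for Scott, 2 for Freedman–Diaconis), the use of the sample standard deviation, and all function names are assumptions taken from the commonly quoted textbook forms, not necessarily the exact variants discussed later. Wand's rule is omitted because it needs a more involved plug-in estimate.

```python
import math

import numpy as np


def sturges_bin_count(data):
    """Sturges: k = ceil(log2(n)) + 1 bins (assumed textbook form)."""
    n = len(data)
    return int(math.ceil(math.log2(n))) + 1


def scott_bin_width(data):
    """Scott: h = 3.49 * s * n^(-1/3), with s the sample standard deviation.

    The constant 3.49 and ddof=1 are assumptions (commonly quoted form).
    """
    data = np.asarray(data, dtype=float)
    return 3.49 * data.std(ddof=1) * len(data) ** (-1 / 3)


def freedman_diaconis_bin_width(data):
    """Freedman-Diaconis: h = 2 * IQR * n^(-1/3) (assumed textbook form)."""
    data = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    return 2.0 * (q3 - q1) * len(data) ** (-1 / 3)


def width_to_bin_count(data, width):
    """Convert a bin width into a number of uniform bins covering the data range."""
    data = np.asarray(data, dtype=float)
    span = data.max() - data.min()
    return max(1, int(math.ceil(span / width)))


# Small demonstration on synthetic, normally distributed data.
rng = np.random.default_rng(0)
sample = rng.normal(size=1000)
print("Sturges bins:", sturges_bin_count(sample))
print("Scott bins:", width_to_bin_count(sample, scott_bin_width(sample)))
print("Freedman-Diaconis bins:",
      width_to_bin_count(sample, freedman_diaconis_bin_width(sample)))
```

For everyday use, NumPy's `np.histogram_bin_edges` accepts `bins="sturges"`, `bins="scott"`, and `bins="fd"`; its implementations may differ slightly from this sketch in their constants and rounding.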