While skewed data distributions (e.g., a large proportion of zeros) invite an array of well-known methodological complications, they also complicate data presentation. For example, a standard histogram for a distribution that includes approximately 85% zeros looks something like the following:
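To make the problem concrete, here is a minimal sketch in Python that simulates a zero-heavy variable and tabulates its bins. The data are hypothetical: the 85% share of zeros comes from the text, but the sample size, the 1 to 10 range of the non-zero values, and the helper name `simulate_zero_inflated` are assumptions made purely for illustration.

```python
import random
from collections import Counter


def simulate_zero_inflated(n=1000, p_zero=0.85, seed=0):
    """Hypothetical data: 0 with probability p_zero, else an integer in 1..10."""
    rng = random.Random(seed)
    return [0 if rng.random() < p_zero else rng.randint(1, 10) for _ in range(n)]


values = simulate_zero_inflated()
counts = Counter(values)  # one bin per integer value
tallest_other = max(c for v, c in counts.items() if v != 0)

print(f"share of zeros: {counts[0] / len(values):.2f}")
print(f"zero bin: {counts[0]}, tallest non-zero bin: {tallest_other}")

# Crude text histogram on a single linear scale: the zero bar sets the scale,
# so every other bar shrinks to a few characters.
width = 60
for v in sorted(counts):
    bar = "#" * round(width * counts[v] / counts[0])
    print(f"{v:>2} | {bar} {counts[v]}")
```

Because every bar is scaled against the zero bin, the non-zero bins become nearly indistinguishable, which is the presentation problem described above.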
Setting aside the methodological complications imposed by such skewed data, one "dissenting" approach to the narrower problem of descriptive data presentation is to insert axis "breaks" (discussion here), which generates the following histogram:
Such a strategy may provide the optimal tradeoff between showing the "extreme" data and compressing the non-extreme data. This approach also has the virtue of keeping the underlying data on their raw scale rather than a transformed one.
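One way to build such a broken axis is sketched below in Python with matplotlib. This is not necessarily the tool used for the figure above; it follows the common two-panel approach, in which the same bars are drawn in two stacked axes with different y-limits. The simulated data, the `pad` margin, and the helper names `break_limits` and `plot_broken_hist` are all assumptions for illustration.

```python
import random
from collections import Counter


def break_limits(counts, pad=1.15):
    """Return (lower_ylim, upper_ylim) for a broken y axis.

    The lower panel spans the non-zero bins; the upper panel covers only the
    region around the top of the zero bin. `pad` is an arbitrary margin.
    """
    zero_count = counts[0]
    tallest_other = max(c for v, c in counts.items() if v != 0)
    lower = (0, tallest_other * pad)
    upper = (zero_count / pad, zero_count * pad)
    if upper[0] <= lower[1]:
        raise ValueError("panels overlap; a break is not needed for these data")
    return lower, upper


def plot_broken_hist(counts, path="broken_hist.png"):
    """Draw the same bars in two stacked panels with different y-limits."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    lower, upper = break_limits(counts)
    xs = sorted(counts)
    heights = [counts[x] for x in xs]
    fig, (ax_top, ax_bot) = plt.subplots(
        2, 1, sharex=True, gridspec_kw={"height_ratios": [1, 3]}
    )
    for ax in (ax_top, ax_bot):
        ax.bar(xs, heights, width=1.0, edgecolor="black")
    ax_top.set_ylim(*upper)  # only the top of the zero bar
    ax_bot.set_ylim(*lower)  # the non-zero bars at full resolution
    # Hide the facing spines so the gap between the panels reads as a break.
    ax_top.spines["bottom"].set_visible(False)
    ax_bot.spines["top"].set_visible(False)
    ax_top.tick_params(axis="x", bottom=False)
    ax_bot.set_xlabel("value")
    ax_bot.set_ylabel("count (raw scale)")
    fig.savefig(path)
    plt.close(fig)
    return path


# Hypothetical data: about 85% zeros, the rest integers in 1..10.
rng = random.Random(0)
values = [0 if rng.random() < 0.85 else rng.randint(1, 10) for _ in range(1000)]
counts = Counter(values)
lower, upper = break_limits(counts)
print("lower panel y-range:", lower)
print("upper panel y-range:", upper)
```

Calling `plot_broken_hist(counts)` writes the figure to `broken_hist.png`. Both panels plot untransformed counts, so the tick labels remain on the raw scale; only the empty stretch between the tallest non-zero bin and the top of the zero bin is left out.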