Saturday, January 25, 2025

Seeing data using exploratory data analysis and histograms












When I get bored, I play the online version of solitaire. Recently I got curious about how much time and how many moves it was taking me to complete a game. What would the averages be? Would their shape just follow a Normal Distribution, the symmetrical bell curve shown above? It has roughly two thirds of the data (68.3%) in a range plus or minus one standard deviation from the mean.
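For anyone who wants to check that 68.3% figure, here is a minimal Python sketch. It uses the standard result that the fraction of a Normal Distribution within k standard deviations of the mean is erf(k/√2). The function name fraction_within is just my own label for illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a Normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# One standard deviation either side of the mean
print(f"{fraction_within(1):.1%}")  # 68.3%
```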









Or would there instead be a skewed distribution, as shown above? I did some exploratory data analysis. I kept track of a hundred games, and then used Microsoft Excel to plot histograms.












A histogram for time to complete is shown above. The mean time to complete was 208 seconds (3 minutes and 28 seconds) and the standard deviation was 27.6 seconds. Columns in this histogram are twenty seconds wide. There are roughly two and a half columns to the left of the mean, but four and a half to the right.
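I made my histograms in Excel, but the same binning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the times list is a made-up sample, not my hundred games, and the name bin_counts is hypothetical. It uses the sample standard deviation (statistics.stdev); swap in statistics.pstdev if you want the population version.

```python
import statistics
from collections import Counter

# Hypothetical completion times in seconds (NOT the actual data from this post)
times = [172, 181, 185, 190, 193, 196, 199, 201, 204,
         207, 210, 214, 219, 226, 235, 247, 262, 281]

BIN_WIDTH = 20  # columns twenty seconds wide, as in the histogram above

def bin_counts(values, width):
    """Count values per column; each key is the lower edge of a column."""
    counts = Counter((int(v) // width) * width for v in values)
    return dict(sorted(counts.items()))

mean = statistics.mean(times)
sd = statistics.stdev(times)
print(f"mean = {mean:.1f} s, standard deviation = {sd:.1f} s")

# Text histogram: one row per column, one '#' per game
for left, n in bin_counts(times, BIN_WIDTH).items():
    print(f"{left:3d}-{left + BIN_WIDTH - 1:3d} | {'#' * n}")
```

Counting how many columns fall on each side of the mean is then a quick way to spot a lopsided shape.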













Another histogram, for the number of moves to complete, is shown above. The mean number of moves was 134 and the standard deviation was 12 moves. Columns in this histogram are ten moves wide. There are three columns to the left of the mean, but four to the right.
















Both histograms have a positive skew, with a longer tail to the right of the mean. They are shaped like the slide on a children’s playground, as shown in the cartoon above.
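Skew can also be put into a number rather than judged by eye. The sketch below computes the moment coefficient of skewness (the third central moment divided by the second central moment raised to the 3/2 power). I did not calculate this for my games, so the sample lists are invented purely to show the sign: positive for a tail to the right, negative for a tail to the left, and zero for a symmetrical set.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Invented samples, for illustration only
print(skewness([1, 2, 3, 4, 5]))     # symmetrical
print(skewness([1, 1, 1, 2, 10]))    # long tail to the right
print(skewness([1, 9, 10, 10, 10]))  # long tail to the left
```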

