
Thursday, October 4, 2012

Adding Measures of Central Tendency to Histograms in R

Building on the basic histogram with a density plot, we can add measures of central tendency (in this case, mean and median) and a legend.

Like last time, we'll use the beaver data from the datasets package.

hist(beaver1$temp, # histogram
 col = "peachpuff", # column color
 border = "black", 
 prob = TRUE, # show densities instead of frequencies
 xlim = c(36,38.5),
 ylim = c(0,3),
 xlab = "Temperature",
 main = "Beaver #1")
lines(density(beaver1$temp), # density plot
 lwd = 2, # thickness of line
 col = "chocolate3")

Next we'll add a line for the mean:

abline(v = mean(beaver1$temp),
 col = "royalblue",
 lwd = 2)

And a line for the median:

abline(v = median(beaver1$temp),
 col = "red",
 lwd = 2)

And then we can also add a legend, so it will be easy to tell which line is which.

legend(x = "topright", # location of legend within plot area
 legend = c("Density plot", "Mean", "Median"), # labels, in the same order as col
 col = c("chocolate3", "royalblue", "red"),
 lwd = c(2, 2, 2))

All of this together gives us the following graphic:


In this example, the mean and median are very close, as we can see by using mean() and median().

> mean(beaver1$temp)
[1] 36.86219
> median(beaver1$temp)
[1] 36.87
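
Because the two values are this close, the mean and median lines nearly overlap on the plot. One option is to put the numbers into the legend labels and give each line its own line type. This is only a sketch and not part of the original code; the names temp and labels and the choice of lty values are assumptions made for illustration.

```r
temp <- beaver1$temp

# Legend labels that carry the values, rounded to two decimals
labels <- c("Density plot",
            sprintf("Mean (%.2f)", mean(temp)),
            sprintf("Median (%.2f)", median(temp)))

hist(temp, col = "peachpuff", border = "black", prob = TRUE,
     xlim = c(36, 38.5), ylim = c(0, 3),
     xlab = "Temperature", main = "Beaver #1")
lines(density(temp), lwd = 2, col = "chocolate3")
abline(v = mean(temp), col = "royalblue", lwd = 2, lty = 2)  # dashed
abline(v = median(temp), col = "red", lwd = 2, lty = 3)      # dotted

legend(x = "topright",
       legend = labels,
       col = c("chocolate3", "royalblue", "red"),
       lwd = 2,
       lty = c(1, 2, 3))  # same line types as the lines drawn above
```

The different line types also keep the lines distinguishable if the plot is printed in greyscale.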

As in the previous post, we can graph beaver1 and beaver2 together by adding a layout call and changing the x and y limits. The full code for this is available in a gist.
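
The gist itself is not reproduced here. As a rough sketch of what adding a layout call and changing the limits could look like, the code below wraps the steps from this post in a helper and draws one panel per beaver. The helper name plot_beaver, the temps list, and the way ylim is derived from the data are all assumptions for illustration, not the gist's code.

```r
# Hypothetical helper: histogram, density, mean, median and legend for one beaver
plot_beaver <- function(temp, title, xlim, ylim) {
  hist(temp, col = "peachpuff", border = "black", prob = TRUE,
       xlim = xlim, ylim = ylim, xlab = "Temperature", main = title)
  lines(density(temp), lwd = 2, col = "chocolate3")
  abline(v = mean(temp), col = "royalblue", lwd = 2)
  abline(v = median(temp), col = "red", lwd = 2)
  legend(x = "topright",
         legend = c("Density plot", "Mean", "Median"),
         col = c("chocolate3", "royalblue", "red"),
         lwd = 2)
}

temps <- list("Beaver #1" = beaver1$temp, "Beaver #2" = beaver2$temp)

# Shared y limit: the tallest histogram bar or density peak across both beavers
ymax <- max(sapply(temps, function(x) {
  max(hist(x, plot = FALSE)$density, density(x)$y)
}))

layout(matrix(1:2, 2, 1))  # two rows, one column
for (name in names(temps)) {
  plot_beaver(temps[[name]], name, xlim = c(36, 38.5), ylim = c(0, ymax))
}
layout(1)  # back to a single plot per page
```

Using the same xlim and ylim in both panels means the shapes can be compared directly instead of each panel rescaling to its own data.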

Here's the output from that code:


Thursday, September 27, 2012

Histogram + Density Plot Combo in R

Plotting a histogram using hist from the graphics package is pretty straightforward, but what if you want to view the density plot on top of the histogram? This combination of graphics can help us compare the distributions of groups.

Let's use some of the data included with R in the datasets package. It helps to have two things to compare, so we'll use the beaver data sets, beaver1 and beaver2, which record the body temperatures of two beavers taken at 10-minute intervals.
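
Before plotting, it can help to glance at what is in the data. A quick look, using only base R (nothing here is specific to this post beyond the two data set names):

```r
# Structure of one data set: column names and types
str(beaver1)

# Number of readings for each beaver
n1 <- nrow(beaver1)
n2 <- nrow(beaver2)
print(c(beaver1 = n1, beaver2 = n2))

# Five-number summary plus mean of the temperature column we will plot
print(summary(beaver1$temp))
print(summary(beaver2$temp))
```

Only the temp column is used in the plots that follow.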

First we want to plot the histogram of one beaver:

hist(beaver1$temp, # histogram
 col="peachpuff", # column color
 border="black",
 prob = TRUE, # show densities instead of frequencies
 xlab = "temp",
 main = "Beaver #1")
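
To see what prob = TRUE changes, here is a small check, offered as a sketch: it computes the histogram without drawing it and confirms that the bar areas add up to 1. The variable names h and bar_area are just for illustration.

```r
# Compute the histogram without drawing it
h <- hist(beaver1$temp, plot = FALSE)

# With prob = TRUE the bar heights come from h$density rather than h$counts.
# Each height times its bin width is the share of observations in that bin,
# so the areas sum to 1.
bar_area <- sum(h$density * diff(h$breaks))
print(bar_area)
```

This is why the density scale is needed here: a density curve also has total area 1, so it lines up with the bars only when they are drawn as densities.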

Next, we want to add in the density line, using lines:


hist(beaver1$temp, # histogram
 col="peachpuff", # column color
 border="black",
 prob = TRUE, # show densities instead of frequencies
 xlab = "temp",
 main = "Beaver #1")
lines(density(beaver1$temp), # density plot
 lwd = 2, # thickness of line
 col = "chocolate3")
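
The smoothness of the density line depends on the bandwidth that density chooses. As a sketch of how to experiment with that, the adjust argument scales the default bandwidth; the value 2 and the names d and d_smooth below are arbitrary choices for illustration.

```r
d <- density(beaver1$temp)                     # default bandwidth
d_smooth <- density(beaver1$temp, adjust = 2)  # twice the default bandwidth

print(d$bw)
print(d_smooth$bw)

hist(beaver1$temp, col = "peachpuff", border = "black", prob = TRUE,
     xlab = "temp", main = "Beaver #1")
lines(d, lwd = 2, col = "chocolate3")                  # default
lines(d_smooth, lwd = 2, col = "chocolate3", lty = 2)  # smoother, dashed
```

A larger bandwidth smooths over small bumps, while a smaller one follows the histogram more closely.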


Now let's show the plots for both beavers on the same image. We'll make a histogram and density plot for Beaver #2, stack the two graphs using layout, write them to a file by wrapping the code in png and dev.off, and give both graphs the same x-axis using xlim.


Here's the final code, also available on gist:

png("beaverhist.png") # open a PNG file to draw into
layout(matrix(c(1:2), 2, 1, # two rows, one column: plots stacked vertically
 byrow = TRUE))
hist(beaver1$temp, # histogram
 col = "peachpuff", # column color
 border = "black",
 prob = TRUE, # show densities instead of frequencies
 xlim = c(36,38.5),
 xlab = "temp",
 main = "Beaver #1")
lines(density(beaver1$temp), # density plot
 lwd = 2, # thickness of line
 col = "chocolate3")
hist(beaver2$temp, # histogram
 col = "peachpuff", # column color
 border = "black",
 prob = TRUE, # show densities instead of frequencies
 xlim = c(36,38.5),
 xlab = "temp",
 main = "Beaver #2")
lines(density(beaver2$temp), # density plot
 lwd = 2, # thickness of line
 col = "chocolate3")
dev.off() # close the PNG device so the file is written
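
The x limits above are hard-coded. As an alternative sketch, the shared limits can be taken from the breaks that hist would pick for each data set, and the two nearly identical blocks can become a loop. The names temps, all_breaks and shared_xlim are assumptions for illustration. To write a file, wrap the plotting part in png and dev.off as in the code above.

```r
temps <- list("Beaver #1" = beaver1$temp, "Beaver #2" = beaver2$temp)

# Breaks hist() would choose for each data set, without drawing anything
all_breaks <- unlist(lapply(temps, function(x) hist(x, plot = FALSE)$breaks))

# One x range that covers every bar in both histograms
shared_xlim <- range(all_breaks)
print(shared_xlim)

layout(matrix(1:2, 2, 1))  # two rows, one column
for (name in names(temps)) {
  hist(temps[[name]], col = "peachpuff", border = "black", prob = TRUE,
       xlim = shared_xlim, xlab = "temp", main = name)
  lines(density(temps[[name]]), lwd = 2, col = "chocolate3")
}
layout(1)  # back to a single plot per page
```

Deriving the limits from the breaks rather than from the raw data range keeps the outermost bars from being cut off at the edge of the plot.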