- The original value of a variable
- The new value of the variable
- The change between old and new
First, let's load ggplot2 and our data:
library(ggplot2)
data <- as.data.frame(USPersonalExpenditure) # data from package datasets
data$Category <- as.character(rownames(USPersonalExpenditure)) # this makes things simpler later
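A quick look at the resulting data frame shows why the year columns need special handling later: they keep numeric-looking names such as 1940, so they have to be quoted wherever they are referenced. A minimal check using only base R:

```r
# USPersonalExpenditure ships with the built-in datasets package
data <- as.data.frame(USPersonalExpenditure)
data$Category <- rownames(USPersonalExpenditure)

# The year columns are named "1940", "1945", ..., which are not valid
# R identifiers, so they must be quoted or wrapped in backticks
names(data)

# Two equivalent ways to pull out the 1940 column
data[["1940"]]
data$`1940`
```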
Next, we'll set up our plot and axes:
ggplot(data, aes(y = Category)) +
  labs(x = "Expenditure", y = "Category") +
geom_segment needs four aesthetics: x and y give the start point of each segment, and xend and yend give its end point. (Two of the four can be the same variable, as they are here: y and yend are both Category.)
In this case, we want to show the change between 1940 and 1960 for each category. Therefore our variables are the following:
- x: "1940"
- y: Category
- xend: "1960"
- yend: Category
  # backticks refer to the year columns by name; avoid data$ inside aes()
  geom_segment(aes(x = `1940`, y = Category, xend = `1960`, yend = Category), size = 1) +
Next, we want to plot points for the 1940 and 1960 values. We could do the same for the 1945, 1950, and 1955 values, if we wanted to.
  # the constant color strings become the legend entries
  geom_point(aes(x = `1940`, color = "1940"), size = 4, shape = 15) +
  geom_point(aes(x = `1960`, color = "1960"), size = 4, shape = 15) +
Finally, we'll finish up by touching up the legend for the plot:
  scale_color_discrete(name = "Year") +
  theme(legend.position = "bottom")
[Figure: geom_segment, then geom_point]
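For convenience, here is one way the pieces above could be assembled into a single runnable script. This is a sketch rather than the exact original code: it stores the plot in a variable `p` (a name chosen here for illustration), refers to the year columns with backticks, and uses `linewidth`, which assumes ggplot2 3.4.0 or later (older versions use `size` for line thickness).

```r
library(ggplot2)

# Data from the built-in datasets package, with categories as a column
data <- as.data.frame(USPersonalExpenditure)
data$Category <- rownames(USPersonalExpenditure)

p <- ggplot(data, aes(y = Category)) +
  labs(x = "Expenditure", y = "Category") +
  # One segment per category, from the 1940 value to the 1960 value
  geom_segment(aes(x = `1940`, y = Category, xend = `1960`, yend = Category),
               linewidth = 1) +
  # Square points at both ends; the color strings become legend labels
  geom_point(aes(x = `1940`, color = "1940"), size = 4, shape = 15) +
  geom_point(aes(x = `1960`, color = "1960"), size = 4, shape = 15) +
  scale_color_discrete(name = "Year") +
  theme(legend.position = "bottom")

print(p)
```

Keeping the plot in a variable makes it easy to save it afterwards, for example with `ggsave()`.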
The order of geom_segment and the geom_points matters. ggplot2 draws layers in the order they appear in the code, so the first geom is drawn first and each later geom is drawn on top of it. Therefore, if you want the points displayed over the segments, put the segments first in the code. Likewise, if you want the segments displayed over the points, put the points first in the code.
For example, we could change the middle section of the code to:
  # points first, so the segments are drawn on top of them
  geom_point(aes(x = `1940`, color = "1940"), size = 4, shape = 15) +
  geom_point(aes(x = `1960`, color = "1960"), size = 4, shape = 15) +
  geom_segment(aes(x = `1940`, y = Category, xend = `1960`, yend = Category), size = 1) +
And the output would look like:
[Figure: geom_point, then geom_segment]
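If it is unclear which layer will end up on top, the layer order can be inspected without rendering anything. The sketch below builds both variants and lists their geoms in drawing order. The helper `layer_geoms` is a hypothetical name introduced here, and it relies on the `layers` element of a plot object, which is closer to an implementation detail than a documented interface.

```r
library(ggplot2)

data <- as.data.frame(USPersonalExpenditure)
data$Category <- rownames(USPersonalExpenditure)

base <- ggplot(data, aes(y = Category))
seg  <- geom_segment(aes(x = `1940`, xend = `1960`, yend = Category))
pts  <- geom_point(aes(x = `1940`), size = 4, shape = 15)

segments_first <- base + seg + pts  # points drawn over segments
points_first   <- base + pts + seg  # segments drawn over points

# Hypothetical helper: geom class of each layer, in drawing order
layer_geoms <- function(p) {
  vapply(p$layers, function(l) class(l$geom)[1], character(1))
}

layer_geoms(segments_first)
layer_geoms(points_first)
```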
Similarly, if some of your points will overlap, think about which geom_point layer you want R to draw first, since the layer that comes later in the code ends up on top.
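Overlap becomes more likely if the 1945, 1950, and 1955 values are plotted as well. One possible approach, sketched here under some assumptions, is to reshape the data to long format with base R and draw all years in a single geom_point layer. The names `m`, `long`, and `p_all` are illustrative. As far as I know, points within one layer are drawn in row order, so sorting the rows is one way to decide which year sits on top; treat that as an assumption to verify on your own plot.

```r
library(ggplot2)

m <- USPersonalExpenditure  # 5 categories x 5 years matrix

# as.vector() reads the matrix column by column, so each year's
# values are contiguous: categories cycle, years repeat in blocks
long <- data.frame(
  Category    = rep(rownames(m), times = ncol(m)),
  Year        = rep(colnames(m), each = nrow(m)),
  Expenditure = as.vector(m)
)

# Sort so that 1940 comes last and is therefore drawn last
long <- long[order(long$Year, decreasing = TRUE), ]

p_all <- ggplot(long, aes(x = Expenditure, y = Category, color = Year)) +
  geom_point(size = 4, shape = 15) +
  theme(legend.position = "bottom")

print(p_all)
```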
The code is available in a gist.