Scatterplots

Scatterplots are obtained by plotting two numeric vectors, which may (but need not) be two variables in a data.frame. Note, however, that plot is a generic function: what it draws depends on the class of the object it is given. Passing a factor, for instance, produces a bar chart rather than a scatterplot.

plot(mydata$age, mydata$income)

will plot the variable 'age' (from data.frame 'mydata') on the x axis and the corresponding values of 'income' on the y axis.
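To try this out without a dataset at hand, the following minimal sketch may help. It builds a simulated stand-in for 'mydata' (the values are invented purely for illustration) and also shows how plot reacts to a formula and to a factor.

```r
# Simulated stand-in for 'mydata'; all values are invented for illustration.
set.seed(1)
mydata <- data.frame(
  age    = round(runif(200, min = 18, max = 80)),
  income = round(rlnorm(200, meanlog = 10, sdlog = 0.5))
)

# Two numeric vectors: x first, then y.
plot(mydata$age, mydata$income)

# The same scatterplot via the formula interface (y ~ x).
plot(income ~ age, data = mydata)

# A factor is handled differently: plot draws a bar chart of the counts.
age_group <- cut(mydata$age, breaks = c(17, 30, 50, 80))
plot(age_group)
```

The formula version is often more convenient because the axis labels come out as 'age' and 'income' rather than 'mydata$age' and 'mydata$income'.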

With large datasets, scatterplots become less useful because many data points overlap. You might wish to try one of the following (note that the two packages, ggplot2 and hexbin, have to be installed first, e.g. with install.packages(), if you have not yet done so):

library(ggplot2)
library(hexbin)
# Alpha blending: semi-transparent points
ggplot(mydata, aes(x = age, y = income)) + geom_point(alpha = 0.3)
# Hexagonal binning: counts per hexagon
ggplot(mydata, aes(x = age, y = income)) + stat_binhex()

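The first ggplot command uses 'alpha blending', which makes each point partly transparent, so that areas where points overlap appear darker. The second command produces hexagonal binning: the plot area is divided into hexagons, and each hexagon is coloured according to the number of points falling into it. The idea is similar to that of sunflower plots.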
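Sunflower plots themselves are available in base R, so a rough comparison needs no extra packages. The sketch below is only an illustration on simulated data (the vectors 'age' and 'income' are invented, and income is rounded so that many points coincide). It uses sunflowerplot from the graphics package and xyTable from grDevices, which counts how often each (x, y) pair occurs.

```r
# Simulated data with many exact ties; values are invented for illustration.
set.seed(42)
n      <- 5000
age    <- sample(18:80, n, replace = TRUE)
income <- round(rlnorm(n, meanlog = 10, sdlog = 0.5), -3)  # nearest 1000

# Count how many observations share each distinct (age, income) pair.
tab <- xyTable(age, income)

# Sunflower plot: each 'petal' stands for one observation at that location.
sf <- sunflowerplot(age, income)
```

With several thousand points the petals tend to become crowded as well, which is one reason hexagonal binning is often preferred for large datasets.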

© W. Ludwig-Mayerhofer, R Guide | Last update: 02 Apr 2017