About this Guide

This guide is meant to serve as a brief "online help" for R on the World Wide Web. Please note that it does not introduce you to the basics of statistical analysis; it is assumed that you have acquired such knowledge elsewhere. Likewise, it is assumed that you have some proficiency in working with computers, particularly under Windows.

This is my most recent project and therefore it is very, very incomplete. As always, it serves basically as a 'hard copy' of my own very fragile memory. Still, you may find something of interest to you.

Who might profit from this guide?

The brief answer is: Hopefully anybody wanting to learn R – but particularly social scientists (including students). You should also have a basic command of statistics, and those who have already made some acquaintance with statistics software will most likely profit more than others.

A somewhat longer answer: I am a social scientist working with 'quantitative' data that can be analyzed with the help of statistical procedures. The majority of students and colleagues who share this endeavour use software that is tailored to the needs of social scientists, such as IBM-SPSS®, Stata®, SAS®, Minitab®, Systat®, SPlus®, or a variety of other packages. Such software provides easy access to complex statistical procedures and requires no mathematical or programming proficiency on the part of the user. For instance, many statistical models involve the inversion of matrices, but probably the majority of (student) users are not aware of this; they just type a few keywords, and the rest is done by the software.
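To make the remark about matrix inversion concrete, here is a minimal sketch in R with simulated data (all variable names are made up for illustration). It compares the "few keywords" route, lm(), with the textbook least-squares formula (X'X)^(-1) X'y written out by hand. Note that lm() itself relies on a QR decomposition rather than an explicit inverse; the hand-written version is only meant to show the kind of computation that software hides from the user.

```r
# Simulated data; x and y are hypothetical variables
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# The "few keywords" route: the software does the rest
fit <- lm(y ~ x)
coef(fit)

# The same estimates via explicit matrix inversion: (X'X)^(-1) X'y
X <- cbind(1, x)                         # design matrix with an intercept column
b <- solve(t(X) %*% X) %*% t(X) %*% y    # solve() with one argument inverts a matrix
drop(b)                                  # turn the 2 x 1 matrix into a plain vector
```

Both routes should give the same intercept and slope, up to numerical rounding.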

Actually, many statistical software packages come with a programming language of their own that permits users to write procedures for special purposes, but most users are happy with the ready-made procedures offered by the software, or with additional programmes written by expert users that work just like the ready-mades.

R sits somewhere between a statistical analysis package and a programming language. In fact, the R core group states: "R is a language and environment for statistical computing and graphics." (What is R?). Indeed, many statistical procedures come with the R core package, and tons of others that were written by users can be downloaded and installed. In this respect, R does not differ very much from standard statistical software. But at the same time, its character as a programming language almost always makes itself felt.

This has huge advantages. For instance, most statistical software works on a given data set. R can do this as well, but it does not privilege data sets to the extent that other software does. In R, a data set is one of possibly many 'objects' (which may or may not include other data sets), and each object that has been defined or loaded is available for whatever can be done with an object of its kind.
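The idea that a data set is just one object among many can be sketched as follows. This is a small illustration with invented objects (the names survey_a, survey_b, wts and fit are hypothetical), not a recommended workflow.

```r
# Two data sets and two other objects side by side in one workspace
survey_a <- data.frame(id = 1:3, income = c(2100, 3400, 1800))
survey_b <- data.frame(id = 1:2, age = c(34, 51))
wts      <- c(0.8, 1.1, 1.1)                   # a plain numeric vector
fit      <- lm(income ~ id, data = survey_a)   # a fitted model is an object, too

ls()   # lists the objects currently defined

# Any object can be used at any time; no data set is "the active one"
mean(survey_a$income)
mean(survey_b$age)
weighted.mean(survey_a$income, wts)
```

Nothing has to be "opened" or "closed" to switch between the two data sets; each is addressed by its name.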

This implies that data may be available in different 'formats', which in R are called classes; even though there is a special class called data.frame, data need not be formatted as a data.frame to be analyzable. At the same time, the class of an object determines how it has to be handled.
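A brief sketch of what classes mean in practice, using small invented objects (all names are hypothetical): class() reports the class of an object, and a generic function such as summary() behaves differently depending on that class.

```r
# The same kind of information held in objects of different classes
df  <- data.frame(sex = c("f", "m", "f"), score = c(12, 15, 9))
m   <- matrix(1:6, nrow = 2)
v   <- c(12, 15, 9)
tab <- table(df$sex)

class(df)    # a data.frame
class(m)     # a matrix (recent R versions additionally report "array")
class(v)     # a numeric vector
class(tab)   # a table

# Data need not be in a data.frame to be analyzed
mean(v)

# The class decides what a generic function does with the object
summary(df)  # one summary per column
summary(v)   # a single five-number summary plus the mean
```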

Obviously, users have to understand a bit about the general principles of R, even if they want to do 'just some statistics'. What's more, I have found that there are not many texts (books, websites etc.) around that are really helpful for the average social science user who wants to come to terms with these general principles. Be that as it may, what this guide presents is my own attempt to come to grips with R from the perspective of a social scientist. That is, it presents some of the 'basics' of working with R, and then tries to proceed to the main task of social scientists, i.e., analyzing data. (At the moment, this latter part is under development; you will find only a few entries, and some of these are quite specialized.) Throughout I will emphasize the use of data.frames, i.e., the class (or 'format') in which social science data will typically be available, but I will also point out other possibilities.

For beginners, my advice is to work through the section "Basics" and then to proceed according to your needs. And lest you say "why did nobody warn me?", here's (or, more exactly: there was on 23 January 2017, and still on 03 May 2025) a piece that describes rather nicely the experience you are likely to have when you really embark on working with R.

Note of thanks

David Peplow did a great job at creating the design for this (and related) project(s).

History

May/June 2025

As I retired in 2020, and have not done much data analysis work since (and virtually none with R), this guide has been left as it is for several years. However, I have rewritten and expanded several entries about graphs during the past weeks. Not much else will happen in the future, I'm afraid.

June 2018

Since I started this guide, minor additions or amendments have been made here and there. But there has been no real progress.

Summer/Autumn/Winter 2016

Well, as yet there is not much history. With one exception, most of what you find here (which is very little!) was written in the summer and autumn of 2016 – the time when, for whatever obscure reason, I seriously tried to come to grips with R.

About the Author

This page is a project initiated and maintained by

Prof. Dr. Wolfgang Ludwig-Mayerhofer
University of Siegen
Faculty of Arts and Humanities
D – 57068 Siegen

Homepage at the University of Siegen

Last update: 03 May 2025
