Vectors and Matrices

Vectors and matrices are the basic elements from which statistics are built. You may have come across the notion of a "data matrix", which is more or less the same as a "data set". But actually any collection of items that is ordered by rows and columns may be considered a matrix. The rows of the matrix, in turn, can be considered as vectors, as can the columns. That is, whereas a matrix consists of (usually several) rows and columns (and therefore is a two-dimensional object), a vector is one-dimensional. (As an aside, objects with more than two dimensions are called arrays; they will not be dealt with here.)

In R, matrices are special kinds of objects and constitute a special class (it may be advisable to read the entry about objects before continuing if you are new to R and to this guide). A data matrix can be an R matrix, but more often than not it will belong to a special class called data.frame which will be explained in the following entry. The peculiarity of R matrices and vectors is that the elements they consist of all have to be of the same mode. In other words, they will consist either of numbers, or characters, or logical values (TRUE or FALSE). (They may also consist of complex numbers, which normally are irrelevant for statistics, or of elements of mode "raw", which are something for computer specialists [such elements hold raw bytes].)

The only seeming exception to the rule "all elements of the same mode" is the following: Vectors and matrices may contain elements that indicate that a certain value is not known. These elements are represented as NA (not available), or possibly, in the case of a numeric vector/matrix, as NaN (not a number). (Strictly speaking, R gives an NA the mode of the vector or matrix that contains it, so the rule still holds.) NA and NaN are not dealt with in this entry, nor, for the time being, in any other entry.

Even if you are a social scientist mainly interested in analyzing data, it can be useful to know a little bit about vectors and matrices (and about other object classes as well). The reason is that you may easily encounter examples on the world wide web or in texts where such knowledge is required for a deeper understanding.

Vectors

A vector, then, is a one-dimensional collection of elements of the same mode. To create a vector from scratch, you may use c() for concatenate, as in:

aaa <- c(1, 2, 5, 6)

Note the commas that separate the values. "aaa" will be a vector consisting of the four elements 1, 2, 5 and 6.

If the list of elements contains at least one character string (i.e., an alphanumeric element), all elements will be converted to alphanumeric:

aab <- c(1, 2, 5, "a")

will result in the vector "1", "2", "5" and "a". R uses double quotes to make clear that items that look like numbers actually are treated as characters.

Even if the vector consists of numbers only, you may force it to be alphanumeric by putting the numbers between quotes:

aac <- c("1", "2", "5", "6")

Actually, it would be sufficient to put a single number between quotes; the vector will still be alphanumeric. Note that you may also enter single quotes; R, however, will always show you double quotes.

Finally, note that you can use the c() function to concatenate vectors. For instance,

aac <- c(aaa, aab)

will yield the vector "1" "2" "5" "6" "1" "2" "5" "a".
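The conversion to alphanumeric described above can be checked with mode(). The following is only a small sketch that repeats the vectors from the text; the name "both" is invented for this illustration.

```r
aaa <- c(1, 2, 5, 6)     # numbers only
aab <- c(1, 2, 5, "a")   # one character element turns all elements into characters

mode(aaa)   # "numeric"
mode(aab)   # "character"

# Concatenating a numeric and a character vector converts the numbers as well
both <- c(aaa, aab)
mode(both)  # "character"
both        # "1" "2" "5" "6" "1" "2" "5" "a"
```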

Some things to know about vectors

1. In mathematical treatises, you will often find that vectors are considered either as row vectors (all items in a single row) or as column vectors (all items in a single column). R offers flexibility in this respect. For instance, if you have a matrix with two columns and four rows plus a vector of two elements, you may consider the vector as a row vector and add it to the bottom (or the top) of the matrix (the command is rbind and will be explained shortly). But the same vector may also be added to a matrix with four columns and two rows, in this case as a fifth column (use cbind to accomplish this).

2. If you encounter an object and need to know whether it is a vector, use is.vector. In the following example,

aad <- 1
is.vector(aad)

the answer will be TRUE. Writing, e.g., av <- is.vector(aad) will create an object called "av" that contains the value TRUE.

3. The example just presented demonstrates that vectors can consist of a single element only. Mathematicians call such an object a scalar, but "scalar" is not an R class. If, for whatever reason, you have to know whether a vector actually is a scalar, you may inquire about this with the help of package assertthat. Thus,

library(assertthat)
is.scalar(aad)

will yield TRUE. (Of course, you will typically use something like test <- is.scalar(aad) in a piece of program; when working interactively, you might just as well inspect [i.e., print] "aad".)

4. You may inquire about the number of elements in a vector by

length(aaa)

Of course, the result may be stored in an object with al <- length(aaa).

Below, you will learn about the dim command, with the help of which you may inquire about the size of each dimension of an object. In the case of matrices, this will be the number of rows and the number of columns. In mathematics, a row vector is a vector of dimension "1, n" with n as the number of elements. But as R does not distinguish between row and column vectors, the answer to an inquiry about the dimension of a vector will be NULL.
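The points in the list above can be tried out as follows. This is merely a sketch; the matrices "m42" and "m24" and the vector "vv" are invented for the illustration, and the check for a scalar uses base R only (as an alternative to, not a copy of, what package assertthat does).

```r
m42 <- matrix(1:8, nrow = 4, ncol = 2)  # four rows, two columns
m24 <- matrix(1:8, nrow = 2, ncol = 4)  # two rows, four columns
vv  <- c(10, 20)                        # a vector of two elements

# Treated as a row: added to the bottom of the 4 x 2 matrix
m_rows <- rbind(m42, vv)
dim(m_rows)   # 5 2

# Treated as a column: added as a fifth column to the 2 x 4 matrix
m_cols <- cbind(m24, vv)
dim(m_cols)   # 2 5

# A vector has a length, but no dimensions
length(vv)        # 2
is.null(dim(vv))  # TRUE

# A "scalar" in base R terms: a vector of length one
aad <- 1
is.vector(aad) && length(aad) == 1   # TRUE
```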

Elements of vectors

If you want to know what is the, say, 3rd element of a given vector, say, "aaa", write

aaa[3]

Note the square brackets, from which R will infer that you inquire about an element of an object. In contrast, writing "aaa(3)" will result in an error message.
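Square brackets can do a little more than pick a single element. The following sketch shows a few common variants; the object "aaa_copy" is invented here so that "aaa" itself remains unchanged.

```r
aaa <- c(1, 2, 5, 6)

aaa[3]        # 5: the third element
aaa[c(1, 4)]  # 1 6: the first and the fourth element
aaa[2:3]      # 2 5: the second to the third element
aaa[-1]       # 2 5 6: everything except the first element

# Square brackets may also be used to change an element
aaa_copy <- aaa
aaa_copy[3] <- 50
aaa_copy      # 1 2 50 6
```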

Matrices

A matrix "from scratch"

You may create a matrix "from scratch" as follows:

v4 <- matrix(c(1,2,3,4),ncol=2)

The first two elements will be used to create the first column, and the other two will be stored in the second column. The same will happen if you write the command as follows:

v4 <- matrix(c(1,2,3,4),nrow=2)
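That both commands lead to the same matrix can be verified with identical(). In this sketch the two results are stored under different names (invented for the purpose) so that they can be compared.

```r
v4_by_ncol <- matrix(c(1, 2, 3, 4), ncol = 2)
v4_by_nrow <- matrix(c(1, 2, 3, 4), nrow = 2)

identical(v4_by_ncol, v4_by_nrow)  # TRUE

v4_by_ncol
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4
```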

Combining vectors to matrices

Assuming that you have two or more vectors of the same length, you may combine these into a matrix. You must inform R whether the vectors are to be combined as columns or as rows.

Vectors as columns

ba <- cbind(aaa, aab)

will result in a matrix the first column of which contains the elements of "aaa", whereas the second column will contain the elements of "aab". Note that if one of the vectors is alphanumeric (as was the case with "aab" in our example above), the entire matrix will be alphanumeric.

Vectors as rows

bb <- rbind(aaa, aab)

will put the elements of "aaa" into the first row of the matrix and those of "aab" into the second row.

What if the vectors don't have the same length?

R will still create the matrix, repeating the elements of the shorter vector. A warning will be issued if the length of the longer vector is not a multiple of the length of the shorter one.
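The behaviour with vectors of unequal length can be explored with a few lines of code. In the following sketch, "gives_warning" is a small helper function written for this illustration only (it is not part of R); it reports whether a command raises a warning.

```r
# Helper for this sketch: TRUE if evaluating the expression raises a warning
gives_warning <- function(expr) {
  tryCatch({
    expr
    FALSE
  }, warning = function(w) TRUE)
}

long_vec <- c(1, 2, 3, 4)

# Lengths 4 and 2: the shorter vector fits in exactly twice, no warning
gives_warning(cbind(long_vec, c(10, 20)))      # FALSE

# Lengths 4 and 3: 4 is not a multiple of 3, so R issues a warning
gives_warning(cbind(long_vec, c(10, 20, 30)))  # TRUE

# The second column of the first case is 10 20 10 20
recycled <- cbind(long_vec, short_vec = c(10, 20))
recycled[, "short_vec"]
```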

Transforming a vector into a matrix

A single vector may be transformed into a matrix. Simply writing

ca <- matrix(aaa)

will create a matrix with a single column and, consequently, the number of rows equal to the number of elements of "aaa".

You can influence the number of rows and columns of the matrix as follows (note that "aaa" has four elements):

ca <- matrix(aaa, nrow=2, ncol=2)

This will create a 2 x 2 matrix. Actually, in this example either nrow or ncol could have been omitted, as a vector of four elements arranged in two rows (or two columns) will automatically have two columns (or two rows).

Note that the matrix will be filled column-wise; in our example, the first two elements of "aaa" will end up in the first column of the matrix and the other two in the second. This can be changed as demonstrated in the next example:

ca <- matrix(aaa, nrow=2,byrow=TRUE)

Now, the first two elements of "aaa" will end up in the first row, and so on.
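The difference between the two ways of filling a matrix is perhaps best seen side by side. The names "by_col" and "by_row" are invented for this sketch.

```r
aaa <- c(1, 2, 5, 6)

by_col <- matrix(aaa, nrow = 2)                # default: filled column-wise
by_row <- matrix(aaa, nrow = 2, byrow = TRUE)  # filled row-wise

by_col
#      [,1] [,2]
# [1,]    1    5
# [2,]    2    6

by_row
#      [,1] [,2]
# [1,]    1    2
# [2,]    5    6
```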

What if the number of elements and the number of rows and columns don't match?

R will fill the matrix by repeating the vector (in R parlance, this is called "recycling"); however, it will issue a warning if the vector does not fit evenly into the matrix, as is the case in the following example. The command

v5 <- matrix(c(1,2,3,4,5),nrow=3,ncol=7)

will create the following matrix:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]     1    4    2    5    3    1    4
[2,]     2    5    3    1    4    2    5
[3,]     3    1    4    2    5    3    1
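The pattern in this matrix can be checked in a few lines. The sketch below uses suppressWarnings() merely to keep the output quiet; the names "vals", "k" and "v6" are invented for the illustration.

```r
vals <- c(1, 2, 3, 4, 5)

# 3 x 7 = 21 cells but only 5 values: R recycles the values and warns
v5 <- suppressWarnings(matrix(vals, nrow = 3, ncol = 7))

# Counting the cells column by column, cell number k holds
# value number ((k - 1) %% 5) + 1
k <- 1:21
all(v5[k] == vals[((k - 1) %% 5) + 1])  # TRUE

# With 6 values and 3 x 4 = 12 cells the values fit in exactly twice,
# so no warning is issued
v6 <- matrix(1:6, nrow = 3, ncol = 4)
v6[, 3]  # 1 2 3: the third column starts again with the first value
```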

Row and column names

The matrix exhibited in the previous paragraph shows that by default the rows and columns of a matrix are numbered. You may create labels, or names, that can be shown in place of these numbers. Above, we have created matrix "v4" with two rows and two columns. Assuming that the columns represent the "input" and the "output", respectively, you may write

cn <- c("Input","Output")
colnames(v4) <- cn

Now, "Input" and "Output" will appear instead of the column numbers [,1] and [,2] if you type v4 to have a look at the matrix.
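Row names can be assigned in the same manner with rownames. In the following sketch, the labels "Case1" and "Case2" are invented for the illustration.

```r
v4 <- matrix(c(1, 2, 3, 4), ncol = 2)
colnames(v4) <- c("Input", "Output")
rownames(v4) <- c("Case1", "Case2")

v4
#       Input Output
# Case1     1      3
# Case2     2      4

# Names may be used in place of numbers when referring to elements
v4["Case2", "Output"]  # 4
v4[, "Input"]          # the complete "Input" column
```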

Elements of a matrix

In what follows, a few examples explain how to refer to single or several elements of a matrix by the row and/or column numbers. Let's assume that we are dealing with a matrix called "x100" that contains 100 rows and 10 columns.

`x100`		lists the complete matrix
`x100[2]`		lists the second element of x100, i.e. the second element in the first column.
`x100[102:104]`		lists the 102nd to 104th element of x100, i.e. the second to fourth element in the second column.
`head(x100)`		lists the first six rows of x100 (six being the default)
`tail(x100)`		lists the last six rows of x100 (six being the default)
`x100[3,]`		lists the third row of x100
`x100[30:50,]`		lists rows 30 to 50.
`x100[,2]`		lists the second column of x100 as a vector, i.e. as one or several continuous lines of numbers
`x100[,2:3]`		lists the second and third column in column format.
`x100[30:50,2:3]`		lists the second and third column of the rows 30 to 50.
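To try these commands, a matrix of the required size is needed. The sketch below simply fills "x100" with the numbers 1 to 1000 (an arbitrary choice made for this illustration), so that each element reveals its own position.

```r
# 100 rows and 10 columns, filled column-wise with 1, 2, ..., 1000
x100 <- matrix(1:1000, nrow = 100, ncol = 10)

x100[2]        # 2: the second element of the first column
x100[102:104]  # 102 103 104: elements two to four of the second column
x100[3, 2]     # 103: the element in row 3, column 2

dim(x100[30:50, 2:3])  # 21 2: the result is a smaller matrix
dim(x100[, 2])         # NULL: a single column is returned as a vector

# drop = FALSE keeps the matrix form even for a single column
dim(x100[, 2, drop = FALSE])  # 100 1

nrow(head(x100))  # 6
```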

Properties of a matrix

Information about the number of rows and columns of a matrix is requested by

dim(name-of-matrix)

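You may wish to store the length (number of rows; first dimension) or the width (number of columns; second dimension) of a matrix in an object. (Note that "length" is meant here in the everyday sense; the R function length() returns the total number of elements of a matrix.) This might be accomplished via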

aa <- dim(name-of-matrix)[1]

for the length, and by

bb <- dim(name-of-matrix)[2]

for the width of the matrix.
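R also offers the functions nrow and ncol, which return the same information directly. A short sketch, with a matrix "m32" invented for the illustration:

```r
m32 <- matrix(1:6, nrow = 3)  # three rows, two columns

dim(m32)      # 3 2
dim(m32)[1]   # 3
nrow(m32)     # 3, the same as dim(m32)[1]
dim(m32)[2]   # 2
ncol(m32)     # 2, the same as dim(m32)[2]
length(m32)   # 6: the total number of elements
```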