Arrays!

See also

Reading Assignment

Text: Chapter 11

Now, we come to an important concept - the Data Structure. Up to now, we have been using small containers to hold fairly simple pieces of data. Things like integers, floats, characters are all pretty small chunks of stuff, and only take a few of our little eight-bit boxes to store in memory. We cannot make very interesting programs if all we can do is play around with simple data like these.

Declaring an Array

To paraphrase the sheriff in Jaws “We need a bigger boat” - er container for our data! Here is how we do it:

int grades[25];

Hmmm. All we have added is some square brackets and a number in between after a name. In this example we have asked the compiler to reserve 25 integer boxes for us and call that entire chunk of memory by the name grades.

This group of memory locations - all 25 of them - is called a data structure to emphasize the fact that we are dealing with a group of containers, each big enough to hold a single integer value. Obviously, we have to work harder to initialize this new kind of container, and we have several ways to do that. If the number of integers is small enough, we can use this approach:

int grades[5] = {100,92,48,87,95};

Here, we have provided a list of values to be used to initialize all 5 integer containers in the array. This initialization would be a real pain for a big array with lots of numbers, but we could do this if we like.

Accessing Array Elements

How do we get at the individual elements of the array? We need a new notation to refer to one single element in the array - that notation uses the square brackets again, and a new thing called an index:

grades[0] = 100;
grades[4] = 95;

Warning

In C++, unlike real life, the first container in a group of containers has an index of zero, not one! So, for our last example, we can refer to each of the five locations in the container named grades by using the index numbers from zero to four!

Far to many mistakes in programming are due to thinking the first index should be one, and the last index should be five!

Looping over the array

Suppose we had a big class and wanted to initialize all the grades to zero (OK, lets make it 100, just to be nicer). We can set up a simple loop to do the job:

int grades[25];
...
for(int i=0;i<25;i=i+1) {
    grades[i] = 100;
}

Here, we see that the thing inside the square brackets is actually an expression that evaluates to a simple integer - one that should be between 0 and 24 for this example. On each pass through the loop, a different element of the grades array gets set to the value 100. When the loop completes, all 25 locations will have been initialized.

Examples of array problems

Let’s consider the problem of setting a letter grade for a numerical grade. Lets set up a problem and solve it using arrays:

int grades[10] = {100,77,65,88,99,90,80,50,69,70};
char letters[5] = {'A','B','C','D','F'};

int main(void) {
    int thisGrade;
    int minGrade;
    char thisLetter;
    for(int i=0;i<10;i=i+1) {
        thisGrade = grades[i];
        minGrade = 90;
        for(int j=0;j<5;j=j+1) {
            thisLetter = letters[j];
            if(thisGrade >= minGrade)
                break;
            minGrade = minGrade - 10;
        }
        cout << "A grade of " << thisGrade << " gets a " << thisLetter << endl;
    }
}

Which gives us this:

A grade of 100 gets a A
A grade of 77 gets a C
A grade of 65 gets a D
A grade of 88 gets a B
A grade of 99 gets a A
A grade of 90 gets a A
A grade of 80 gets a B
A grade of 50 gets a F
A grade of 69 gets a D
A grade of 70 gets a C

Phew, how does this code work. Well, to figure it out, we need to walk it through by hand and see if it does what we want.

What is the overall structure of this code?

We see a list of grades stored in an array, and we see a list of possible letter grades stored in a second array. The grades are arranged in the correct order, and we all know the rules for assigning letter grades to number grades. Grades from 90 to 100 get an A, from 80-89 get a B and so on. The break point starts off at 90 and decreases by 10 for each letter grade.

Hmmm, maybe we can use this fact to figure out the letter grade.

Our program starts off by looping over all the grades in the class - looping over the grades array, examining one grade at a time. For simplicity, I copy the particular grade from the array into another variable. I do not need to do this, but it might make the code easier to understand.

Inside the body of this outer loop, my job is to figure out exactly what letter grade to assign. That takes another loop!

Figuring out the letter grade

While I could have used a giant if statement to figure out the grade, we are going to do it a bit differently.

Since we have five possible letter grades available, we could loop over those grades and search for the one that one applies to the grade we are looking at at the moment. As soon as we can see that we have the right grade assigned, we can bail out of this second loop and print out the answer. But, how can we know we have found the right letter grade.

Lets set up an integer variable called minGrade and set it to the lowest grade that will earn the first letter in our letter grade array.

Anything equal to or above a 90 will get an A.

Now we loop over the five letter grades, using the counter j to track which letter we are examining. If the grade we are checking is bigger than (or equal to) the minimum grade, we have found our letter grade and we can stop looking. The if statement checks this, and uses the break statement to bail out of the inner loop.

If the grade we are examining is less than the current minimum grade value, we have not found the right letter grade yet, so we need to move on to the next letter in the letter array. For this next letter, the minimum grade is 10 points lower than it was for the last letter, so we decrease our minGrade variable by 10 and let the loop try again.

You need to think about what happens if the number grade really gets an F. Is the logic of this code working right? On the last pass through the inner loop, what value does minGrade have? Is 50 really the lowest number grade that gets an F? Actually, no - any number grade we have left when we reach this last pass gets an F. The variable thisGrade will probably still be lower than the minGrade value, but all that means is that we will not break out of this loop on the if-statement. But, since we are on the last pass, we will exit the loop anyway with thisLetter set to an F just like we want.

This is too complicated!

Well, I admit that thinking this logic up might not come naturally to you until you have fought through a few situations like this. What is important here is that you see how the loops work and can walk your way through the code and figure out what is happening.

Note

You learn a lot about programming by studying other peoples code. There is a huge amount of code available on the Internet, and when you want to figure out how to do something, use your best friend; Google to find example code!

If you like, ye can always rewrite this to use the nested if-statements we used earlier to figure out the letter grades - eliminating the inner loop and array of letter grades in the process. Which way is better? You get to decide as long as both work!

Problems with Arrays

What would this do?

int value[5];
int stuff[100];
...
for(int i=0;i<100;i++) {
        value[i] = 10;
}
for (int i=0;i<10;i++) {
    stuff[i] = 1;
}
cout << "value[0] = " << value[0] << endl;
cout << "value[10] = " << value[10] << endl;

Well, the compiler would say this is fine, you must know what you are doing. But do you? What is going on anyway?

value[0] = 1
value[10] = 10

When we refer to the name value in this example, the compiler looks the name up in its internal table and figures out what address got assigned to this big box of integers - lets say it was location 1004, just for the heck of it.

Now, integers are four small boxes big - actually they take 32 bits each in most systems. So the first integer we put in stuff fills up memory locations at 1004, 1005, 1006, and 1007. That corresponds to the first integer in stuff or stuff[0]. The next integer we put in stuff will live at locations 1008, 1009, 1010, and 1011. That container we can access as stuff[1]. Keep this up and we get to stuff[4], the last one we set up in our declaration. This integer lives at locations 1020, 1021, 1022, and 1023. So far, so good.

What lives at location 1024? Based on the way I declared variables, I set up a container named value right after my stuff. The first integer in that new container lives in the next available memory location - you guessed it - at 1024 (and 1025, 1026, and 1027).

But, my loop kept on indexing elements of stuff far past where I have set any container up for that silly stuff! Guess what I am doing in this loop? Killing the contents of value!

This is called over-indexing and array and it is a very bad thing to do (and one of the easiest mistakes in all of programming to make!)

Our mistakes are seldom this obvious, most of the time we zap the first element in the second array by failing to stop our loops right:

int stuff[5];
..
stuff[5] = 0;
...
for(int i=0;i<=5;i=i+1) {
    stuff[i] = 0;
}

This one is just as bad, just not so obvious.

Searching an array

Let’s try another example using arrays. Suppose we wanted to find the biggest value in a bunch of values (like our grades above). Now, we do not know where the biggest number will live. So, we need to look at all the locations and find the biggest one. How do we even start working this out?

Well, we need to loop over all the grades and hunt for the answer!

int grades[10] = {45,77,65,88,99,90,80,50,69,70};
...
for(int i=0;i<10;i=i+1) {
    // is grades[i] biggest?
}

(OK, I changed the first grade to make the hunt a bit more difficult!)

Now, we need to think about our loop. We start off looking at the first grade in the array. Is it the biggest? It is the biggest of all the grades we have looked at so far, so let’s keep that value as the current biggest. If we find anything bigger, we will change our value to that new biggest, and so on. So, here is what we might end up with:

int biggest = grades[0];
for(int i=1;i<10;i=i+1) {
    if(grades[i] > biggest)
        biggest = grades[i];
}
cout << "The biggest value was " << biggest << endl;

Notice that I changed my loop to start with index 1 now, not zero. I also started off storing the value of biggest with the value in grades[0]. Why? Because the first element in the array is the biggest one I have seen so far, I will change my mind later if I see a bigger value in the loop!

The loop looks at each remaining grade in the array and checks to see if any of them is bigger than the biggest I have seen so far. Each time I find a bigger value, I throw away my old value and keep track of this new big guy!

The biggest value was 99

Looks like it worked!

Can we get the smallest at the same time? Sure, use the same approach as before:

int biggest = grades[0];
int smallest = grades[0];
...
for(int i=1;i<10;i=i+1) {
    if(grades[i]>biggest) biggest = grades[i];
    if(grades[i]<smallest) smallest = grades[i];
}
cout << "Big was " << biggest << " and small was " << smallest << endl;
Big was 99 and small was 45

Two-dimensional arrays

If one-dimensional arrays are useful, so are higher dimensional arrays, and C++ supports them as well. The notation is a natural extension of what we have seen. Here is a declaration of a two-dimensional array:

const int NROWS = 10;
const int NCOLS = 15;

int matrix[NROWS][NCOLS];

The most common way to work with these arrays is to use nested for loops:

for(int r=0;r<NROWS;r++)
    for int c=0;c<NCOLS;c++)
        matrix[r][c] = 0;

The first indes counts the number of rows we skip to get into the row we want to access, the second counts the number of columns we want to skip to get to the column we want to access. That is why we always start indices at zero in computing. (It also make calculating the address of a particular container easy.)

Passing a two-dimensional array to a function

We want functions to be general-purpose. That means we want to build functions that can work on any size array. For that reason, when we pass an array to a function, we only need to send soen the address of the array. Since the function has no idea how big the array is in this case, we also send down the dimension of the array:

int getMax(int array[], int nrows) {
    int max = 0;
    for(int r=0;r<nrows;r++)
        if array[r]>max) max = array[r];
    return max;
}

Would this work if the array contained negative numbers?

Two-dimensional arrays as parameters

We can extend the above scheme to two-dimensional arrays easily:

int getMax(int array[][], int nrows, int ncols) {
    int max = 0;
    for(int r=0;c<nrows;r++)
        for(int c=0;c>ncols;c++)
        if array[r][c]>max) max = array[r][c];
    return max;
}

If you wish, you can omit the size parameters, but it is up to you to make sure you do not over-index (or under-index) and array!