Real Computer Science begins where we almost stop reading ...: C / C++ Programming Pointers

Tuesday, 1 October 2013

C / C++ Programming Pointers

Addresses, the Address Operator, and printing Addresses

All information accessible to a running computer program must be stored somewhere in the computer's memory. ( RAM chips. )
Particular locations in memory are identified by their address.
Memory addresses on most modern computers are either 32-bit or 64-bit unsigned integers, though this may vary with particular computer architectures. ( Ignoring segmentation and paging and other low-level memory details beyond the scope of this course. )
Those memory addresses can be thought of as "house numbers", in a very long linear city where everyone lives on the same street.
The address of a variable can be determined by applying the address operator, &.
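Addresses can be printed using the %p format specifier, which expects a ( void * ) argument. The exact output format varies from system to system.
So for example, the code:

			int i = 42;
			printf( "The variable i has value %d, and is located at %p\n", i, ( void * ) &i );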

might produce the following result:

			The variable i has value 42, and is located at 0x0022FF44

Pointer Variables

Declaring Pointer Variables

Since addresses are numeric values, they can be stored in variables, just like any other kind of data.

Pointer variables must specify what kind of data they point to, i.e. the type of data for which they hold the address. This becomes very important when the pointer variables are used.

When declaring variables, the asterisk, *, indicates that a particular variable is a pointer type as opposed to a basic type.

So for example, in the following declaration:
	int i, j, *iptr, k, *numPtr, *next;
variables i, j, and k are of type ( int ), and variables iptr, numPtr, and next are of type ( pointer to int ). Note that regular ints and int pointers can be mixed on a single declaration line.

void Pointers

Normally pointers should only hold addresses of the types of data that they are declared to point to. I.e. an int pointer ( int * ) should hold the address of an int, and a double pointer ( double * ) should hold the address of a double.

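The one special exception is the void pointer, ( void * ), which can hold any kind of address. In practice void pointers must be converted ( typecast ) to some kind of a regular pointer type before the data they point to can be accessed, because the compiler otherwise has no way of knowing the type or size of that data.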

void pointers can sometimes be useful for making functions more general-purpose, and less tied to specific data types, and will be covered in further detail later.

Initializing Pointer Variables

Pointer variables can be initialized at the time that they are declared, just like any other variable. This is normally done using the address operator, &, applied to any previously declared variable ( that is defined in the current scope. )

For example:

int iGlobal = 0;

int main( void ) {

	int i = 1;
	int j = 2, * iGlobalptr = & iGlobal, *iptr = &i, *jptr = &j;
	int *illegal = & number; // This line causes a compiler error: number is not declared yet
	int number = 42;

	. . .

} // main

NULL Pointers

Uninitialized local pointers start out with random unknown values, just like any other local variable type.

Accidentally using a pointer containing a random address is one of the most common errors encountered when using pointers, and potentially one of the hardest to diagnose, since the errors encountered are generally not repeatable.

Therefore, POINTER VARIABLES SHOULD ALWAYS, ALWAYS, ALWAYS BE INITIALIZED WHEN THEY ARE DECLARED.

If you don't have any better value to use to initialize your variables, use the special macro NULL, ( which is essentially a zero formatted as an address.)

NULL pointers are safer than uninitialized pointers because on most systems the program will stop immediately ( typically with a segmentation fault ) if you try to follow one, as opposed to blindly letting you follow an uninitialized pointer off to la la land.

Example:
	double d, *dptr = NULL, e;

Assigning Pointer Variables

Pointers can be assigned values as the program runs, just like any other variable type. For example:
int i = 42, j = 100;
int *ptr1 = NULL, *ptr2 = NULL;     // ptrs initially point nowhere.

ptr1 = &i;                          // Now ptr1 points to i.

ptr2 = ptr1;                        // Now both pointers point to i.  Note no & operator

ptr1 = &j;                          // ptr1 changed to point to j.

ptr2 = NULL;                        // ptr2 back to pointing nowhere.

Using Pointer Variables - The Indirection Operator

As discussed above, the asterisk, *, in a variable declaration statement indicates that a particular variable is a pointer type.

In executable statements, however, the asterisk is the indirection operator, and has a totally different meaning.

In particular, the indirection operator refers to the data pointed to by the pointer, as opposed to the pointer itself. ( I.e. the * in an executable statement says to follow the pointer to its destination. )

For example:
int i, j;
int *iptr = &i, *jptr = &j;     // These * indicate variables of type ( pointer to int )

i = 42;          // changes i to 42
*iptr = 100;     // changes i from 42 to 100
j = *iptr;       // Now j is 100 also

// Note carefully the distinction between the following two lines:

*jptr = *iptr;  // Equivalent to j = i. jptr and iptr still point to different places
jptr = iptr;    // Makes jptr point to the same location as iptr.  Both now point to i

  

Pointers and Functions

Pointers as Function Arguments - Pass by Pointer / Address

Function arguments can be of any type, including pointer types.

Pointers hold addresses, so pointer function arguments must be passed addresses as their values.

For example, in the following code:
void func( int a, int *bptr ) {

	a = 42;
	*bptr = 42;
	return;
}

int main( void ) {

	int x = 100, y = 100;

	func( x, &y );

	printf( "x = %d, y = %d\n", x, y );

	return 0;
}
The call to the function passes the value of x and the address of y, which are used to initialize the variables a and bptr in the function. This is roughly equivalent to:

a = x;

bptr = &y;
Since bptr now holds the address of a variable that is stored in main, when the function changes *bptr, it will "follow the pointer" to the variable y stored in main, and change its value from 100 to 42. ( x is left unchanged at 100. )

Preventing Changes with const

Declaring data as const prevents it from being changed, even through a pointer variable. For example, the following function would not be allowed to change the data pointed to by ptr, ( though it could change ptr to make it point somewhere else. )
	double func( const int *ptr );
The pointer itself can also be declared const. In the following example the function could change the data pointed to by the pointer, but it could not make the pointer point anywhere else:
	double func(  int * const ptr );
And if necessary, both the data and the pointer to it can be held constant:
	double func( const int * const ptr );

Returning Addresses from Functions

Functions can return addresses by declaring a pointer type as the return type,

BUT don't forget that ordinary ( auto ) local variables, including function parameters, cease to exist when the function returns.

Therefore functions can only return pointers to things that will continue to remain in existence after the function ends.

For example, in the following code, the function could return the address of A, C[ i ], or E, but not B or D. ( A is global, E is static, and C[ i ] lives in the caller's array; B and D are local to the function. )
int A;     // global

int * ptrFunction( int B, int C[ ] ) {

     int D;            // auto

     static int E;     // static

     . . .
}

Pointers to Functions

Technically functions are stored in memory too, and therefore have addresses that can be pointed to. That is a more advanced topic that will be covered later.

Pointers and Arrays

Pointers to Array Elements

A pointer may be made to point to an element of an array by use of the address operator:
	int nums[ 10 ], *iptr = NULL;

	iptr = & nums[ 3 ];		// iptr now points to the fourth element

	*iptr = 42;				// Same as nums[ 3 ] = 42;

Pointer Arithmetic

There are a few very limited mathematical operations that may be performed on address data types, i.e. pointer variables.

Most commonly, integers may be added to or subtracted from addresses.

Note that increment and decrement operations are really just special cases of addition and subtraction.

Arithmetic operations on addresses actually occur in steps of the size of the thing pointed to by the address.

In other words, if a pointer is incremented, it is actually increased sufficiently to point to the next adjacent "thing" in memory, where "thing" corresponds to the type of data that the pointer is declared as pointing to.

So in the example above,

the statement "iptr++;" would cause iptr to now point to nums[ 4 ] instead of nums[ 3 ].

If this were now followed by "iptr += 4;", then iptr would be increased to point to nums[ 8 ].

Another way of looking at this is that if iptr originally held the address 0x1000, and integers on this machine are 4 bytes long, then

iptr++ changes the value in iptr from 0x1000 to 0x1004

iptr += 2 starting from an initial address of 0x1000 would change iptr to 0x1008.

Subtraction of integers from addresses works similarly.

The only other arithmetic operation that is allowed is the subtraction of two addresses, which is only meaningful when both point into the same array. In this case, the result is the number of things between the two addresses, where "things" depends on what data type the addresses point to.

So if we again assume that ints are 4 bytes each, and we have int * variables jptr and iptr holding addresses of 0x1008 and 0x1000 respectively, then jptr - iptr would yield an answer of 2, because the addresses are separated by the size of two ints.

Interchangeability of Pointers and Arrays

Because of the way pointer arithmetic works, and knowing that the name of an array used without subscripts is actually the address where the beginning of the array is located, and assuming the following declarations:

int nums[ 10 ], *iptr = nums;

Then the following statements are equivalent:

nums[ 3 ] = 42;

*( iptr + 3 ) = 42;

What may come as more of a surprise is that the following two statements are also legal, and equivalent to the first two:

iptr[ 3 ] = 42;

*( nums + 3 ) = 42;

Basically since nums and iptr are both addresses of where ints are stored, the computer treats them identically when interpreting the array element operator, [ ], and the dereference operator, *. ( The compiler will typically generate the exact same machine instructions for all four of the lines given above. ) The only difference is that nums is a fixed address determined by the compiler, that cannot be changed while the program is running, whereas iptr is a variable, that can be changed to point to other locations. ( iptr refers to a memory location on the stack that holds an address, whereas nums is a constant inserted into the instructions. )

Looping Through Arrays Using Pointers
The following code will print the characters in the array passed to the function, until a null byte is found, and will then print a new line character. ( It will make more sense after character arrays are covered. )
    
		void printString( const char array[ ] ) { // Same as printString( const char * array )

			const char *p = array;	// const, because array points to const data

			while( *p )
				printf( "%c", *p++ );
			printf( "\n" );

			return;
		}

Combinations of * and ++

*p++ accesses the thing pointed to by p and increments p

(*p)++ accesses the thing pointed to by p and increments the thing pointed to by p

*++p increments p first, and then accesses the thing pointed to by p

++*p increments the thing pointed to by p first, and then uses it in a larger expression.

Pointers and Multidimensional Arrays

Given the declaration:
int matrix[ NROWS ][ NCOLS ], * iptr = &matrix[ 0 ][ 0 ];
then the following are also equivalent:

matrix[ i ][ j ] = 42;

*( iptr + i * NCOLS + j ) = 42;

This is why functions receiving two-dimensional arrays as input need to know how many columns are in the array, but don't care how many rows are present.

Arrays of Pointers

Arrays can hold any data type, including pointers. So the declaration:

int * ipointers[ 10 ];

would create an array of 10 pointers, each of which points to an int.
