5 common C programming bugs (and how to avoid them)


Programming in C gives you a lot of flexibility when writing your own command-line programs. I like writing in C because I find it easier to write most of my programs that way. The trade-off is that you need to be more careful. In “higher level” programming languages like Rust, Go, or Java, the language helps to protect the programmer from certain common mistakes. But in C, you need to watch for these pitfalls yourself.

Here are five common C programming bugs, and how you can avoid them:

1. Using variables without initializing them

The C programming language doesn’t initialize local variables to zero before you use them; that’s the job of the programmer. (Global and static variables do start at zero, but variables declared inside a function do not.) What happens “behind the scenes” is that the program gets some memory to use, but that memory isn’t guaranteed to hold any particular value. It contains whatever bits were left there, so any values you try to read from that area are unpredictable. If you forget this important detail, you can wind up with some very strange behavior.

Here’s a sample program that uses a few integer variables, an integer array, and an integer pointer that is used like an array. Note that the program immediately prints the values from those variables without giving them a value, so the values are just whatever bit patterns were in memory at the time.

#include <stdio.h>

int main()
{
    int a, b, c, d;
    int numbers[5];
    int *array;

    puts("These variables are not initialized:");
    printf("a, b, c, d = %d, %d, %d, %d\n", a, b, c, d);

    puts("This array is not initialized:");

    for (a = 0; a < 5; a++) {
        printf("numbers[%d] = %d\n", a, numbers[a]);
    }

    puts("This array is not allocated or initialized:");

    for (a = 0; a < 5; a++) {
        printf("array[%d] = %d\n", a, array[a]);
    }

    puts("Ok");

    return 0;
}

If we save this as uninit.c and compile it, we can see the program generates some unpredictable values:

$ gcc -o uninit uninit.c
$ ./uninit 
These variables are not initialized:
a, b, c, d = 32767, -154530776, 0, 0
This array is not initialized:
numbers[0] = 0
numbers[1] = 0
numbers[2] = 0
numbers[3] = 0
numbers[4] = 0
This array is not allocated or initialized:
array[0] = -98693133
array[1] = -284203435
array[2] = -443987776
array[3] = -1991682239
array[4] = 1096172031
Ok

Programming tip: If you need a variable to start with some initial value, assign a value to it when you declare the variable. Or add an extra step to store a value (like zero) to the variable before you use it.

2. Going outside array boundaries

When you’re writing programs that involve arrays, it can be tempting to assume the size of the array will never change. Unfortunately, that rarely remains true.

Here’s a typical example: In “version 1” of your program, the array is ten elements long. For “version 2,” you make a few changes, and realize you don’t need the array to be quite as long, maybe only five elements. You update the program – but you forget to make the same change everywhere. Your array is defined as five elements, but later you try to store ten elements. Here’s one example:

#include <stdio.h>

int main()
{
    int i;
    int array[5];
    char a = 'a', b = 'b';
    int zero = 0;

    /* show starting values */

    printf("a, b = %c, %c\n", a, b);
    printf("zero = %d\n", zero);

    /* initialize the array */

    for (i = 0; i < 10; i++) {
        array[i] = 0;
    }

    /* show ending values */

    printf("a, b = %c, %c\n", a, b);
    printf("zero = %d\n", zero);

    puts("Ok");

    return 0;
}

This writes past the end of the array, and can lead to unpredictable behavior. That’s because the array has room for only five elements; the memory at what would be “element six” may actually belong to some other variable. In the worst case, your program might overwrite some important value and cause the program to hang or crash. When I saved this program as outside.c and compiled and ran it, the program hung and I had to press Ctrl+C to terminate the program and return to the command prompt.

$ gcc -o outside outside.c
$ ./outside 
a, b = a, b
zero = 0
^C

Programming tip: Define a constant with #define and use it for the size of your array and for every loop that walks the array. That gives you one place to change the program when you later decide the array should be some other size.

3. Overflowing a string

Related to going outside array boundaries is overflowing a string. A string is really just an array of char values, usually terminated by a zero value called the “null terminator.”

One easy way to overflow a string is when the program asks the user to input something. The old-style gets function will get a string value from the terminal, but with no regard for the array size. If your string variable can hold only eight letters, but the user enters nine or more letters, the program will overflow the string. And as with going outside an array’s boundaries, that will overwrite some part of memory elsewhere in the program.

Fortunately, gets was deprecated and then removed from the C standard in C11, so modern compilers will warn about programs that use it or refuse to compile them.

So let’s look at another example of overflowing a string. You can copy one string into another using the strcpy function. This assumes that the destination string is large enough to store the full value, which may not always be true. Here’s one example that naively copies a longer string into a string variable that’s not large enough:

#include <stdio.h>
#include <string.h>                    /* strcpy */

int main()
{
    char str[4];
    char hello[] = "Hello world!";

    puts(hello);

    strcpy(str, hello);
    puts(str);

    puts("Ok");

    return 0;
}

If I save this program as string.c and compile it, we’ll see that the program appears to work at first, printing both strings and “Ok,” but then it fails hard with a core dump.

$ gcc -o string string.c
$ ./string
Hello world!
Hello world!
Ok
Segmentation fault (core dumped)

Programming tip: C programmers should always take care to keep track of how long each string is, and make sure there’s enough room to copy data into it.

4. Using invalid file pointers

C makes it easy to read and write data to files. You open a file with the fopen function, and close it with fclose when you’re done with the file. But don’t assume the program was able to open the file. You might encounter any number of reasons that the program couldn’t open a file, from file permissions to file locking to the simple fact that the file doesn’t exist. Here’s one program that makes the too-simple assumption that it could open a file before printing it:

#include <stdio.h>

int main()
{
    FILE *pfile;
    int ch;                            /* int, not char: fgetc returns an int */

    /* open the file */

    pfile = fopen("file.dat", "r");

    /* show the file */

    while ((ch = fgetc(pfile)) != EOF) {
        putchar(ch);
    }

    /* close the file */

    fclose(pfile);

    puts("Ok");
    return 0;
}

The program is supposed to read data from file.dat and print it to the terminal, but this will fail if the file does not exist. And in my case, that file isn’t there. If I save this program as showfile.c and compile it, we can see the program fails hard with a core dump:

$ gcc -o showfile showfile.c
$ ls file.dat
/bin/ls: cannot access 'file.dat': No such file or directory
$ ./showfile
Segmentation fault (core dumped)

Programming tip: Check the return value of fopen to be sure that your program could actually open the file.

5. Freeing memory more than once

The C programming language has a flexible system to allocate memory for a program. malloc can allocate an entire block of memory at once, while the related calloc function allocates memory based on the number of elements that you need to store and sets that memory to zero. If you need to change the amount of memory that’s reserved for you, the realloc function will do that for you.

When you’re done with the memory, usually at the end of the program, the program needs to release that memory with the free function. However, you must free memory only once. That’s because malloc returns a pointer to an area of memory that’s been made available to the program; after this region has been freed, the pointer no longer refers to memory you own, and you shouldn’t use that memory again. Freeing the memory a second time is invalid. Here’s a simple program to show what I mean:

#include <stdio.h>
#include <stdlib.h>                    /* malloc */

#define SIZE 5

int main()
{
    int i;
    int *array;

    /* allocate memory */

    array = malloc(SIZE * sizeof(int));

    if (array == NULL) {
        puts("cannot allocate memory");
        return 1;
    }

    /* store values */

    for (i = 0; i < SIZE; i++) {
        array[i] = i;
        printf("array[%d] = %d\n", i, array[i]);
    }

    /* free */

    free(array);
    free(array);                       /* oops, we did it again */

    puts("Ok");
    return 0;
}

At first, this seems like a normal program, allocating five elements in the array and initializing a value before using it. But when the program is done with the memory, it frees the array twice. This can happen if you use a function to do a bunch of cleanup in your program, including freeing memory used by arrays, and forget that your main program also frees the same array on its own. In this case, the program aborts:

$ gcc -o free2 free2.c
$ ./free2 
array[0] = 0
array[1] = 1
array[2] = 2
array[3] = 3
array[4] = 4
free(): double free detected in tcache 2
Aborted (core dumped)

Programming tip: I recommend only using free in the same function that allocates memory. If your main program allocates the memory, it should also be the unit that frees it.

Bonus programming tip: You can also assign the NULL value to a pointer after you free it. The free function takes no action if the array pointer is already NULL.

We all make mistakes

Programming bugs happen to the best of programmers. But if you follow these guidelines and add a little extra code to check for these five types of bugs, you can avoid the most serious C programming mistakes. A few lines of code up front to catch these errors may save you hours of debugging later.
