Introduction

The course that I have taught the most, by far, at UC Davis is ECS 150: Operating Systems and System Programming. Every once in a while, a student will complain about the use of the C language in this class. When I’m lucky, they ask me directly, which is great since I can then address their concern and explain the reasoning behind the choice.

In essence, students claim that “C is a 50-year-old language which is no longer relevant, and has been replaced by C++.”

This is an interesting complaint because it touches on two big misconceptions about C. Hopefully, this article will help spread the good word that using C in an OS class is far from a random and unreasonable choice!

C vs C++

The name C++ is admittedly misleading, but C++ is not a more recent version of C. It’s just another language.

Back when Stroustrup started working on C++ in the early 80s, his goal was to design an object-oriented language (the OO concepts were getting some traction at the time) and he had the idea of taking an existing and popular language (i.e., C) and extending it with classes. So at best, we can say that he “forked” C.

But the two languages have evolved separately, each with its own ISO committee. Twenty years ago, what made people put C/C++ in the same bag was the overlap in syntax. For the longest time, C++ had a pretty bad reputation; Linus Torvalds (Linux’s creator) called it a “horrible language!”

Nowadays (since C++11 or so), C++ has evolved very far away from C, and idiomatic code in the two languages no longer looks alike (at least not more than, say, Java and C#).

Popularity of C

C may be 50 years old, but it is still widely used.

Something is not automatically irrelevant just because it is old (otherwise, go tell that to some of my colleagues)! The pace of web development technologies (e.g., one new JavaScript framework every other week) can make people think that CS keeps changing fast, but the fundamentals of our discipline actually evolve quite slowly. Modern computer architectures still conform to the principles defined by von Neumann in the 1940s!

When you look at rankings such as TIOBE, which is one of the most respected rankings for programming languages because it looks at languages in a holistic way, C has consistently ranked 1st or 2nd for the past 40 years. As of March 2021, it is number 1!

Systems Programming and C

Whereas application programming provides services to the user directly, systems programming provides services, often performance-constrained, to other software. System software includes operating systems, databases, language interpreters, web servers, etc.

Since systems programming requires a high degree of hardware awareness, it calls for a low-level programming language such as C, which provides the following features:

  • It is portable. Every CPU architecture in existence probably has a C compiler. That’s exactly why most language interpreters are written in C: it makes the interpreted language portable as well. (That’s why you can run Python on microcontrollers!)
  • It is efficient and fast. According to this recent paper, C ranks first for energy efficiency and execution time.
  • It allows great flexibility for manipulating memory, which is sometimes necessary for systems programming (e.g., interpreting memory through various lenses, using pointer dereferencing).
  • It has a small and fast standard library (libc). C++, on the other hand, has a pretty large standard library, which can be 10x as large as the libc, and has to deal with a bunch of stuff at runtime (calling class constructors/destructors, dynamic casting, exception handling, etc.)

Case-study: Arch Linux

To settle the debate of whether C is relevant or not when it comes to operating systems, let’s consider a real use case.

Arch Linux is a popular Linux distribution, which can provide the user with a minimal install. This minimal install only requires a small number of main packages to be installed (28 packages exactly), which is enough to provide a usable system.

One package is the Linux kernel itself, while the 27 other packages are essential systems applications or libraries, grouped into the base meta-package.

Below is the list of all 28 main packages; for each, I’m including a brief description and its language composition (as computed by GitHub).

Name               | Description                                | Language composition
-------------------|--------------------------------------------|---------------------
linux              | OS kernel                                  | [screenshot: linux]
bash               | Shell                                      | [screenshot: bash]
bzip2              | compression utility                        | [screenshot: bzip2]
coreutils          | basic utilities                            | [screenshot: coreutils]
file               | file type identification                   | [screenshot: file]
filesystem         | base Arch Linux files                      | This package doesn’t contain any code; only configuration files and a directory structure.
findutils          | utilities to find files                    | [screenshot: findutils]
gawk               | GNU awk                                    | [screenshot: gawk]
gcc-libs           | GCC runtime libraries                      | These files come from GCC’s codebase and are difficult to single out. GCC is written primarily in C though.
gettext            | internationalization library               | [screenshot: gettext]
glibc              | C library                                  | [screenshot: glibc]
grep               | string search utility                      | [screenshot: grep]
gzip               | compression utility                        | [screenshot: gzip]
iproute2           | IP routing utilities                       | [screenshot: iproute2]
iputils            | networking monitoring tools                | [screenshot: iputils]
licenses           | standard licenses                          | This package doesn’t contain any code; only textual files.
pacman             | Arch Linux package manager                 | [screenshot: pacman]
pciutils           | PCI bus library and tools                  | [screenshot: pciutils]
procps-ng          | system monitoring utilities                | [screenshot: procps-ng]
psmisc             | procfs tools                               | [screenshot: psmisc]
sed                | stream editor                              | [screenshot: sed]
shadow             | account management tool suite              | [screenshot: shadow]
systemd            | system and service manager (provides init) | [screenshot: systemd]
systemd-sysvcompat | compatibility layer with old sysvinit      | This is the same code base as systemd above.
tar                | archive utility                            | [screenshot: tar]
util-linux         | random collection of system utilities      | [screenshot: util-linux]
xz                 | compression utilities                      | [screenshot: xz]

Limitations of this study

These 28 main packages may also have their own dependencies (that is, they may require other “non-main” packages to be installed). I have not listed these dependencies because I’ve already spent a lot of time writing this article and have other duties to tend to today! But I have very little doubt that the vast majority of these packages are also written in C. If you have the time to compile the whole list, I’d be happy to include your findings here!

In order to keep a consistent style, I restricted myself to using GitHub for the screenshots showing the language composition of each package. For most of the packages, I was able to find either the upstream codebase directly or an up-to-date mirror on GitHub. For a few packages, I was only able to find relatively old mirrors; however, I don’t think the language composition of these few packages has changed drastically since then.

Conclusion

The only serious competition to C’s domination is currently coming from Rust. But Rust is still very young, and we’re a long way from having actual hardware (like your laptop or your phone) run a mainstream OS written entirely in Rust.

Until then, C will be the language of choice for systems programming, which is why it’s the one you should learn in your OS class :)