The relevance of C in systems programming
Introduction
The course that I have taught the most at UC Davis, by far, is ECS 150: Operating Systems and System Programming. Every once in a while, a student will complain about the use of the C language in this class. When I’m lucky, they will ask me directly, which is great since I can then address their concern and provide some insight.
In essence, students will make the claim that: “C is a 50-year-old language that is no longer relevant and has been replaced by C++.”
This is an interesting complaint because it touches on two big misconceptions about C. Hopefully, this article will help spread the good word that using C in an OS class is far from a random and unreasonable choice!
C vs C++
The name C++ is admittedly misleading, but C++ is not a more recent version of C. It’s just another language.
Back when Stroustrup started working on C++ in the early 80s, his goal was to design an object-oriented language (the OO concepts were getting some traction at the time) and he had the idea of taking an existing and popular language (i.e., C) and extending it with classes. So at best, we can say that he “forked” C.
But the two languages have evolved completely separately, each with its own ISO committee. Twenty years ago, what made people put C and C++ in the same bag was the overlap in terms of syntax. For the longest time, C++ had a pretty bad reputation; Linus Torvalds (Linux’s creator) said it was a “horrible language”!
Nowadays (circa C++11), C++ has actually evolved very far away from C and they don’t really share the same syntax anymore (at least not more than, say, Java vs C#).
Popularity of C
C may be 50 years old, but it is still widely used.
Just because something is old doesn’t mean it’s automatically irrelevant (otherwise, go tell that to some of my colleagues)! The pace of web development technologies (e.g., one new JavaScript framework every other week) can make people think that CS keeps changing fast, but the fundamentals of our discipline actually evolve quite slowly. Modern computer architectures still conform to the principles defined by Von Neumann in the 1940s!
When you look at rankings such as TIOBE, which is one of the most respected rankings for programming languages because it looks at languages in a holistic way, C has consistently ranked 1st or 2nd for the past 40 years. As of March 2021, it is number 1!
Systems Programming and C
Whereas application programming provides services to the user directly, systems programming provides services, often performance-constrained, to other software. System software includes operating systems, databases, language interpreters, web servers, etc.
Since systems programming requires a high degree of hardware awareness, it calls for a low-level programming language such as C, which provides the following features:
- It is portable. Every CPU architecture in existence probably has a C compiler. That’s exactly why most language interpreters are written in C: it makes the interpreted language portable as well. (That’s why you can run Python on microcontrollers!)
- It is efficient and fast. According to this recent paper, C ranks first for both energy efficiency and execution time.
- It allows great flexibility for manipulating memory, which is sometimes necessary for systems programming (e.g., interpreting memory through various lenses, using pointer dereferencing).
- It has a small and fast standard library (libc). On the other hand, C++ has, for example, a pretty large standard library, which can be 10x as large as the libc, and has to deal with a bunch of stuff at runtime (calling class constructors/destructors, dynamic casting, exception handling, etc.).
Case-study: Arch Linux
To settle the debate of whether C is relevant or not when it comes to Operating Systems, let’s consider a real use-case.
Arch Linux is a popular Linux distribution, which can provide the user with a minimal install. This minimal install only requires a small number of main packages to be installed (28 packages exactly), which is enough to provide a usable system.
One package is the Linux kernel itself, while the 27 other packages are essential systems applications or libraries, grouped into the base meta-package.
Below is the list of all 28 main packages; for each, I’m including a brief description and its language composition (as computed by GitHub).
Name | Description | Language composition |
---|---|---|
linux | OS kernel | |
bash | Shell | |
bzip2 | compression utility | |
coreutils | basic utilities | |
file | file type identification | |
filesystem | base Arch Linux files | This package doesn’t contain any code; only configuration files and a directory structure. |
findutils | utilities to find files | |
gawk | GNU awk | |
gcc-libs | GCC runtime libraries | These files come from GCC’s codebase and are difficult to single out. GCC is written primarily in C though. |
gettext | internationalization library | |
glibc | C library | |
grep | string search utility | |
gzip | compression utility | |
iproute2 | IP routing utilities | |
iputils | networking monitoring tools | |
licenses | standard licenses | This package doesn’t contain any code; only textual files. |
pacman | Arch Linux package manager | |
pciutils | PCI bus library and tools | |
procps-ng | system monitoring utilities | |
psmisc | procfs tools | |
sed | stream editor | |
shadow | account management tool suite | |
systemd | system and service manager (provides init) | |
systemd-sysvcompat | compatibility layer with old sysvinit | This is the same code base as systemd above. |
tar | archive utility | |
util-linux | random collection of system utilities | |
xz | compression utilities | |
Limitations of this study
These 28 main packages may also have their own dependencies (that is, they may require other “non-main” packages to be installed). I have not listed these dependencies because I’ve already spent a lot of time writing this article and have other duties to tend to today! But I have very little doubt that the vast majority of these packages are also written in C. If you have the time to compile the whole list, I’d be happy to include your findings here!
In order to keep a consistent style, I restricted myself to using GitHub for the screenshots showing the language composition of each package. For most of the packages, I was able to find either the upstream codebase directly or an up-to-date mirror on GitHub. For a few packages, I was only able to find relatively old mirrors; however, I don’t think the language composition of these few packages has changed drastically since then.
Conclusion
The only serious competition to C’s domination is currently coming from Rust. But Rust is still way too young, and we’re a long way from having actual hardware (like your laptop or your phone) run a mainstream OS written entirely in Rust.
Until then, C will be the language of choice for systems programming, which is why it’s the one you should learn in your OS class :)