This is a (partly) fictional record of how I view the C language and its libraries.
C is the first language I learnt to a “better than average” level, but most more recent programmers haven’t had the questionable luxury of starting there, so much of this is written from their point of view too.
Also, there are still a lot of really capable people who use C, but “people who are really good at writing C” are a critically endangered and dying breed who do not seem to be able to reproduce in the wild effectively anymore.
And while we are handing out threatened statuses, “people who are at least somewhat decent at writing C” are an endangered species too.
How would you write instructions to start up a computer?
Let’s take a relatively simple task from a human perspective, say,
turning on a PC, and see how we would communicate this task in
pseudocode and in C to compare the difference.
I asked a couple of people and got something along the lines of
- Make sure the power cable is attached to your computer
- Make sure the power switch in the PSU is in ON position
- Press the power button
    def is_power_attached():
        return sys.cable_connected

    def is_psu_turned_on():
        return sys.power_available

    def press_power_button():
        sys.hit_it_jack()

    def turn_on_pc():
        assert(is_power_attached())
        assert(is_psu_turned_on())
        press_power_button()
Which seems quite reasonable.
This is how you do it in C
Attempt 1
Let’s implement this quickly in C
    #include <system.h>
    #include <assert.h>

    bool ensure_power_attached() {
        return sys.cable_connected;
    }

    bool ensure_psu_turned_on() {
        return sys.power_available;
    }

    void press_the_power_button() {
        sys.hit_it_jack();
    }

    void turn_on_pc() {
        assert(ensure_power_attached());
        assert(ensure_psu_turned_on());
        press_the_power_button();
    }

    int main(void) {
        turn_on_pc();
    }
And depending on the compiler, if you’re lucky you get something like this:
    error: unknown type name 'bool'
    bool ensure_power_attached() {
and
    error: use of undeclared identifier 'sys'
        return sys.cable_connected;
               ^
    error: must use 'struct' tag to refer to type 'sys'
        sys.hit_it_jack();
        ^
        struct
    error: expected identifier or '('
        sys.hit_it_jack();
OK, the first thing to hit is that bool is (for historical reasons, at least before C23) not a language thing, but a macro that gets defined with #include <stdbool.h>. GCC is even kind enough to remind you of that. OK, good, let’s do that.
Simple enough.
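As a tiny sketch of what that include buys us (the function names here are made up for illustration, they are not part of any real library):

```c
#include <assert.h>
#include <stdbool.h>  /* provides bool, true and false before C23 */

/* Hypothetical helper: treats any non-zero raw value as "on". */
bool is_switch_on(int raw_value)
{
    return raw_value != 0;
}

/* Small self-check showing bool, true and false in use. */
void check_bool_basics(void)
{
    bool on = is_switch_on(5);
    assert(on == true);
    assert(is_switch_on(0) == false);
}
```

Without the include, a pre-C23 compiler has no idea what bool means, which is exactly the first error above.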
Another problem is a bit harder, as it seems like we don’t have this sys like we expected coming from newer languages.
After a quick Google search, we find out that we are supposed to RTFM, and after a while we find a program called man that gives us all the details about how to use this system library.
After spending half a day trying to dig the simple thing we want to do out of the information haystack that is any manpage that is not on OpenBSD (or epoll, which actually has a decent manpage), we find out that there are functions for it.
    int cblcnnctd(struct sys_state_t* state);
    int psustate(struct sys_state_t* state, int* res);
    int pwr(struct sys_state_t* state);
And from a quick googling we find out that sys_state_t is some implementation-specific semi-opaque thing we are supposed to handle with a couple of functions.
C people hate vowels for some reason. In somewhat newer libraries this might’ve been a lot better named and way more self-documenting, but the style is so common that every C programmer has grown too numb to even notice that it might not be the most readable thing.
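To make “semi-opaque thing we handle with a couple of functions” a bit more concrete, here is a rough sketch with spelled-out names. Everything in it (hw_state, the flag and the three functions) is invented for illustration and has nothing to do with the fictional system.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state struct; callers are meant to use the functions
 * below rather than touch the fields directly. */
struct hw_state {
    unsigned flags;
};

#define HW_FLAG_CABLE (1u << 0)

/* Puts the state into a known, empty condition. Returns 0 on success. */
int hw_state_init(struct hw_state *state)
{
    if (state == NULL)
        return -1;
    state->flags = 0;
    return 0;
}

/* Records whether the cable is attached. */
void hw_state_set_cable(struct hw_state *state, bool attached)
{
    assert(state != NULL);
    if (attached)
        state->flags |= HW_FLAG_CABLE;
    else
        state->flags &= ~HW_FLAG_CABLE;
}

/* Spelled-out counterpart of a name like cblcnnctd(). */
bool hw_state_cable_connected(const struct hw_state *state)
{
    assert(state != NULL);
    return (state->flags & HW_FLAG_CABLE) != 0;
}
```

The functions do the same kind of job as the vowel-less ones, but a reader can guess what they do without opening a manpage.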
Ok, we write this again with this in mind.
Attempt 2
With what we know, we quickly refactor the code to
    #include <system.h>
    #include <assert.h>
    #include <stdbool.h>

    bool ensure_power_attached(sys_state_t* state) {
        return cblcnnctd(state);
    }

    bool ensure_psu_turned_on(sys_state_t* state, int* res) {
        return psustate(state, result);
    }

    void press_the_power_button(sys_state_t* state) {
        return pwr(state);
    }

    void turn_on_pc() {
        assert(ensure_power_attached(state));
        assert(ensure_psu_turned_on(state, result));
        press_the_power_button(state);
    }

    int main(void) {
        turn_on_pc();
    }
and try to compile it just so we can get the new error messages
    must use 'struct' tag to refer to type 'sys_state_t'
    bool ensure_power_attached(sys_state_t* state)
We get something like this and a couple of undeclared identifiers, which we expected.
C actually has namespaces: a tag namespace (for struct, union and enum names) and the ordinary global one, and this sys_state_t is in the struct one, obviously. So we add the struct thingies we forgot because we thought they were redundant, since half of the time they are typedef’ed into the global namespace anyway.
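A small sketch of the two namespaces side by side, using a made-up point type (none of these names come from system.h):

```c
#include <assert.h>

/* The tag "point" lives in the tag namespace: spell it "struct point". */
struct point {
    int x;
    int y;
};

/* The typedef adds the name "point" to the ordinary namespace as well. */
typedef struct point point;

/* Uses the tag form. */
int manhattan_length(struct point p)
{
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

/* Uses the typedef form; both refer to the same type. */
point make_point(int x, int y)
{
    point p = { x, y };
    return p;
}

void check_point_namespaces(void)
{
    assert(manhattan_length(make_point(3, -4)) == 7);
}
```

If a library only declares the struct and skips the typedef, the bare name simply does not exist, and you get the “must use 'struct' tag” error.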
Then we are left with the problem that we have this state structure that we need to handle.
After a quick man or Google search we more or less figure out how to use it.
    #include <system.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdbool.h>

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state);
    }

    bool ensure_psu_turned_on(struct sys_state_t* state, int* res) {
        return psustate(state, res);
    }

    void press_the_power_button(struct sys_state_t* state) {
        pwr(state);
    }

    void turn_on_pc() {
        struct sys_state_t* state;
        int* result;
        assert(ensure_power_attached(state));
        assert(ensure_psu_turned_on(state, result));
        press_the_power_button(state);
    }

    int main(void) {
        turn_on_pc();
    }
And if you’ve dealt more with C, you already see where we are going to fail next, but let’s just carry on, since this seems to compile!
Depending on the compiler, and presuming we don’t have the power attached yet, it might seem to work close to as intended, report random nonsense about what is plugged in or turned on, or, in the best case scenario, give a straight-up segmentation fault.
Yeah, this was easy to see for any C programmer with a bit more experience, but it was a pitfall worth mentioning since it is one people constantly seem to fall into. Let’s fix the pointer stuff and see where we get.
    #include <system.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdbool.h>

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state);
    }

    bool ensure_psu_turned_on(struct sys_state_t* state, int* res) {
        return psustate(state, res);
    }

    void press_the_power_button(struct sys_state_t* state) {
        pwr(state);
    }

    void turn_on_pc() {
        struct sys_state_t state;
        int result;
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state, &result));
        press_the_power_button(&state);
    }

    int main(void) {
        turn_on_pc();
    }
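The pointer fix boils down to one idea: the caller owns the storage and passes its address. A minimal sketch, with a made-up read_answer() standing in for any function that writes its result through a pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader: writes a value through `out`, returns 0 on success. */
int read_answer(int *out)
{
    if (out == NULL)
        return -1;
    *out = 42;
    return 0;
}

int get_answer(void)
{
    /* Wrong: declaring `int *result;` and calling read_answer(result)
     * writes through an uninitialised pointer, which is undefined
     * behaviour. */
    int result = 0;                      /* real storage on the stack */
    int status = read_answer(&result);   /* pass its address instead */
    assert(status == 0);
    return result;
}
```

A bare pointer declaration gives you somewhere to keep an address, not somewhere to keep the value.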
All right, we are going to go through a couple of these easy fixes because, again, a lot of newcomers seem to hit them. Most of the more experienced C programmers probably spotted this one and the next already.
One of the asserts seems to fail terribly often. It might be consistent or not. So what is going on?
We didn’t initialise the state, so it has random values inside.
Another look at the manual / tutorial / whatever and we fix this.
    void turn_on_pc() {
        struct sys_state_t state;
        int result;
        rd_hwst(&state);
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state, &result));
        press_the_power_button(&state);
    }
Again, C people hate vowels, so we have this rd_hwst(&state) that updates the struct with the current state of the system. Make sure you remember to take the address here as well! (Although more recent compilers are kind enough to warn you about this; I hope you are one of the people who fix all the warnings compilers give you, even if shit compiles.)
But even though it compiles, our second attempt is a miserable failure as well! It seems to work somewhat though, as it behaves pretty deterministically at this point.
Attempt 3
Compiling and running gave us the exact opposite of what we expected for “is power cable attached”. Rechecking the man page of the system.h functions tells us that they both return 0 if everything is all right, and otherwise give you some error code that explains what is wrong. We also notice that we should probably check the res int at this point.
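Here is a minimal sketch of why the result came out inverted. probe_cable() and its error code are invented; the only thing borrowed from the text is the “0 means success” convention:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C-style call: 0 means success, anything else is an
 * error code (19 is an arbitrary made-up value). */
int probe_cable(int cable_plugged_in)
{
    return cable_plugged_in ? 0 : 19;
}

/* Naive wrapper: converting the status straight to bool makes
 * success (0) read as false and every error read as true. */
bool cable_ok_naive(int cable_plugged_in)
{
    return probe_cable(cable_plugged_in);
}

/* Fixed wrapper: compare against 0 explicitly. */
bool cable_ok(int cable_plugged_in)
{
    return probe_cable(cable_plugged_in) == 0;
}

void check_status_convention(void)
{
    assert(!cable_ok_naive(1));  /* inverted: plugged in reads as false */
    assert(cable_ok(1));
    assert(!cable_ok(0));
}
```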
ensure_power_attached is fixed easily by

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state) == 0;
    }
but then we have ensure_psu_turned_on, which always seems to succeed regardless of whether or not the PSU is turned on.
Well, we didn’t RTFM in time, and some of our previous bugfixing was completely irrelevant, since we weren’t doing the right thing anyways with our ensure_psu_turned_on function.
We read the manual again, and see that the return value only tells us whether reading the state of the PSU was successful, and that the actual result is put in res. We are interested only in the fourth bit of that value, as it holds the “PSU turned on” flag.
Armed with this, we rewrite again.
    bool ensure_psu_turned_on(struct sys_state_t* state) {
        int res;
        if (psustate(state, &res) == 0)
            return res & (1 << 3);
        return false;
    }

    void turn_on_pc() {
        struct sys_state_t state;
        rd_hwst(&state);
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state));
        press_the_power_button(&state);
    }
We get rid of the result pointer parameter altogether, because we didn’t need it in the first place; we just put it there because the prototype said it would need it and we misassumed how it actually worked. This is another one of those mistakes that people experienced with C rarely make, but that still crop up in codebases with alarming frequency.
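The “fourth bit” test on its own, as a small sketch. The macro and function names are made up; the only assumption taken from the text is that bit 3 (counting from 0) means “PSU turned on”:

```c
#include <assert.h>
#include <stdbool.h>

#define PSU_ON_BIT  3                    /* fourth bit, counting from bit 0 */
#define PSU_ON_MASK (1u << PSU_ON_BIT)   /* 0x08 */

/* True when the "PSU turned on" bit is set in a raw status word. */
bool psu_bit_set(unsigned status_word)
{
    return (status_word & PSU_ON_MASK) != 0;
}

void check_psu_bit(void)
{
    assert(psu_bit_set(0x08u));    /* only bit 3 set */
    assert(!psu_bit_set(0x07u));   /* bits 0..2 set, bit 3 clear */
}
```

Naming the mask and comparing against zero explicitly is a matter of taste, but it saves the next reader from counting bits in their head.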
Our entire code now looks like this
    #include <system.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdbool.h>

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state) == 0;
    }

    bool ensure_psu_turned_on(struct sys_state_t* state) {
        int res;
        if (psustate(state, &res) == 0)
            return res & (1 << 3);
        return false;
    }

    void press_the_power_button(struct sys_state_t* state) {
        pwr(state);
    }

    void turn_on_pc() {
        struct sys_state_t state;
        rd_hwst(&state);
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state));
        press_the_power_button(&state);
    }
We run it, get past both the checks and the power turns on!
We rejoice and prepare to ship, but then we decide to show it to our friend first, who immediately gets stuck on the ensure_psu_turned_on assertion.
But it worked for us! Once. We try it again and get the same assertion failure. And then we don’t, and then we do.
But since we are C programmers, we already can figure out what is wrong. There are still parts of uninitialised memory we are accessing because just like the butler in crime stories, the culprit is always a memory error in C.
Without too much effort, we realise that we didn’t init the res and the state properly. So we add that.
    #include <system.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdbool.h>
    #include <string.h>

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state) == 0;
    }

    bool ensure_psu_turned_on(struct sys_state_t* state) {
        int res = 0;
        if (psustate(state, &res) == 0)
            return res & (1 << 3);
        return false;
    }

    void press_the_power_button(struct sys_state_t* state) {
        pwr(state);
    }

    void turn_on_pc() {
        struct sys_state_t state;
        memset(&state, 0, sizeof(state));
        rd_hwst(&state);
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state));
        press_the_power_button(&state);
    }
And this actually gets past both asserts reliably with every reasonable C compiler.
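For reference, a sketch of the two usual ways to get a struct into a known all-zero state. demo_state is a made-up stand-in, since we don’t know what the real sys_state_t contains:

```c
#include <assert.h>
#include <string.h>

/* Made-up stand-in for a state struct. */
struct demo_state {
    int cable;
    int psu;
    unsigned flags;
};

/* Option 1: an initialiser sets every member to zero. */
struct demo_state zeroed_with_initializer(void)
{
    struct demo_state s = {0};
    return s;
}

/* Option 2: memset clears every byte of the object, padding included. */
struct demo_state zeroed_with_memset(void)
{
    struct demo_state s;
    memset(&s, 0, sizeof s);
    return s;
}

void check_zeroing(void)
{
    struct demo_state a = zeroed_with_initializer();
    struct demo_state b = zeroed_with_memset();
    assert(a.cable == 0 && a.psu == 0 && a.flags == 0u);
    assert(b.cable == 0 && b.psu == 0 && b.flags == 0u);
}
```

Either way, the point is the same as in the text: a local variable in C starts out holding whatever happened to be in that memory.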
…but the power doesn’t turn on anymore, even though we don’t get stuck in either of the asserts. We figure that it has something to do with the state struct, since when it wasn’t initialised, we did have one run where we actually got the power on.
Back to the manual and we finally find one last thing we need to fix, buried in the plethora of unnecessary info on the man page.
to control the system with `pwr_*` and `cmd_*` commands you need to enable user commands
Which, as another, mostly unrelated part of the man pages tells you, is done by calling set_hwctrl(state, 1).
So we add that to our code and we finally get it working.
    #include <system.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdbool.h>
    #include <string.h>

    bool ensure_power_attached(struct sys_state_t* state) {
        return cblcnnctd(state) == 0;  /* 0 means success */
    }

    bool ensure_psu_turned_on(struct sys_state_t* state) {
        int res = 0;
        if (psustate(state, &res) == 0) {
            return res & (1 << 3);  /* fourth bit: PSU turned on */
        }
        return false;
    }

    void press_the_power_button(struct sys_state_t* state) {
        pwr(state);
    }

    void turn_on_pc() {
        struct sys_state_t state;
        memset(&state, 0, sizeof(state));
        rd_hwst(&state);        /* read the current hardware state */
        set_hwctrl(&state, 1);  /* enable user commands */
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state));
        press_the_power_button(&state);
    }

    int main(void) {
        turn_on_pc();
    }
This handles our case perfectly, and it is a reasonably short amount of code in the end. All well, all good.
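Since system.h is made up, the listing above can’t be compiled as is. If you want to poke at the behaviour anyway, here is a stand-in sketch: the struct fields, the return types of rd_hwst() and set_hwctrl(), and the “silently refuse without user commands” behaviour of pwr() are all my assumptions, chosen only to mirror the story.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Made-up stand-in for the fictional <system.h>; fields and return
 * types are assumptions for illustration only. */
struct sys_state_t {
    int cable;      /* non-zero if the power cable is attached */
    int psu_bits;   /* bit 3 set means the PSU switch is on */
    int user_ctrl;  /* non-zero once user commands are enabled */
    int powered;    /* non-zero after a successful pwr() */
};

/* Pretend to read the hardware: cable attached, PSU switched on. */
int rd_hwst(struct sys_state_t *state)
{
    state->cable = 1;
    state->psu_bits = 1 << 3;
    return 0;
}

int set_hwctrl(struct sys_state_t *state, int enable)
{
    state->user_ctrl = enable;
    return 0;
}

int cblcnnctd(struct sys_state_t *state)
{
    return state->cable ? 0 : 1;   /* 0 means "attached" */
}

int psustate(struct sys_state_t *state, int *res)
{
    *res = state->psu_bits;
    return 0;
}

/* Refuses to do anything unless user commands are enabled. */
int pwr(struct sys_state_t *state)
{
    if (!state->user_ctrl)
        return 1;
    state->powered = 1;
    return 0;
}

/* Same steps as turn_on_pc(), but reports whether the power came on. */
bool try_turn_on_pc(bool enable_user_commands)
{
    struct sys_state_t state;
    int res = 0;
    int status;

    memset(&state, 0, sizeof(state));
    rd_hwst(&state);
    if (enable_user_commands)
        set_hwctrl(&state, 1);

    assert(cblcnnctd(&state) == 0);
    status = psustate(&state, &res);
    assert(status == 0);
    assert((res & (1 << 3)) != 0);
    (void)status;   /* keeps NDEBUG builds free of unused warnings */

    pwr(&state);
    return state.powered != 0;
}
```

With this mock, try_turn_on_pc(false) passes both asserts and still leaves the machine off, which is the failure we hit before finding set_hwctrl().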
Actually dissecting what we needed to instruct
We originally wanted just simple instructions to “turn the computer on”.
While some saner programming language might even have a single function to do that, even the instructions we got from our friends included a couple of checks.
So, translating back from the programming language to human speak.
    def is_power_attached():
        return sys.cable_connected

    def is_psu_turned_on():
        return sys.power_available

    def press_power_button():
        sys.hit_it_jack()

    def turn_on_pc():
        assert(is_power_attached())
        assert(is_psu_turned_on())
        press_power_button()
This version basically just says
- Make sure power is attached
- Make sure PSU is turned on
- Press the power button
And uses a couple of functions to translate that to “commands”.
We might as well have written it
    def turn_on_pc():
        assert(sys.cable_connected)
        assert(sys.power_available)
        sys.hit_it_jack()
and it checks out pretty well. There is nothing wrong with handling this case like this.
Then let’s dissect the C code.
    void turn_on_pc() {
        struct sys_state_t state;
        memset(&state, 0, sizeof(state));
        rd_hwst(&state);
        set_hwctrl(&state, 1);
        assert(ensure_power_attached(&state));
        assert(ensure_psu_turned_on(&state));
        press_the_power_button(&state);
    }
We get a list of steps:
- Create a checklist for the current state of the system
- Erase all the marks on the checklist, because we are reusing the paper and some of the old stuff still shows there
- Examine and mark to our checklist the hardware connections and switch positions
- Check the box that says you are allowed to touch the hardware
- Check from the checklist if the power is attached
- Check from the checklist if the PSU is turned on by
  - Make another checklist without labels
  - Erase all the checks in the second checklist
  - Mark all the PSU connections and switch positions to our checklist 2
  - If the fourth box is ticked, mark the PSU turned on in the first checklist
- Check from the checklist you are allowed to touch the hardware, and push the power button if so
…thanks, C.
But this is not how I write my C code!
Good.
To be honest, there are a lot of C projects out there that avoid a lot of these pitfalls through systematic naming, a focus on robustness and/or correctness, and a good amount of testing. Compiling with full warnings and warnings as errors would’ve also helped us here.
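As a rough sketch of what “full warnings and warnings as errors” looks like in practice with GCC or Clang (the file name and the function are made up; the flags themselves are real):

```c
#include <assert.h>
#include <stdbool.h>

/* Build sketch: cc -std=c11 -Wall -Wextra -Werror example.c
 *
 * With those flags the compiler rejects the variant below, because
 * control can reach the end of a non-void function (-Wreturn-type,
 * which is part of -Wall):
 *
 *     bool bit3_if_ok(int status, int res)
 *     {
 *         if (status == 0)
 *             return res & (1 << 3);
 *     }
 */
bool bit3_if_ok(int status, int res)
{
    if (status == 0)
        return (res & (1 << 3)) != 0;
    return false;   /* explicit result for the failure path */
}

void check_bit3_if_ok(void)
{
    assert(bit3_if_ok(0, 8));    /* success status, bit 3 set */
    assert(!bit3_if_ok(1, 8));   /* failure status wins */
}
```

Whether a given uninitialised read gets flagged depends on the compiler and optimisation level, so treat warnings as a safety net rather than a guarantee.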
All that said, there are plenty of examples of good C use. Off the top of my head, here are some that are written in easy-to-read, completely understandable C that makes it harder for the end user to screw up: wlroots, musl, git and GLFW.
But the truth is, the steps outlined in this post kinda wrote themselves. I didn’t have to think too much about how any of the screwups here would come about (and one of them is actually authentic). C has its merits, but no matter how much we want not to admit it, it is getting older and the cracks are starting to show. While the language itself is simple (and there are certain benefits in that too), making complex, correct or safe programs using it is definitely much harder than in more contemporary languages.
I’ve heard C programmers say a thousand times that “C is not inherently bad”, that “you just need to use it the right way”, or something along those lines. But the fact is, it is hard to make robust software with C. It is hard to teach C because it forces you to have a deeper understanding of how computers behave from the get-go, it is harder to debug, and the language doesn’t give the compiler many chances to help you on your way. These make C “inherently worse” in my eyes. Just because someone can write perfect C, or because the language has been crucial in the development of other programming languages, doesn’t mean the language itself stands up to scrutiny today. I’m sure there are people who can write robust software in brainfuck if they want to; that doesn’t make the language good for daily use. It is completely possible to write good, bug-free, readable C code, but there are not many people capable of doing that, and there aren’t many new C gurus being made either. Fortunately, the language is so omnipresent that we still have a somewhat steady supply of C people.
C definitely has its place, though. It is probably completely irreplaceable in small self-hosting systems (which are much more important than most people realise), it definitely is the lingua franca between different programming languages, and it is easily the most portable language around. And it’s not going anywhere anytime soon: the OS you are running now alone probably has millions of lines of C code inside it.
But Rust people have a point. We humans simply make too many errors that cost actual money with C code, and while Rust may not be the final solution, we should prepare ourselves for the changes coming to systems programming land and figure out how we should move forward.