one of the weirdest things in the modern programming landscape is just how dominant object oriented programming is as a paradigm. and it’s not even that it’s just popular, it’s taken over the way people even think about programming in the first place. now, i’m not someone who thinks that oop is literally the devil or anything–i think that it absolutely has its time and place, and when the appropriate job arises it can be really helpful for that.
this said, it would be disingenuous of me to say that oop is nothing but good. of course that’s far from the truth, as many keyboard warriors online will tell you with great passion. i get annoyed listening to their twitter-brained takes on oop too, believe me, but they also kinda have a point. oop has really given license to too many software engineers to just go ham and write some of the most obtuse, overly engineered solutions that end up being a huge pain to maintain. like, when you’re 6 inheritance levels deep, have no idea where a member of the class you’re working on came from, and get weird compile time errors because someone didn’t quite guard their templated class just right… yeah you know we’ve gone off the deep end.
this is why i’ve come to appreciate c over time. i think as a language it’s sort of beautiful in how simple it is. you learn the basic rules of play, and that’s it. no hidden weirdness, no abstractions that run behind your back and do unexpected things. it’s just you talking to the computer in plain terms.
i think this scares a lot of people off though, especially those who have really only known oop. you’ve gotta be so explicit, and you can’t write fancy abstractions to reduce your line counts drastically. maybe you’re one of these people.
let’s see if i can change your mind :)
why should i give c a chance?
i hear you saying. c definitely has a reputation for being difficult, low level, and outdated. and look, some of this is true. hell, i’m looking at odin currently to replace c as my language of choice in projects because even i have to admit that c can suck to write sometimes, and some of these new languages have some really compelling ideas.
but c i think has some good things going for it, even if you wanted to look at something like odin, zig, rust, what have you.
c is simple
there’s very little that’s actually complicated about c, it’s just that you require a little bit more knowledge about how computers work before you can get properly started compared to something higher level such as c# or js. but i would honestly argue that in many cases, the higher level languages can be doing you a disservice with how much they abstract away the inner workings of your computer. you see, through understanding how things work, you can have a better grasp on not only why things may be breaking, but also how to make your solutions better. even in the high level world of js!
c forces you to think simply
okay this sounds very counter-intuitive, especially if you only know c for how it requires you to manage your own memory, but trust me on this. the thing with c is that there is nothing baked in save for really the bare minimum to get you off the ground and writing software. because of this, you have to try and think in simple ways to achieve complex solutions, which can sometimes be better solutions because of how simple the building blocks end up being. and this isn’t even just because it’s more maintainable, but also because they can be faster.
c can be stupid fast
as an abstraction over machine code, c is pretty close to the real deal. this means that whatever you write will very likely be fairly close to whatever machine code ends up being generated.
however, i do need to emphasise that it only can be really fast.
because of course if you’re not careful and write poor performance code, you’ll end up generating poor performance machine code. but since there’s very little in the way of weird code generation happening in the background like there can be in more complex, fully featured languages, you can be much more sure about how your code will perform while running.
so let’s start having a look at c itself.
the rules of play
i want to skip over the specifics of how c works such as what the compiler is, what the linker is, what including files does, etc. this is all stuff with plenty of great online resources already, and talking too much about them would balloon this post out a lot. let this be your warning: try to at least have a basic understanding of c and how c works before reading ahead. instead, let’s start with those rules of play that i mentioned before.
booleans
they don’t exist in c. well, not really. instead, ints act as your bools. for example:
int condition = 1;
if (condition) {
    printf("condition is true\n");
} else {
    printf("condition is false\n");
}
/*
* output:
* condition is true
*/
that’s it. if a value is 0 then it is false, if it’s not 0 then it’s true.
pointers
this is what i think scares most people in c, but pointers are really easy. are you ready for the secret to pointers?
pointers are just memory addresses.
for the uninitiated, a memory address is the number which represents where your variable exists in memory. if you have that information then you can functionally point to where it is, and you can retrieve the variable itself.
seriously, that’s it.
if that still doesn’t really stick, then maybe an example will help:
int foo = 10;
int* foo_ptr = &foo;
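printf("foo is %d\n", foo);
printf("value pointed at by foo_ptr is %d\n", *foo_ptr);
foo = 20;
printf("value pointed at by foo_ptr is %d\n", *foo_ptr);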
/*
* output:
* foo is 10
* value pointed at by foo_ptr is 10
* value pointed at by foo_ptr is 20
*/
here we define foo and assign it the value 10. then we declare foo_ptr and assign it a pointer to foo. next we print out the value of foo and the value pointed at by foo_ptr, both of which are 10. finally we assign foo the value of 20 and print out the value pointed at by foo_ptr, which is now 20. hopefully it’s obvious at this point that foo_ptr is just a pointer to the memory address that we store foo in, but also there’s some wacky syntax there. this is also really simple:
// define the variable `foo` of type `int`
int foo = 10;
// define the variable `foo_ptr` of type `int*`
int* foo_ptr;
// assign `foo_ptr` a pointer to the variable `foo`
foo_ptr = &foo;
there are two tokens used when we talk about pointers: * and &:
- when * is to the right of the type, this is to declare/define a variable to hold a pointer
- when * is to the left of a variable name, this is to dereference a pointer and get the underlying value
- when & is to the left of a variable name, this is to get a pointer to that variable
and that’s really it! pointers are no more complicated than that.
memory allocation
by default anything you define will be allocated on the stack. since it’s allocated on the stack, it will be freed once the scope it was defined in is left.
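int foo = 1;
{
    int bar = 2; // bar only exists inside this scope
    printf("foo is %d\n", foo); // "foo is 1"
} // exit scope
printf("bar is %d\n", bar); // error!!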
in the above example we can see that because bar was defined in that new scope, once we left the scope we could no longer access it since it was freed.
you can also allocate to the heap if needed, and for this you will need to manually manage your memory with malloc() and free().
int* foo = malloc(sizeof(int)); // allocate an int on the heap
*foo = 10;
printf("foo is %d\n", *foo);
free(foo); // important!!
here we allocate some memory on the heap with malloc(), use it, and then free it with free(). it’s important that you free the memory at the end, since not doing so can lead to memory leaks (more on this later).
arrays
arrays in c are simple, tightly packed collections of data stored in contiguous memory. that sounds complicated, but let’s unpack what that all means:
- we know that when data is stored in memory, it’s allocated enough space to store that data type
  - for example, if you were to declare a uint32_t then you would have to allocate 4 bytes in memory (32 / 8 == 4 bytes)
- a collection of data is exactly what it sounds like: a bunch of data of the same type grouped together
- when this data is tightly packed, this means that each bit of data is stored one after the other in memory, with no spaces in between
this leads to some interesting things. most notably, and this is probably something you’ll run into pretty quickly as you start writing c, but arrays have a fixed length. and what’s more, if you want to have your array allocated on the stack then you’ll need to make sure that the size of the array is known at compile time. otherwise you need to allocate your array on the heap.
another thing worth noting is that in most expressions an array behaves like a pointer to its first element (you’ll hear this called “decaying” to a pointer), and when you index an array you’re essentially just doing pointer arithmetic. for example:
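int foo[] = {1, 2, 3};
// regular array indexing
printf("the second element is %d\n", foo[1]);
// indexing with pointer arithmetic
int i = 1;
int* second_element = foo + i;
printf("the second element is %d\n", *second_element);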
/*
* output:
* the second element is 2
* the second element is 2
*/
both result in the same output because we’re functionally indexing them in the same way. mind you, you shouldn’t be doing this, but i want to get the point across that arrays are, for most purposes, just pointers to the first element of the array. remember this, because it’s a very important thing (and something that can trip up those new to c).
structs
there are no objects in c, but you still want to create named containers for your data sometimes. this is the purpose of structs in c, and they have very simple syntax:
struct MyStruct {
    int foo;
    int bar;
};
struct MyStruct ms;
ms.foo = 1;
ms.bar = 2;
printf("foo is %d\n", ms.foo);
printf("bar is %d\n", ms.bar);
/*
* foo is 1
* bar is 2
*/
macros
macros are bits of code that are processed at compile time during what is called the pre-processing stage. to understand what this means, take a look at the following code:
#define FOO 10
printf("FOO is %d\n", FOO);
/* output:
* FOO is 10
*/
on a surface level this looks like we’ve defined a variable called FOO but in a strange way. what’s actually happened though is we’ve defined a macro, and macros are a pre-processor directive which will expand at compile time. so the above code once parsed through the pre-processing stage will look like this:
printf("FOO is %d\n", 10);
notice how FOO was replaced by the value 10, which is what we defined FOO as. this is what macro expansion does: replaces all instances of that macro with what it was defined as. this seems not that useful, after all why would you want to do something like this?
to understand this, let’s have a look at the assembly generated by gcc for x86-64 platforms. first, the version with a regular variable:
.LC0:
.string "%d\n"
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 10
mov eax, DWORD PTR [rbp-4]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
nop
leave
ret
and now the version with the macro:
.LC0:
.string "%d\n"
push rbp
mov rbp, rsp
mov esi, 10
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
nop
pop rbp
ret
what the assembly actually does isn’t all that important here. what is important is that the version of the code with the regular variable has a few more instructions, notably the mov DWORD PTR [rbp-4], 10 to assign foo, the mov eax, DWORD PTR [rbp-4] to load foo into the register eax, and then finally mov esi, eax so we can prepare the esi register for our call to printf. in the second version where we just have the macro, we skip all these steps and just do mov esi, 10 to prepare for the call to printf. in short: this allows us to do quite a few things quicker because we’re explicitly skipping the need to store the value in memory and load it back, and instead the value is hardcoded straight into the instruction. (to be fair, this is unoptimised output. with optimisations turned on the compiler will usually do the same thing for the variable.)
the cool and useful stuff
okay you should have a decent idea of what we’re working with here now, so let’s discuss the kinds of things that will (hopefully) change your mind on c–if not to enjoying it, to at least appreciating it for what it is.
unions
unions are a fun one that i actually rarely see talked about outside of the context of lower level programming, and it’s a bit of a shame really! let’s have a look at an example to understand why i think they’re pretty useful:
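enum UnionType { kFoo, kBar };
union MyUnion { int foo; float bar; };
struct MyStruct { enum UnionType u_type; union MyUnion data; };
struct MyStruct ms;
ms.u_type = kFoo;
ms.data.foo = 1;
switch (ms.u_type) {
case kFoo: printf("has foo! value is %d\n", ms.data.foo); break;
case kBar: printf("has bar! value is %f\n", ms.data.bar); break;
}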
/*
* output:
* has foo! value is 1
*/
what we’ve done here is implemented a pattern called a discriminated union. unions in the context of programming are structures which are the size of the largest member, and can only contain one of those members. this sounds kinda goofy and esoteric at best, until you add the discriminated part to it. a discriminated union is just a union with a way to identify what the contained data inside the union is, and this lets you do things like the following:
// assume the above definitions of `enum UnionType`, `union MyUnion`,
// and `struct MyStruct` exist
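struct MyStruct make_item(int n) { // `make_item` is an illustrative name
    struct MyStruct ms;
    if (n % 2 == 1) { ms.u_type = kFoo; ms.data.foo = n; }
    else { ms.u_type = kBar; ms.data.bar = (float)n; }
    return ms;
}
// populate the array
struct MyStruct lotsa_stuff[4];
for (int i = 0; i < 4; i++) lotsa_stuff[i] = make_item(i + 1);
// print out the contents of the array
for (int i = 0; i < 4; i++) {
    if (lotsa_stuff[i].u_type == kFoo) printf("%d: foo (%d)\n", i + 1, lotsa_stuff[i].data.foo);
    else printf("%d: bar (%.1f)\n", i + 1, lotsa_stuff[i].data.bar);
}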
/*
* output:
* 1: foo (1)
* 2: bar (2.0)
* 3: foo (3)
* 4: bar (4.0)
*/
hopefully this makes clear why discriminated unions are so useful. you could almost think of it like polymorphism in oop.
almost.
but don’t take it too far, keep it simple :)
creating systems
a pattern i see all too often with programmers writing object oriented code is the idea of systems abstracted away into a class. it’s a nice idea in theory, but in practice i think leads to quick over-complication if you aren’t careful. this does lead to some programmers trying c and being baffled at how they could possibly make it work in a clean and maintainable way, but here’s a lil trick i’ve learned: y’all be only creating one instance of these classes.
enter the translation unit. in short, a translation unit is whatever you feed into the compiler to create a single object file, which usually has the extension .o. generally this will just be a single .c file, and this is quite a powerful thing. let me show you why:
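// in system.h (function names here are illustrative)
void counter_increment(void);
int counter_get(void);
// in system.c
static int counter = 0;
void counter_increment(void) { counter++; }
int counter_get(void) { return counter; }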
in the above example we’ve created a very simple system for incrementing a counter, but the curious thing is that we have static int counter = 0 declared at the top of the source file. the static keyword is a little different to what you might be used to if you’re coming from oop, because there it usually means that you’re declaring something that is “global” to that class. in c though, it has two meanings depending on context:
- as a variable/function definition in the scope of the source file, it’s a symbol only visible to that translation unit
- as a variable definition inside a function, it’s a variable only visible inside that function which persists between calls
using this knowledge, it’s fairly trivial to then create a system that is self contained, hides members that users of the api shouldn’t be able to see/touch, and is highly maintainable. you can even extend these systems with structs:
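// in system.h (function names here are illustrative)
void system_init(void);
int system_create(void);
void system_increment(int system_id);
void system_shutdown(void);
int system_get_value(int system_id);
// in system.c
#define MAX_SYSTEMS 16 // illustrative limit
struct SystemContext { int counter; };
static int system_count = 0;
static struct SystemContext contexts[MAX_SYSTEMS];
void system_init(void) { system_count = 0; }
int system_create(void) {
    if (system_count >= MAX_SYSTEMS) return -1; // no free contexts left
    contexts[system_count].counter = 0;
    return system_count++;
}
void system_increment(int system_id) {
    if (system_id >= 0 && system_id < system_count) contexts[system_id].counter++;
}
void system_shutdown(void) { system_count = 0; }
int system_get_value(int system_id) {
    if (system_id < 0 || system_id >= system_count) return 0; // invalid id
    return contexts[system_id].counter;
}
this would make the usage look something like this:
system_init();
int id = system_create();
system_increment(id);
system_increment(id);
system_increment(id);
int value = system_get_value(id);
printf("value is %d\n", value);
system_shutdown();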
/*
* output:
* value is 3
*/
it’s a fair bit more code, but the concept at its core is still the same. the only difference is now we have multiple “contexts” for counters, and the user of the system needs to pass through a valid system_id to use the system. you can think of this as your replacements for classes, if you’d like, though of course don’t go overboard with it since it can always be prone to bugs due to over complication. still, this approach ends up being quite nice since it’s easier to optimise due to how much simpler c is to compile down to machine code compared to something like c++, and it’s really not all that hard to understand while you’re at it.
memory management
this is probably the biggest sticking point people have against c, and i do see their point. memory management in c is almost entirely hands off from the language’s perspective, for better or for worse. that can make it incredibly powerful, but also a bit of a foot gun if you’ve never had to do something like this before. and, well, if you’re coming from a language such as c# or java (both of which are garbage collected) then you’re probably terrified of the idea of even touching memory directly!
let’s get this out of the way first though: as much as you can, allocate on the stack. you’re just going to have an easier time managing your memory if most of it is on the stack, since so much of that management is already taken care of for you.
but sometimes you will need to allocate memory on the heap for one reason or another. this is when things get spooky for a lot of people. but it doesn’t have to be! just as long as you try to have some sort of pattern about how you manage memory. there’s some competing ideas out there, but my favourite (since it catches many cases quite elegantly) is to have a very clear owner of allocated heap memory that will manage the allocation and freeing as required.
take for example if we wanted to have a sort of dynamic stack implementation:
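// in stack.h (function names here are illustrative)
void stack_push(int value);
int stack_pop(int* out_value); // returns 0 when the stack is empty
// in stack.c
struct StackNode {
    int value;
    struct StackNode* next;
};
static struct StackNode* stack_head = NULL;
void stack_push(int value) {
    struct StackNode* node = malloc(sizeof(struct StackNode));
    if (node == NULL) return; // allocation failed, nothing is pushed
    node->value = value;
    node->next = stack_head;
    stack_head = node;
}
int stack_pop(int* out_value) {
    if (stack_head == NULL) return 0;
    struct StackNode* node = stack_head;
    *out_value = node->value;
    stack_head = node->next;
    free(node); // the stack owns its nodes, so it frees them too
    return 1;
}
and then if you were to use this code:
stack_push(1);
stack_push(2);
stack_push(3);
int value;
while (stack_pop(&value)) {
    printf("popped value %d\n", value);
}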
/*
* output:
* popped value 3
* popped value 2
* popped value 1
*/
this is just a simple singly linked list used as a stack collection with the elements allocated on the heap. notice though that despite all the memory being allocated on the heap, we don’t have to handle anything related to memory. this is because all the memory management is handled solely within the stack system itself. something like this is ideal so that responsibility is entirely contained within a small area of code, making memory management much easier and your code a lot more maintainable.
but say you can’t use this pattern for one reason or another. you find yourself in a position where you need to allocate memory in all sorts of strange places and it’s hard to ensure that you’re matching every malloc() call with a free(). well, what if you could make it really lazy?
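// in mem_system.h (function names here are illustrative)
void mem_init(void);
void* mem_alloc(size_t size);
void mem_free(void* ptr);
void mem_free_all(void);
// in mem_system.c
#define MAX_ALLOCS 256 // illustrative limit
static void* allocated_memory[MAX_ALLOCS];
static size_t current_allocs = 0;
void mem_init(void) { current_allocs = 0; }
void* mem_alloc(size_t size) {
    if (current_allocs >= MAX_ALLOCS) return NULL; // tracker is full
    void* ptr = malloc(size);
    if (ptr != NULL) allocated_memory[current_allocs++] = ptr;
    return ptr;
}
void mem_free(void* ptr) {
    for (size_t i = 0; i < current_allocs; i++) {
        if (allocated_memory[i] != ptr) continue;
        free(ptr);
        // fill the gap with the last tracked pointer
        allocated_memory[i] = allocated_memory[--current_allocs];
        return;
    }
}
void mem_free_all(void) {
    for (size_t i = 0; i < current_allocs; i++) free(allocated_memory[i]);
    current_allocs = 0;
}
with the usage of the above code looking like this:
mem_init();
int* foo = mem_alloc(sizeof(int));
float* bar = mem_alloc(sizeof(float));
*foo = 10;
*bar = 3.14f;
printf("foo is %d\n", *foo);
printf("bar is %f\n", *bar);
mem_free_all();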
this is a simple extension on the systems idea from before, but this time tracking the pointers we’ve allocated so that we can allow the system to manage deallocation as needed–including deallocating everything we’ve allocated all at once. truly, it’s like magic!
mind you, this isn’t a perfect system for every application, and certainly there are things you’d want to do in order to make this more useful (for example, you’d 100% want to allow the user of the system to create new allocation trackers so they can have multiple going at the same time that then free only the memory allocated with that tracker) but this should show you the kind of things you can come up with when you use a little creativity.
pitfalls
one of the downsides to c is that because it’s so simple, there’s also very little in the way of guards to stop you from shooting yourself in the foot. this is by no means an exhaustive list of things that may go wrong, but should at least get you started so you don’t fail too hard early on.
strings
strings are kinda weird in c because they’re just an array of chars. i’ll be honest, it’s probably the one thing that i really truly hate about c (that and const). on the surface they seem simple, but with the above knowledge in mind things quickly become a lot more complicated than you may first realise.
first of all, c strings are null terminated. this means that when you do something like this:
const char* foo = "yee haw";
printf("%s\n", foo);
printf() only knows to stop reading chars when it hits a null character (you’ll see it written as '\0' in c). this is important to keep in mind, because if you’re working on a string and fill out the entire array without leaving room for a null character, you run the risk of certain functions reading your string and going well past the bounds of the array until they hit something they interpret as a null character.
next thing to bear in mind is that you generally can’t do something like this:
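char* get_name(void) {
    char to_return[] = "some name";
    return to_return; // bad!! to_return is gone once we return
}
const char* name = get_name();
printf("%s\n", name);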
and the reason for this is that to_return in get_name() is allocated on the stack, and so the lifetime of that memory is until we exit get_name(). if to_return is deallocated but we return a pointer to it, all of a sudden we’re pointing at random junk! a way around this is to use malloc():
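char* get_name(void) {
    char* to_return = malloc(16);
    if (to_return != NULL) strncpy(to_return, "some name", 16);
    return to_return;
}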
so now to_return points to memory on the heap which has a lifetime of until we call free() on it, but this is a bad solution because we’ve now created a whole other problem: who will free the memory? we’ve allocated this memory and now it’s on the caller to free it, but that’s not clear and is prone to bugs. instead, you should do something like what is often seen in the <string.h> header:
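void get_name(char* buffer, size_t buffer_size) {
    if (buffer_size == 0) return;
    strncpy(buffer, "some name", buffer_size - 1);
    buffer[buffer_size - 1] = '\0';
}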
here we place the responsibility on the caller to allocate the memory, telling us where that memory is located and how much space is available for us to write to. because of this, the caller now has full control over where the string is, and can then appropriately handle allocation and deallocation as required.
but hold on a second, what’s the deal with strncpy() there? we need a whole function just to copy a string over?
and this is the last thing i’d like to touch on for strings in c. as soon as you work with strings at all, there’s a 99% chance you’ll also have to start working with the various string functions, which means working pretty directly with memory. again, i really hate strings in c and this is why. it’s just so fiddly and potentially unsafe in subtle ways that may only reveal itself in the most edge of edge cases. and of course, since a lot of user input will be via strings, this can lead to odd crashes/undefined behaviour at best, and security issues at worst.
the best advice i can offer here is to just try and learn by doing. of course, you should use the safe string functions such as strncpy(), strncat(), strncmp(), really any string functions that require you to pass through a max length so that you don’t accidentally read/write past the bounds of any allocated memory. but also, this isn’t foolproof! you gotta make sure that your strings are properly null terminated, and there’s still probably some things that i’m not aware of either!
don’t let this scare you from writing c, but do try to be aware of the dangers of strings in c.
not initialising allocated memory
say we do something like this:
int foo;
printf("foo is %d\n", foo);
/* output:
* foo is 37624896
*/
huh? what’s going on here? we didn’t even assign anything to foo!
and… well this is the problem. c by default doesn’t initialise the memory you allocate, leaving the value to be whatever was last in that bit of memory. this is why (at least when i ran the above code) the value of foo was some weird number.
this is especially problematic with strings, which is why it’s usually a good idea to do the following if you’re defining a buffer for a string:
char buffer[64]; // the size here is just an example
memset(buffer, 0, sizeof(buffer));
this gives you a little safety net, ensuring that all characters are null to begin with so that if for whatever reason something goes wrong while you’re working with the string, there should hopefully be a null character left around to mark the end of the string.
not freeing allocated memory
perhaps you’ve experienced a program that you’ve left open for some time slowly eat up all your memory until you restart it. this is probably a memory leak in action, and this is what happens when you don’t appropriately call free() for every call to malloc() you make. i’ve already covered above some ways to avoid this problem, but i really need to stress that you make sure you free all the memory you allocate on the heap. learn how to use a tool like valgrind so you can track down leaks and confirm if there are even any in the first place.
pointer code style
this is a hot debate among many c programmers. in short, you usually see pointers declared in one of three ways:
int* a;
int *b;
int * c;
all of these ways of declaring a pointer are valid, but can be kinda confusing if you’re new to c and reading someone else’s code. the reasons why people go one way or the other aren’t mega important (though i stick to style a since i think it makes most sense to describe it as “this variable is of type pointer to an int”), just know that these are styles out in the wild you may see.
dangling pointers
say for example you do something like the following:
int* foo = malloc(sizeof(int));
// do some stuff with foo
free(foo);
all is good right? we have allocated the memory and then deallocated it appropriately, so there’s no leaks to be seen here. but there’s still danger! foo still holds the old address, and because of this nothing stops us from dereferencing it to see what’s there. but this is undefined behaviour, since we’ve indicated that the memory is now free for the taking for something else, so now whatever is there is anyone’s guess. this is what we call a dangling pointer, and to deal with dangling pointers it’s usually a good idea to also ensure you set them to NULL:
foo = NULL;
this means that now later in your code if you happen to try and deref a pointer you can do this to ensure it’s safe:
if (foo != NULL) {
    printf("foo is %d\n", *foo);
}
simple! but also easy to forget!
macros
“but dan” i hear you ask, “macros seem pretty harmless from what you’ve explained” and yeah they are. but i’ve also not shown you the other things macros can do. mind you there’s some other helpful uses, such as being able to determine what platform you’re currently compiling for so you can conditionally compile some code and not others, but it can also do some spooky stuff.
you see, i very explicitly didn’t explain function-like macros and how those can be used. reason being is that… well honestly i think they’re a trap. in fact, i think that most code generation stuff is a trap in most situations (yes, this means generics/templates too). mind you, in simple small doses they can be very helpful, but too often they lead to the creation of code that is just painful to maintain. so just trust me on this: stay away from function-like macros. you probably don’t need them anyway.
some closing thoughts
i’ll be honest, while i’ve been sitting on this idea for a post for… probably a year now? i didn’t feel properly compelled to write it until i read this incredible post on the grug brained dev. it’s a goofy post, but if you feel at all encouraged to try writing some c after reading this then i cannot recommend enough that you read it. i think it’ll really open your eyes to a simpler way of programming that not only will make writing c a joy, but also any other language you encounter.
but yeah, i think that c is pretty under-appreciated these days. especially in the current landscape of oop and ever increasing complexity in the service of less code repetition, i think we’ve lost sight of the value of true simplicity (myself included). certainly writing more c in my spare time instead of c++ or even rust has rewired my brain in ways that i think have ultimately led me to write better code at work.
so give it a go, write something fun in c :)