Most C tutorials start with the same line: #include <stdio.h>, but your program doesn't have to. We answer the question: Is C even usable without a Standard Library (libc)?
The answer is yes, but you have to do some work yourself. We restrict ourselves to competitive programming, so that our bad decisions here do not come back to haunt us. First, let's give some motivation.
Why would you do this?
Consider a simple Hello, World! program.
#include <stdio.h>
int main()
{
    puts("Hello, World!");
}
We then produce an executable using gcc -Wall -O2 helloworld.c -o helloworld. For what it does, we find it includes a lot of sections that aren't relevant to what we're doing.
[aten@machine misc]$ size -A helloworld
helloworld :
section size addr
.note.gnu.build-id 36 848
.interp 28 884
.gnu.hash 28 912
.dynsym 168 944
.dynstr 141 1112
.gnu.version 14 1254
.gnu.version_r 48 1272
.rela.dyn 192 1320
.rela.plt 24 1512
.init 27 4096
.plt 32 4128
.text 281 4160
.fini 13 4444
.rodata 18 8192
.eh_frame_hdr 36 8212
.eh_frame 116 8248
.note.gnu.property 64 8368
.note.ABI-tag 32 8432
.init_array 8 15824
.fini_array 8 15832
.dynamic 480 15840
.got 40 16320
.got.plt 32 16360
.data 16 16392
.bss 8 16408
.comment 27 0
Total 1917
This "overhead" is justified by the scaffolding that you need for memory management, security features, and runtime initialization that occurs before main() even starts. But what if we weren't concerned with such features? The following C binary drops all of that and prints the same message in roughly a third of the space (689 bytes of sections versus 1917).
[aten@machine nlctest]$ ./hello
Hello, World!
[aten@machine nlctest]$ size -A hello
hello :
section size addr
.text 218 4198400
.rodata 15 4202496
.note.gnu.property 48 4202512
.bss 408 4206656
Total 689
It's worth noting that if we truly wanted the smallest possible binary, we would reach for handwritten assembly. While sites like DMOJ do support NASM x86_64, most competitive programming platforms don't.
Losing the Standard Library
Most of the added sections are a result of linking against glibc. By ditching the standard library, we avoid this altogether. This means we have to implement our own way to do I/O, and find our own way to read ints and strings.
In competitive programming, many believe that manually reading from stdin with getchar() is faster than scanf or cin in C++. In the tips page of dmoj.ca, we find the following snippet.
Finally, if the problem only requires unsigned integral data types to be read, you can prepend this macro to the top of your source:
#define scan(x) do{ \
    int _; \
    while(((x)=getchar()) < '0' && (x) != -1); \
    if((x) != -1) { \
        for((x)-='0'; '0' <= (_=getchar()) && _ <= '9'; (x)=10*(x)+_-'0'); \
    } \
} while(0)
This suggests we only need to provide an implementation of getchar(). After that, we may implement our own custom logic for reading in negative integers, floats, etc.
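As a rough illustration of that "custom logic", here is a minimal sketch of a signed integer reader layered on any getchar-like function. The names read_long, next_char_fn, demo_input and demo_next are hypothetical and not part of the original program; the string-backed source exists only so the sketch can be tried without stdin. It assumes the value fits in a long and does not guard against overflow.

```c
/* Hypothetical type: any function that behaves like getchar(),
   returning the next byte or -1 at end of input. */
typedef int (*next_char_fn)(void);

/* Hypothetical helper: read one signed decimal integer.
   Returns 0 if the input ends before a number starts.
   Assumption: no overflow handling. */
static long read_long(next_char_fn next) {
    int c = next();
    /* Skip everything that cannot start a number. */
    while (c != -1 && c != '-' && (c < '0' || c > '9')) {
        c = next();
    }
    if (c == -1) return 0;

    int negative = 0;
    if (c == '-') {
        negative = 1;
        c = next();
    }

    long value = 0;
    while (c >= '0' && c <= '9') {
        value = value * 10 + (c - '0');
        c = next();
    }
    return negative ? -value : value;
}

/* Demo-only character source backed by a string (hypothetical). */
static const char *demo_input = "";
static int demo_next(void) {
    return *demo_input ? (unsigned char)*demo_input++ : -1;
}
```

Like the macro, this consumes the one character that follows the number. Everything here still needs a getchar() to plug in, which is the piece we have to supply ourselves.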
To do so, we implement a syscall wrapper, syscall3, to adhere to DRY (don't repeat yourself). Our environment is x86_64 Linux, so the following suffices:
static long syscall3(long number, long arg1, long arg2, long arg3) {
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a" (ret)
        : "0" (number), "D" (arg1), "S" (arg2), "d" (arg3)
        : "cc", "rcx", "r11", "memory"
    );
    return ret;
}
From there, a quick look at the syscall table has us arrive at the following.
#define SYS_read 0
int getchar(void) {
    static unsigned char buf;
    long res = syscall3(SYS_read, 0, (long)&buf, 1);
    if (res <= 0) return -1;
    return buf;
}
Note that in a real implementation, we would use a larger buffer and only make the syscall when our buffer is empty. Without such an optimization, our implementation may actually be slower than scanf.
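A minimal sketch of that buffered variant follows. The name buffered_getchar and the 4096-byte BUF_SIZE are assumptions made for illustration, not the values used in the original program; syscall3 is repeated only so the block stands on its own, and it is specific to x86_64 Linux.

```c
#define SYS_read 0
#define BUF_SIZE 4096 /* assumed size, tune as needed */

/* Same wrapper as in the article (x86_64 Linux only). */
static long syscall3(long number, long arg1, long arg2, long arg3) {
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a" (ret)
        : "0" (number), "D" (arg1), "S" (arg2), "d" (arg3)
        : "cc", "rcx", "r11", "memory"
    );
    return ret;
}

/* Hypothetical buffered reader: one read syscall per BUF_SIZE bytes
   instead of one per character. Returns -1 on EOF or error. */
static int buffered_getchar(void) {
    static unsigned char buf[BUF_SIZE];
    static long pos = 0; /* index of the next unread byte */
    static long len = 0; /* number of valid bytes in buf */

    if (pos == len) {
        /* Buffer exhausted: refill from stdin (fd 0). */
        len = syscall3(SYS_read, 0, (long)buf, BUF_SIZE);
        pos = 0;
        if (len <= 0) {
            len = 0; /* keep pos == len so the next call retries */
            return -1;
        }
    }
    return buf[pos++];
}
```

The trade-off is a larger .bss section in exchange for far fewer syscalls on big inputs.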
Similarly, we can easily implement putchar and pu (print unsigned integer). Let's skip that for now, and implement a simple Hello, World! program. Link to the full program.
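For reference, the skipped output helpers might look roughly like the sketch below. The names put_char, format_unsigned and put_unsigned are hypothetical stand-ins (the full program may define putchar and pu differently), and syscall3 is repeated only to keep the block self-contained on x86_64 Linux.

```c
#define SYS_write 1

/* Same wrapper as in the article (x86_64 Linux only). */
static long syscall3(long number, long arg1, long arg2, long arg3) {
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a" (ret)
        : "0" (number), "D" (arg1), "S" (arg2), "d" (arg3)
        : "cc", "rcx", "r11", "memory"
    );
    return ret;
}

/* Hypothetical putchar stand-in: write one byte to stdout (fd 1).
   Returns the byte written, or -1 on failure. */
static int put_char(int c) {
    unsigned char byte = (unsigned char)c;
    long res = syscall3(SYS_write, 1, (long)&byte, 1);
    return res == 1 ? byte : -1;
}

/* Write the decimal digits of v into out (needs room for 20 bytes,
   no terminating NUL) and return how many digits were written. */
static int format_unsigned(unsigned long v, char *out) {
    char tmp[20]; /* 2^64 - 1 has 20 decimal digits */
    int n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10); /* least significant first */
        v /= 10;
    } while (v != 0);
    for (int i = 0; i < n; i++) {
        out[i] = tmp[n - 1 - i]; /* reverse into reading order */
    }
    return n;
}

/* Hypothetical pu stand-in: print v with a single write syscall. */
static void put_unsigned(unsigned long v) {
    char digits[20];
    int n = format_unsigned(v, digits);
    syscall3(SYS_write, 1, (long)digits, n);
}
```

Formatting into a local buffer first means each number costs one syscall rather than one per digit. The Hello, World! program that follows only needs the single-character writer.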
...
int main(int argc, char *argv[]) {
    char *s = "Hello, World!\n";
    for (int i = 0; i < 14; i++) {
        putchar(s[i]);
    }
}
Unfortunately, we get a segfault, with or without the return 0;.
[aten@machine nlctest]$ ./hello
Hello, World!
Segmentation fault (core dumped) ./hello
We are very used to glibc taking care of entry/exit of main. Note that on many competitive programming platforms (citation needed), partial marks/passing is granted despite having UB/segfaults. But for completeness, we will handle this.
The culprit is that main has nothing to return to. Without libc's startup code, the kernel jumps straight to the program's entry point, and no return address is pushed onto the stack. When main returns, whatever sits at the top of the stack is popped into RIP (the instruction pointer), and execution jumps somewhere it's not allowed to. To keep main clean, we define _start as the true starting point of the program, within which we call main and exit gracefully with a syscall.
#define SYS_exit 60
void _start(void) {
    /* argc and argv are not recovered from the stack, so main gets dummies */
    int ret = main(0, 0);
    syscall3(SYS_exit, (long)ret, 0, 0);
}
Since we are not so concerned about the additional features gcc has, to further reduce the size of the binary, we use the following Makefile:
CC = gcc
CFLAGS = -Os -fno-asynchronous-unwind-tables -fno-stack-protector \
-fno-ident -ffreestanding -nostdlib -static \
-Isysroot/include
LDFLAGS = -s -Wl,--build-id=none -Wl,--no-dynamic-linker
hello: hello.c
	$(CC) $(CFLAGS) $(LDFLAGS) -o hello hello.c
	@strip -s hello
	@ls -lh hello
clean:
	rm -f hello
One potential optimization is to ditch our putchar altogether, and simply print using a syscall.
void _start() {
    const char msg[] = "Hello\n";
    syscall3(1, 1, (long)msg, 6); /* SYS_write: 6 bytes to fd 1 (stdout) */
    syscall3(60, 0, 0, 0);        /* SYS_exit with status 0 */
}
Both approaches give us our tiny binary, as desired.