Use #embed from C23 today with the Cedro pre-processor

2022-08-04

C23 is around the corner and we already know which features it will add to the C language.

One of them is the new pre-processor directive #embed, which allows including binary files as byte arrays.

    #include <stdint.h>

    const uint8_t image[] = {
    #embed "../images/cedro-32x32.png"
    };

This behaves as if the array had been written out:

    #include <stdint.h>

    const uint8_t image[] = {
      0x89,0x50,0x4E,0x47,0x0D,0x0A,0x1A,…
      0x00,0x00,0x00,0x20,0x00,0x00,0x00,…
      ⋮
    };

The compiler does not really have to generate this source code, just work “as if” it had done so. This way it is free to insert it directly when linking the object files at the end, without passing it through other compilation phases.
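As a rough illustration of what the embedded data looks like from the program's side, the sketch below treats the array like any other constant array: `sizeof` gives its length and the bytes can be inspected directly. The array here is a hand-written stand-in holding only the first 24 bytes of the PNG (the ones visible in this article's string-literal example), and `has_png_signature` and `read_be32` are hypothetical helper names, not part of Cedro or C23.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for what `#embed "../images/cedro-32x32.png"` would produce.
   Only the first 24 bytes are written out; the real array would hold
   the whole file. */
const uint8_t image[] = {
  0x89,0x50,0x4E,0x47,0x0D,0x0A,0x1A,0x0A, /* PNG signature           */
  0x00,0x00,0x00,0x0D,0x49,0x48,0x44,0x52, /* chunk length 13, "IHDR" */
  0x00,0x00,0x00,0x20,0x00,0x00,0x00,0x20  /* width 32, height 32     */
};

/* Hypothetical helper: returns 1 if the buffer starts with the
   8-byte PNG signature, 0 otherwise. */
int has_png_signature(const uint8_t *data, size_t size)
{
  static const uint8_t signature[8] =
    { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };
  return size >= sizeof signature
      && memcmp(data, signature, sizeof signature) == 0;
}

/* Hypothetical helper: reads a 32-bit big-endian integer,
   the byte order PNG uses for its header fields. */
uint32_t read_be32(const uint8_t *p)
{
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
       | ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}
```

With this stand-in, `has_png_signature(image, sizeof image)` returns 1 and `read_be32(image + 16)` gives the image width, 32. The same calls would work unchanged on an array filled in by `#embed`.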

While we wait for our favourite compiler to incorporate this feature, we can start using it immediately with the Cedro pre-processor.

The only difference in the source code is a special “pragma” that tells Cedro to take care of the #embed instead of passing it to the compiler.

    #include <stdint.h>
    #pragma Cedro 1.0 #embed

    const uint8_t image[] = {
    #embed "../images/cedro-32x32.png"
    };

Cedro transforms it into:

    #include <stdint.h>

    const uint8_t image[] = {
      /* cedro-32x32.png */
      0x89,0x50,0x4E,0x47,0x0D,0x0A,0x1A,…
      0x00,0x00,0x00,0x20,0x00,0x00,0x00,…
      ⋮
    };

When we switch to a compiler that implements #embed, we will only have to remove the line #pragma Cedro 1.0 #embed.

Cedro can be used in two ways:

As a filter that transforms the file into standard C source code:
    cedro program.c >program-standard.c
As a wrapper that automatically runs the cc compiler:
    cedrocc program.c -o program

Embed as string

Isn’t it inefficient to generate all those byte literals and compile them? Yes, it is, and that is why #embed had to be added to the C23 standard. Cedro can use another, more efficient technique, although whether it works for a particular binary file depends on the C compiler: the option --embed-as-string=1234567, for instance, tells Cedro to use string literals if the binary file is smaller than 1234567 bytes. A bigger file is inserted with byte literals, which is less efficient but more compatible.
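To make the string-literal technique more concrete, here is a minimal sketch of how a tool could spell arbitrary bytes inside a C string literal. It is not Cedro's implementation: `escape_byte` and `escape_bytes` are hypothetical names, and as a simplification the sketch always uses three-digit octal escapes for non-printable bytes instead of short forms such as `\n`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not Cedro's actual code: writes the spelling of
   one byte inside a C string literal into `out` (room for 5 chars)
   and returns the number of characters written, excluding the NUL. */
size_t escape_byte(uint8_t byte, char out[5])
{
  if (byte == '"' || byte == '\\') {
    /* Quote and backslash need a backslash in front. */
    return (size_t)snprintf(out, 5, "\\%c", byte);
  }
  if (byte >= 0x20 && byte <= 0x7E && byte != '?') {
    /* Printable ASCII is copied as-is; '?' is escaped below so that
       the output can never contain a trigraph. */
    return (size_t)snprintf(out, 5, "%c", byte);
  }
  /* Everything else becomes a three-digit octal escape. Octal escapes
     take at most three digits, so a digit that follows is not absorbed. */
  return (size_t)snprintf(out, 5, "\\%03o", (unsigned)byte);
}

/* Escapes a whole buffer into `out` as a NUL-terminated string.
   Returns 0 on success, -1 if `out` is too small. */
int escape_bytes(const uint8_t *data, size_t size, char *out, size_t out_size)
{
  size_t length = 0;
  if (out_size == 0) return -1;
  out[0] = '\0';
  for (size_t i = 0; i < size; ++i) {
    if (out_size - length < 5) return -1; /* Worst case: 4 chars + NUL. */
    length += escape_byte(data[i], out + length);
  }
  return 0;
}
```

Always padding octal escapes to three digits is a deliberately simple choice: it produces longer text than short escapes would, but the result does not depend on which byte comes next.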

    #pragma Cedro 1.0 #embed

    const char fragment_shader[] = {
    #embed "shader.frag.glsl"
      , 0x00 // Zero-terminated string.
    };

With --embed-as-string=170, Cedro generates:

    const char fragment_shader[164] = /* shader.frag.glsl */
    "#version 140\n"
    "\n"
    "precision highp float; // needed only for version 1.30\n"
    "\n"
    "in vec3 ex_Color;\n"
    "out vec4 out_Color;\n"
    "\n"
    "void main(void)\n"
    "{\n"
    "\tout_Color = vec4(ex_Color,1.0);\n"
    "}\n";

Note how Cedro notices that the last byte is zero and removes it: since a string literal comes just before it, the compiler will add the zero terminator automatically.

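In the next example, since there is no zero byte after the #embed line, the size 1559 corresponds exactly to the size of the file. When a character array is declared with an explicit size that leaves no room for the terminator, C does not store it, so we avoid having an excess zero at the end because of using strings.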

    #include <stdint.h>
    #pragma Cedro 1.0 #embed

    const uint8_t image[] = {
    #embed "../images/cedro-32x32.png"
    };

With --embed-as-string=1600, Cedro generates:

    #include <stdint.h>

    const uint8_t image[1559] = /* cedro-32x32.png */
    "\211PNG\r\n"
    "\032\n"
    "\000\000\000\rIHDR\000\000\000 \000\000\000 \b\002…"
    ⋮
    "…";
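The role of the explicit array size can be checked with a tiny stand-alone sketch. The array names below are made up for illustration, and the data is a three-letter string rather than a real file. Note that initializing an array with a string literal that exactly fills it is valid C but not valid C++, and some compilers can be configured to warn about it.

```c
#include <assert.h>
#include <stdint.h>

/* No explicit size: the compiler appends the zero terminator,
   so the array holds 4 bytes. */
const char with_terminator[] = "abc";

/* Explicit size equal to the number of characters: the terminator
   does not fit and is dropped, so the array holds exactly 3 bytes. */
const uint8_t exact_size[3] = "abc";

/* Zero bytes inside the literal are ordinary data: these are the four
   bytes 0x00,0x00,0x00,0x20 (the last character is a space). */
const uint8_t width_field[4] = "\000\000\000 ";

/* Compile-time checks of the sizes claimed in the comments above. */
static_assert(sizeof with_terminator == 4, "terminator is included");
static_assert(sizeof exact_size == 3, "terminator is dropped");
static_assert(sizeof width_field == 4, "embedded zeros are kept");
```

This is the same mechanism that lets the generated `image[1559]` have exactly the size of the file, while an array declared without a size gets one extra zero byte at the end.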

The difference can be notable with big files. For instance, these measurements correspond to an 8 MB file, compiled on an IBM Power9 CPU on Linux, with all files on a RAM disk:

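| Command | Generate code: time | Generate code: memory | Compile with GCC 11: time | Compile with GCC 11: memory | Compile with clang 12: time | Compile with clang 12: memory |
|---|---|---|---|---|---|---|
| `cedro` | 0.19 s | 1.80 MB | 27.66 s | 1328.13 MB | 30.44 s | 984.59 MB |
| `cedro --embed-as-string=…` | 0.14 s | 1.63 MB | 0.99 s | 125.32 MB | 0.42 s | 144.95 MB |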

There are other ways of incorporating files into C programs. The advantage of using Cedro is that the source code stays the same, which makes it very easy to switch between compilers by only adding or removing the #pragma line.

Cedro offers other features that might be interesting for you:

The backstitch macro: @
Deferred resource release: auto ... or defer ...
Break out of nested loops: break label;
Notation for array slices: array[start..end]
Block macros: #define { ... #define }
Loop macros: #foreach { ... #foreach }
Binary inclusion: #include {...} / #embed "..."
Better number literals: 12'34 | 12_34 → 1234, 0b1010 → 0xA

Cedro can be downloaded, under the Apache 2.0 license, from its web page and from GitHub.