Flavors of enums with Rust bindgen

How to generate a Rust enum from C code

Discovering new options of enum generation

While updating the libbpf-sys bindings after the release of libbpf v0.8.0, I realized I wasn't a fan of how libbpf's enum bpf_func_id was being generated by rust-bindgen, a tool that generates Rust FFI bindings to C libraries. This isn't an overview of how to use bindgen but a deep dive into its enum generation options.

The current bindgen translation of bpf_func_id looks like this:

pub const BPF_FUNC_unspec: bpf_func_id = 0;
pub const BPF_FUNC_map_lookup_elem: bpf_func_id = 1;
pub const BPF_FUNC_map_update_elem: bpf_func_id = 2;
pub const BPF_FUNC_map_delete_elem: bpf_func_id = 3;
pub const BPF_FUNC_probe_read: bpf_func_id = 4;
pub const BPF_FUNC_ktime_get_ns: bpf_func_id = 5;

You can ignore the odd casing and prefix; those come from the C source, not from bindgen. What bothered me was this: why couldn't the enum be translated directly into a proper Rust enum? It turns out bindgen implements several different ways of mapping C enums into Rust.

At the time of this writing, the latest version of rust-bindgen is v0.59.2.

A GitHub repo with examples

I ended up creating a repository dedicated to showing the different styles here: @mdaverde/bindgen-enum-flavors

You can see most of the options for enum generation in the repo's build.rs. I think the API could be simplified but it's straightforward enough to understand quickly. The generated bindings can be seen in src/bindings.rs and the original C header file is enum.h.

The flavors and descriptions of each

constified_enum

This is the default when no other flavor is specified.

Origin:

enum meals {
    breakfast,
    lunch,
    dinner
};

Bindgen:

pub const meals_breakfast: meals = 0;
pub const meals_lunch: meals = 1;
pub const meals_dinner: meals = 2;
pub type meals = ::std::os::raw::c_uint;

You can see that it generates just Rust consts with the same type specified. This is similar to how C enums work in practice.

In the options, you can also choose whether the enum name is prepended to each constant and whether the integer type is c_uint or a fixed-size type such as u32.

constified_enum_module

Bindgen:

pub mod game {
    pub type Type = ::std::os::raw::c_uint;
    pub const win: Type = 0;
    pub const lose: Type = 1;
    pub const draw: Type = 2;
}

This still keeps the const nature of the enum values but wraps them in a module for encapsulation and more Rust-like ergonomics: game::lose.
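To get a feel for the ergonomics, here is a small sketch that copies the generated module from above and matches on it. The describe helper is hypothetical and only there for illustration. Since game::Type is still a plain integer, the match needs a wildcard arm.

```rust
// Copy of the bindgen output shown above, so the example compiles on its own.
#[allow(non_upper_case_globals)]
pub mod game {
    pub type Type = ::std::os::raw::c_uint;
    pub const win: Type = 0;
    pub const lose: Type = 1;
    pub const draw: Type = 2;
}

// Hypothetical helper: turn a result that came back from C into a label.
fn describe(result: game::Type) -> &'static str {
    match result {
        game::win => "win",
        game::lose => "lose",
        game::draw => "draw",
        // game::Type is just an integer, so any other value is possible.
        _ => "unknown",
    }
}

fn main() {
    println!("{}", describe(game::lose));
    println!("{}", describe(7));
}
```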

newtype_enum

Bindgen:

impl planet {
    pub const earth: planet = planet(0);
}
impl planet {
    pub const jupiter: planet = planet(1);
}
impl planet {
    pub const saturn: planet = planet(2);
}
impl planet {
    pub const mars: planet = planet(3);
}
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub struct planet(pub ::std::os::raw::c_uint);

Because planet is generated as a distinct struct rather than an integer alias, you can implement traits and methods for it, which is pretty useful for extending the enum with custom functionality.
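As a minimal sketch of that idea, the snippet below reproduces the generated type (with the impl blocks merged for brevity) and adds a Display implementation on top. The Display impl is my own illustration, not something bindgen generates.

```rust
use std::fmt;

// Reproduction of the bindgen output above, impl blocks merged for brevity.
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
#[allow(non_camel_case_types)]
pub struct planet(pub ::std::os::raw::c_uint);

#[allow(non_upper_case_globals)]
impl planet {
    pub const earth: planet = planet(0);
    pub const jupiter: planet = planet(1);
    pub const saturn: planet = planet(2);
    pub const mars: planet = planet(3);
}

// Custom functionality layered on the generated type (illustrative only).
impl fmt::Display for planet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            planet::earth => write!(f, "earth"),
            planet::jupiter => write!(f, "jupiter"),
            planet::saturn => write!(f, "saturn"),
            planet::mars => write!(f, "mars"),
            // The inner integer is public, so unknown values can still show up.
            planet(other) => write!(f, "unknown planet ({})", other),
        }
    }
}

fn main() {
    println!("{}", planet::mars);
    println!("{}", planet(9));
}
```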

bitfield_enum

This flavor is newtype_enum with the bitwise operator traits implemented as well.

Bit impl example:

impl ::std::ops::BitAndAssign for animal {
    #[inline]
    fn bitand_assign(&mut self, rhs: animal) {
        self.0 &= rhs.0;
    }
}
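Here is a rough sketch of how those operator impls get used to combine and test flags. The variant names and values (cat, dog, bird) and the contains helper are hypothetical, and the BitOr and BitAnd impls are written by hand here to mirror the style of the generated BitAndAssign above.

```rust
// Hand-written stand-in for a bitfield_enum binding; variants are hypothetical.
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
#[allow(non_camel_case_types)]
pub struct animal(pub ::std::os::raw::c_uint);

#[allow(non_upper_case_globals)]
impl animal {
    pub const cat: animal = animal(1);
    pub const dog: animal = animal(2);
    pub const bird: animal = animal(4);
}

impl ::std::ops::BitOr for animal {
    type Output = animal;
    #[inline]
    fn bitor(self, rhs: animal) -> animal {
        animal(self.0 | rhs.0)
    }
}

impl ::std::ops::BitAnd for animal {
    type Output = animal;
    #[inline]
    fn bitand(self, rhs: animal) -> animal {
        animal(self.0 & rhs.0)
    }
}

// Hypothetical helper: is every bit of `flag` set in `set`?
fn contains(set: animal, flag: animal) -> bool {
    (set & flag) == flag
}

fn main() {
    let pets = animal::cat | animal::dog;
    println!("{:?} has dog: {}", pets, contains(pets, animal::dog));
    println!("{:?} has bird: {}", pets, contains(pets, animal::bird));
}
```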

rustified_enum

Bindgen:

#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum color {
    purple = 0,
    red = 1,
    blue = 2,
    green = 3,
    yellow = 4,
    pink = 5,
    indigo = 6,
    brown = 7,
    black = 8,
    white = 9,
}

This is nice! So clean. But it comes with potentially unsafe tradeoffs. With all the previous flavors, a user can create values of the type that aren't part of the original enum:

// constified_enum
let meals_supper: meals = 3; // Not part of the original enum
// constified_enum_module
let postponed: game::Type = 3; // Not part of the original enum
// newtype_enum
let pluto: planet = planet(4); // Sad

This is allowed by design. Why? Because in C it's allowed (whether it's recommended is another matter). Enums are basically just ints, so a C function whose signature specifies an enum can still return a "foreign" value.

This means that if you choose rustified_enum, you have to be sure that foreign values never come back from your FFI functions, because a Rust enum holding a value outside its declared variants is undefined behavior. If you own the C library and you know that no foreign values will be returned when an enum is expected, then use this flavor. Otherwise, you're better off sticking to the other styles.
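If you want a clean Rust enum anyway, one option is to bind with an integer-based flavor and convert explicitly at the boundary. The following is a sketch under that assumption: color_raw stands in for the integer type bindgen would generate, and Color is a hand-written enum (trimmed to three variants), not bindgen output.

```rust
use std::convert::TryFrom;

// Stand-in for the integer alias an integer-based flavor would generate.
#[allow(non_camel_case_types)]
pub type color_raw = ::std::os::raw::c_uint;

// Hand-written Rust enum (hypothetical), trimmed to three variants.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum Color {
    Purple = 0,
    Red = 1,
    Blue = 2,
}

impl TryFrom<color_raw> for Color {
    // On failure, hand the unrecognized raw value back to the caller.
    type Error = color_raw;

    fn try_from(raw: color_raw) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Color::Purple),
            1 => Ok(Color::Red),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

fn main() {
    let from_c: color_raw = 1;
    let foreign: color_raw = 42;
    println!("{:?}", Color::try_from(from_c));
    println!("{:?}", Color::try_from(foreign));
}
```

A foreign value becomes an ordinary Err instead of an invalid enum value, at the cost of keeping the match arms in sync with the header by hand.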

rustified_non_exhaustive_enum

The same as rustified_enum but with the #[non_exhaustive] attribute added.

Update - 06/14/22: As per feedback, I wanted to add clarification to this. This can also cause UB! The reason is that #[non_exhaustive] only forces users of the enum to handle the possibility of other variants; it does not require the compiler to assume that other variants exist. In other words, the compiler can decide not to include the wildcard arm in the final binary if it determines that the arm is dead code. Therefore, if your FFI function returns a value outside of the declared variants, this is undefined behavior. For more information, check out this issue and this example.

Priority of enum flavors

Bindgen applies a priority order to the enum flavors in case more than one is specified for the same enum (most likely through overlapping patterns). When there is a conflict, this is the order:

  1. constified_enum_module
  2. bitfield_enum
  3. newtype_enum
  4. rustified_enum
  5. rustified_non_exhaustive_enum
  6. constified_enum (default)

You can change the default with the default_enum_style option.
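The resolution rule above can be modeled in a few lines. To be clear, this is my own toy model of the ordering, not bindgen's code: Style and resolve are hypothetical names, and requested stands for the set of flavors whose patterns matched a given enum.

```rust
// Toy model of the priority order; not bindgen's actual types.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum Style {
    ConstifiedModule,
    Bitfield,
    Newtype,
    Rustified,
    RustifiedNonExhaustive,
    Constified,
}

// Pick the highest-priority requested style, or the default if none matched.
fn resolve(requested: &[Style], default: Style) -> Style {
    const PRIORITY: [Style; 6] = [
        Style::ConstifiedModule,
        Style::Bitfield,
        Style::Newtype,
        Style::Rustified,
        Style::RustifiedNonExhaustive,
        Style::Constified,
    ];
    PRIORITY
        .iter()
        .copied()
        .find(|style| requested.contains(style))
        .unwrap_or(default)
}

fn main() {
    // Both newtype_enum and rustified_enum matched: newtype wins.
    let both = [Style::Rustified, Style::Newtype];
    println!("{:?}", resolve(&both, Style::Constified));
    // Nothing matched: fall back to whatever the default style is.
    println!("{:?}", resolve(&[], Style::Rustified));
}
```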

Conclusion

I debated these options for bpf-rs but eventually decided it was more fruitful to define the enum for the eBPF helpers myself. The downside is that this enum will need to be kept in sync with future libbpf updates, but that shouldn't be often (famous last words), and I wrote a cargo test to help me catch mismatches.
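This is not the actual test from bpf-rs, but a minimal sketch of what such a check could look like. The bindings module stands in for the generated constants shown at the top of this post, and BpfHelper and mismatches are hypothetical names.

```rust
// Stand-in for the bindgen-generated constants (first three only).
#[allow(non_upper_case_globals, non_camel_case_types, dead_code)]
mod bindings {
    pub type bpf_func_id = ::std::os::raw::c_uint;
    pub const BPF_FUNC_unspec: bpf_func_id = 0;
    pub const BPF_FUNC_map_lookup_elem: bpf_func_id = 1;
    pub const BPF_FUNC_map_update_elem: bpf_func_id = 2;
}

// Hypothetical hand-maintained enum.
#[repr(u32)]
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum BpfHelper {
    Unspec = 0,
    MapLookupElem = 1,
    MapUpdateElem = 2,
}

// Return the names of variants whose value differs from the binding.
fn mismatches() -> Vec<&'static str> {
    let pairs = [
        ("unspec", BpfHelper::Unspec as u32, bindings::BPF_FUNC_unspec as u32),
        ("map_lookup_elem", BpfHelper::MapLookupElem as u32, bindings::BPF_FUNC_map_lookup_elem as u32),
        ("map_update_elem", BpfHelper::MapUpdateElem as u32, bindings::BPF_FUNC_map_update_elem as u32),
    ];
    let mut bad = Vec::new();
    for &(name, ours, theirs) in pairs.iter() {
        if ours != theirs {
            bad.push(name);
        }
    }
    bad
}

// Runs under `cargo test`.
#[test]
fn helper_ids_match_bindings() {
    assert!(mismatches().is_empty());
}

fn main() {
    println!("mismatched variants: {:?}", mismatches());
}
```

A check like this only catches changed values for variants already listed; new variants added upstream would still need a separate check.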
