How can we make enums better in C#?
I wrote about several complaints I have with C#, and one of them was enums and how they are really just type aliases for the integer type that backs them. I also mentioned how in languages like Haskell and Rust, data constructors and enum variants are not actually types but instead functions. While there is no denying that true sum types would be useful in those languages, Haskell's data declarations and Rust's enums are still very pleasant to work with. How can we simulate such a thing in C#?
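To make the complaint about enums concrete, here is a minimal sketch. The Color enum and the EnumDemo class are hypothetical names invented for illustration. It shows that any value of the backing integer type can be cast to the enum, which is why a switch over an enum still needs a catch-all arm.

```csharp
using System;

// Hypothetical enum, backed by int by default.
enum Color { Red, Green, Blue }

static class EnumDemo {
    // Any int can be cast to Color, whether or not a named member exists for it.
    internal static Color FromInt(int raw) => (Color)raw;

    internal static string Describe(Color color) => color switch {
        Color.Red => "red",
        Color.Green => "green",
        Color.Blue => "blue",
        // Without this arm, an unnamed value such as 42 would make the switch throw at run time.
        _ => $"not a named color: {(int)color}",
    };

    static void Main() {
        var bogus = FromInt(42);
        Console.WriteLine(Enum.IsDefined(typeof(Color), bogus)); // False
        Console.WriteLine(Describe(bogus)); // not a named color: 42
    }
}
```

Nothing stops the cast, so the set of named members never bounds the set of values the type can hold.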
The answer is to build our own tagged unions. We define a struct that contains an enum as its tag and all the possible variants as fields. We don’t expose the constructors but instead only expose static functions whose names correspond to the names of our variants. Assuming we are dealing with types that can be manually aligned, we then use System.Runtime.InteropServices.StructLayoutAttribute to define the field layout explicitly, which among other things lets us stack fields on top of each other and so reduce the size of the struct.
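To see the stacking idea in isolation before the full example, here is a minimal sketch. IntOrDouble, IntAndDouble, and their members are hypothetical names, not part of the example that follows. It assumes a runtime where System.Runtime.CompilerServices.Unsafe is available (.NET Core 3.0 or later).

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical union: both payloads start at offset 0 and share the same 8 bytes.
[StructLayout(LayoutKind.Explicit, Size = 16)]
readonly struct IntOrDouble {
    [FieldOffset(0)] readonly long _int;
    [FieldOffset(0)] readonly double _double;
    [FieldOffset(8)] internal readonly Kind Tag;

    // Every field must be assigned, so the unused payload is zeroed first.
    IntOrDouble(long value) { _double = default; _int = value; Tag = Kind.Int; }
    IntOrDouble(double value) { _int = default; _double = value; Tag = Kind.Double; }

    internal static IntOrDouble Int(long value) => new(value);
    internal static IntOrDouble Double(double value) => new(value);

    internal long IntoInt() => Tag == Kind.Int ? _int : throw new InvalidOperationException("Not an Int.");
    internal double IntoDouble() => Tag == Kind.Double ? _double : throw new InvalidOperationException("Not a Double.");

    internal enum Kind : byte { Int, Double }
}

// The same three pieces of data without overlap, for a size comparison.
[StructLayout(LayoutKind.Sequential)]
struct IntAndDouble {
    public long Int;
    public double Double;
    public byte Tag;
}

static class Program {
    static void Main() {
        Console.WriteLine(IntOrDouble.Int(7).IntoInt()); // 7
        // The overlapped struct is smaller than the one that stores both payloads side by side.
        Console.WriteLine(Unsafe.SizeOf<IntOrDouble>() < Unsafe.SizeOf<IntAndDouble>()); // True
    }
}
```

Overlapping two plain numeric fields is the easy case. Object references may only overlap other object references, which is why the layout has to be planned by hand.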
// The Std namespaces are not part of the .NET base class library;
// IInto, ITryInto, Result, and Bottom used below come from them.
using Std;
using Std.Convert;
using Std.Result;
using System;
using System.Runtime.InteropServices;
namespace Example {
    // The explicit offsets and sizes below are chosen for 8-byte object references (a 64-bit process).
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode, Pack = 8, Size = 8)]
    readonly struct Foo {
        [FieldOffset(0)] internal readonly ulong Val;
    }
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode, Pack = 8, Size = 8)]
    readonly struct Bar {
        public Bar() => Val = string.Empty;
        [FieldOffset(0)] internal readonly string Val;
    }
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode, Pack = 8, Size = 16)]
    readonly struct Fizz: IInto<string> {
        public Fizz() => (Val0, Val1) = (string.Empty, uint.MinValue);
        [FieldOffset(0)] internal readonly string Val0;
        [FieldOffset(8)] internal readonly uint Val1;
        public readonly string Into() => $"Val0: {Val0}, Val1: {Val1.ToString()}";
        readonly Result<string, Bottom> ITryInto<string, Bottom>.TryInto() => new(Into());
    }
    [StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode, Pack = 8, Size = 24)]
    readonly struct Buzz {
        // A struct's parameterless constructor must be public, so it throws instead of being hidden.
        public Buzz() => throw new InvalidOperationException("Parameterless constructor is not allowed to be called.");
        // Each private constructor zeroes the unused variants first, then sets the live one and its tag.
        Buzz(Foo foo) {
            (_bar, _fizz) = (default, default);
            (_foo, Var) = (foo, Tag.Foo);
        }
        Buzz(Bar bar) {
            (_foo, _fizz) = (default, default);
            (_bar, Var) = (bar, Tag.Bar);
        }
        Buzz(Fizz fizz) {
            (_foo, _bar) = (default, default);
            (_fizz, Var) = (fizz, Tag.Fizz);
        }
        // _bar.Val and _fizz.Val0 are both string references sharing offset 0.
        // _foo.Val shares offset 8 with _fizz.Val1; both are plain integers.
        // An object reference never overlaps non-reference data.
        [FieldOffset(0)] readonly Bar _bar;
        [FieldOffset(0)] readonly Fizz _fizz;
        [FieldOffset(8)] readonly Foo _foo;
        [FieldOffset(16)] internal readonly Tag Var;
        // The only way to build a Buzz: one static function per variant.
        internal static Buzz Foo(Foo foo) => new(foo);
        internal static Buzz Bar(Bar bar) => new(bar);
        internal static Buzz Fizz(Fizz fizz) => new(fizz);
        // Accessors throw when the requested variant does not match the tag.
        internal readonly Foo IntoFoo() => Var == Tag.Foo ? _foo : throw new InvalidOperationException($"The Buzz variant, {Var.ToString()}, is not Foo.");
        internal readonly Bar IntoBar() => Var == Tag.Bar ? _bar : throw new InvalidOperationException($"The Buzz variant, {Var.ToString()}, is not Bar.");
        internal readonly Fizz IntoFizz() => Var == Tag.Fizz ? _fizz : throw new InvalidOperationException($"The Buzz variant, {Var.ToString()}, is not Fizz.");
        internal enum Tag: ulong {
            Foo = ulong.MinValue,
            Bar = 1ul,
            Fizz = 2ul,
        }
    }
    static class Program {
        static void Main() => Console.WriteLine(Wuzz(Buzz.Fizz(new())));
        // Dispatch on the tag, then unwrap the matching variant.
        static string Wuzz(Buzz buzz) => buzz.Var switch {
            Buzz.Tag.Foo => $"Buzz(Foo({buzz.IntoFoo().Val.ToString()}))",
            Buzz.Tag.Bar => $"Buzz(Bar({buzz.IntoBar().Val}))",
            Buzz.Tag.Fizz => $"Buzz(Fizz({buzz.IntoFizz().Into()}))",
            _ => throw new InvalidOperationException($"The buzz variant, {buzz.Var.ToString()}, is not valid."),
        };
    }
}
Now that is a lot more crap we have to write than what it would take in Rust or Haskell. It also illustrates that we cannot hide all constructors since a struct always has a public parameterless constructor; worse, default(Buzz) runs no constructor at all and yields a zeroed value whose tag reads as Foo. We also throw nasty exceptions when one calls the wrong Into function (IntoFoo, IntoBar, or IntoFizz), with the expectation that one checks Var first to know which variant a Buzz instance actually is. Ignoring all of those ugly warts, we can have some semblance of a sum type with little overhead. Of course it would be better to just write the code in a different language.
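Two of those warts can be softened a little. The sketch below is one possible mitigation, not what the code above does. CountOrText and all of its members are hypothetical. It reserves the zero tag so that a default instance, which is created without running any constructor, is recognizably invalid, and it offers non-throwing TryInto accessors in the style of the standard Try pattern.

```csharp
using System;

// Hypothetical union of "a count" or "some text". The payloads are not overlapped here
// so the sketch can focus on the tag and the accessors.
readonly struct CountOrText {
    readonly ulong _count;
    readonly string _text;
    internal readonly Kind Tag;

    CountOrText(ulong count) => (_count, _text, Tag) = (count, string.Empty, Kind.Count);
    CountOrText(string text) => (_count, _text, Tag) = (0ul, text, Kind.Text);

    internal static CountOrText Count(ulong count) => new(count);
    internal static CountOrText Text(string text) => new(text);

    // Non-throwing accessors: the bool reports whether the variant matched.
    internal bool TryIntoCount(out ulong count) {
        count = _count;
        return Tag == Kind.Count;
    }
    internal bool TryIntoText(out string text) {
        text = _text ?? string.Empty; // _text is null in a zeroed instance
        return Tag == Kind.Text;
    }

    // Zero is reserved so that a zeroed instance is recognizably invalid.
    internal enum Kind : byte { Uninitialized = 0, Count = 1, Text = 2 }
}

static class Program {
    static void Main() {
        CountOrText zeroed = default; // no constructor runs for default
        Console.WriteLine(zeroed.Tag); // Uninitialized
        var text = CountOrText.Text("hi");
        Console.WriteLine(text.TryIntoCount(out _)); // False
        Console.WriteLine(text.TryIntoText(out var value) ? value : "<no text>"); // hi
    }
}
```

The caller still has to remember to check the bool or the tag, so this only trades an exception for a convention. The compiler does not enforce exhaustiveness either way.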