When lexer and runtime disagree

Programming

Background

Back in October I began porting a few primitives, as well as a subset of std, from Rust over to C#. One of the first types I created was u128. Unlike the C# team, which made the (in my opinion terrible) decision to allow implicit conversions from integral types to floating-point types, I retained Rust’s saner approach of requiring an explicit conversion. Unsurprisingly, both Rust and C# follow the IEEE 754 default rounding mode for these conversions: round to nearest, ties to even.
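Ties-to-even only matters once an integer has more significant bits than a double’s 53-bit significand can hold. A quick sketch in Rust (whose `as` cast performs the same correctly rounded conversion) shows both halves of the tie rule:

```rust
fn main() {
    // Above 2^53, consecutive doubles are 2 apart, so odd integers fall
    // exactly halfway between two representable values.
    let down = (1u128 << 53) + 1; // halfway; rounds to the even neighbor below
    let up = (1u128 << 53) + 3; // halfway; rounds to the even neighbor above
    assert_eq!(down as f64, 9007199254740992.0); // 2^53
    assert_eq!(up as f64, 9007199254740996.0); // 2^53 + 4
}
```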

When lexer and runtime disagree

When learning a programming language, one often does not care how literal values are formatted for number-like types (e.g., double) but will simply use the easiest format that seems to work. In C#, one way to write a double literal is to append a d suffix. The code below illustrates two things: a bug in the runtime libraries’ handling of casts from ulong to double, and the fact that the lexer uses a different algorithm than the casting code.

#define TRACE
using System.Diagnostics;
namespace Bug {
    static class Program {
        static void Main() {
            Trace.Assert(9223372036854776833d == 9223372036854776833d);
            // This assertion fails: the unsuffixed literal is a ulong, and the
            // runtime's ulong-to-double cast rounds it differently than the lexer.
            Trace.Assert(9223372036854776833d == 9223372036854776833);
        }
    }
}
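For contrast, the same value in Rust, where the lexer and the as cast agree on the correctly rounded result. The value 9223372036854776833 is 2^63 + 1025; the two nearest doubles are 2^63 (distance 1025) and 2^63 + 2048 (distance 1023), so round-to-nearest picks the latter:

```rust
fn main() {
    // The lexer's parse of the float literal and the runtime cast of the
    // integer agree on the correctly rounded value, 2^63 + 2048.
    assert_eq!(9223372036854776833u64 as f64, 9223372036854777856.0);
    assert_eq!(9223372036854776833f64, 9223372036854777856.0);
}
```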

To make matters worse, the bug manifests itself at compilation time. The compiler is smart enough to fold each comparison down to a bool literal, but it does so by relying on the casting code in the runtime libraries, so the wrong answer gets baked into the binary. That fact makes it a little more excusable that I spent so long pulling my hair out, convinced that my implementation of casts from Std.Num.U128 to double was wrong. The algorithm itself is pretty basic: the binary representation of a floating-point number is essentially scientific notation, except in base 2, with a biased exponent, and with “special” values (e.g., NaN) to account for.
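To make that concrete, here is a minimal sketch in Rust of such a cast for u64 rather than u128 (the 128-bit version is analogous, just with a wider value to round). This is an illustrative rewrite, not the actual Std.Num.U128 code: take the top 53 bits as the significand, round to nearest with ties to even, and assemble the biased exponent.

```rust
// Hand-rolled u64 -> f64 conversion, rounding to nearest, ties to even.
fn u64_to_f64(x: u64) -> f64 {
    if x == 0 {
        return 0.0;
    }
    let n = 63 - x.leading_zeros(); // position of the most significant bit
    if n <= 52 {
        // The value fits in the 53-bit significand; the cast is exact.
        return x as f64;
    }
    let shift = n - 52;
    let mut mantissa = x >> shift; // top 53 bits
    let remainder = x & ((1u64 << shift) - 1); // bits shifted out
    let halfway = 1u64 << (shift - 1);
    // Round to nearest; on a tie, round to the even mantissa.
    if remainder > halfway || (remainder == halfway && mantissa & 1 == 1) {
        mantissa += 1;
    }
    // Rounding up may carry into a 54th bit; renormalize.
    let mut exp = n as u64;
    if mantissa == 1u64 << 53 {
        mantissa >>= 1;
        exp += 1;
    }
    // Assemble the bit pattern: biased exponent, implicit leading 1 dropped.
    let bits = ((exp + 1023) << 52) | (mantissa & ((1u64 << 52) - 1));
    f64::from_bits(bits)
}

fn main() {
    // Agrees with the built-in (correctly rounded) cast, including on the
    // value that trips the C# runtime.
    assert_eq!(u64_to_f64(9223372036854776833), 9223372036854776833u64 as f64);
    assert_eq!(u64_to_f64(u64::MAX), u64::MAX as f64);
    assert_eq!(u64_to_f64(12345), 12345.0);
}
```

The only subtlety beyond the shift-and-round is the carry: rounding up a significand of all ones overflows into a 54th bit, which is absorbed by bumping the exponent.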