Calling C# from Ruby with NativeAOT and FFI

Martijn Storck

February 12, 2026

Some things in Ruby are slow. Not the kind of slow where you shrug and move on, but the kind where you start thinking about native extensions. The usual answer is to drop down to C, but that’s a big commitment. You trade Ruby’s safety and productivity for manual memory management, a foreign build system and an API that breaks between Ruby versions. It works, but it’s not fun.

There are several languages you could reach for here — Rust and Zig are popular choices, and for good reason. I picked C# because I know it well and I’ve always been impressed by the .NET standard library. It has decades of optimization behind it and covers everything from Unicode normalization to cryptography out of the box. I wanted to see how far that standard library alone could take me.

.NET’s NativeAOT compiles C# ahead of time into a standalone native binary: a .so on Linux, a .dylib on macOS. No separately installed .NET runtime is needed at load time and there is no JIT warmup, just a plain shared library that exports C functions. Ruby’s FFI gem can load any native shared library and call its functions, so I decided to see what happens when you combine the two.

The idea

Take an operation that’s slow in pure Ruby, reimplement it in C# leaning on the .NET standard library, compile it to a native library with NativeAOT, and call it from Ruby via FFI. You get memory safety by default and cross-platform builds from a single codebase. The only unsafe code is a few lines of pointer marshalling at the FFI boundary.

For the benchmark I picked String#parameterize from ActiveSupport — the method that turns "Crème Brûlée!" into "creme-brulee". It’s a common Rails operation (every app that generates URL slugs uses it) and it’s pure Ruby under the hood: I18n transliteration tables, multiple regex substitutions, repeated string allocations. No C extension backs it.

The C# implementation

The C# equivalent is about 20 lines and uses only standard library methods:

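using System.Globalization;
using System.Text;

// Lives in the same class as the exported entry points that call it.
private static string ParameterizeCore(string input, char separator = '-')
{
    // Split accented characters into base character + combining marks.
    string decomposed = input.Normalize(NormalizationForm.FormD);

    var sb = new StringBuilder(decomposed.Length);
    // Starts as true so that leading non-alphanumerics do not emit a separator.
    bool lastWasSeparator = true;

    foreach (char c in decomposed)
    {
        // Drop the accents that the decomposition split off.
        if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
            continue;

        if (char.IsAsciiLetterOrDigit(c))
        {
            sb.Append(char.ToLowerInvariant(c));
            lastWasSeparator = false;
        }
        else if (!lastWasSeparator)
        {
            // Collapse any run of other characters into a single separator.
            sb.Append(separator);
            lastWasSeparator = true;
        }
    }

    // Trim a trailing separator, if any.
    if (sb.Length > 0 && sb[sb.Length - 1] == separator)
        sb.Length--;

    return sb.ToString();
}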

The transliteration trick is string.Normalize(NormalizationForm.FormD). This decomposes Unicode characters into a base character plus combining marks: é becomes e followed by a combining acute accent. The loop then skips the non-spacing marks (Unicode category Mn, which covers accents like these) and keeps the base characters. Any run of characters that are not ASCII letters or digits collapses into a single separator. The rest is char.IsAsciiLetterOrDigit(), char.ToLowerInvariant(), and StringBuilder. No regex, no lookup tables, no NuGet packages. The .NET standard library does the heavy lifting.
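For readers more at home in Ruby, the same decompose-and-filter idea can be sketched with the standard library's String#unicode_normalize. This is only an illustration of the technique, not ActiveSupport's implementation and not the code that was benchmarked; the method name parameterize_sketch is made up for this example.

```ruby
# Hypothetical pure-Ruby mirror of ParameterizeCore, for illustration only.
def parameterize_sketch(input, separator = "-")
  # NFD splits "é" into "e" followed by U+0301 (combining acute accent).
  decomposed = input.unicode_normalize(:nfd)

  out = +""
  last_was_separator = true

  decomposed.each_char do |c|
    next if c.match?(/\p{Mn}/) # skip non-spacing marks (the accents)

    if c.match?(/\A[A-Za-z0-9]\z/)
      out << c.downcase
      last_was_separator = false
    elsif !last_was_separator
      out << separator
      last_was_separator = true
    end
  end

  out.chomp!(separator) # drop a trailing separator, if present
  out
end

p "\u00E9".unicode_normalize(:nfd).codepoints # => [101, 769]
puts parameterize_sketch("Cr\u00E8me Br\u00FBl\u00E9e!") # => creme-brulee
```

The codepoints line shows the decomposition directly: 101 is the letter e and 769 (U+0301) is the combining acute accent that the loop filters out.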

The FFI boundary

Exporting a C# method as a native function requires the [UnmanagedCallersOnly] attribute. It marks the method as callable from native code only, and its EntryPoint property sets the symbol name that the NativeAOT compiler exports from the shared library:

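using System.Runtime.InteropServices;

// Exported as the C symbol "parameterize": takes and returns a null-terminated UTF-8 string.
[UnmanagedCallersOnly(EntryPoint = "parameterize")]
public static IntPtr Parameterize(IntPtr inputPtr)
{
    // Copy the caller's bytes into a managed string; a null pointer becomes "".
    string input = Marshal.PtrToStringUTF8(inputPtr) ?? string.Empty;
    string result = ParameterizeCore(input);
    // Allocate unmanaged memory for the result; the caller must pass it to free_string.
    return Marshal.StringToCoTaskMemUTF8(result);
}

// Exported as "free_string": releases a pointer previously returned by parameterize.
[UnmanagedCallersOnly(EntryPoint = "free_string")]
public static void FreeString(IntPtr ptr)
{
    Marshal.FreeCoTaskMem(ptr);
}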

Marshal.PtrToStringUTF8 reads a null-terminated UTF-8 string from a pointer into a managed C# string. Marshal.StringToCoTaskMemUTF8 does the reverse: it allocates unmanaged memory, copies the string into it, and returns a pointer. That memory is not garbage collected, so the caller has to free it explicitly — hence the free_string export. This is the one part where you have to think like a C programmer, but it’s a few lines of boilerplate, not an entire codebase.
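To make the wire format concrete, here is a small pure-Ruby sketch of what travels across the boundary: UTF-8 bytes followed by a single zero byte. It uses only core String methods rather than FFI, and the helper names to_c_string and from_c_string are invented for this illustration.

```ruby
# Build the byte sequence a C function expects: UTF-8 bytes plus a trailing NUL.
def to_c_string(string)
  string.encode("UTF-8").b + "\0".b
end

# Read bytes up to the first NUL and label them as UTF-8, similar in spirit to
# calling read_string on an FFI pointer and then force_encoding("UTF-8").
def from_c_string(buffer)
  buffer.unpack1("Z*").force_encoding("UTF-8")
end

buffer = to_c_string("cr\u00E8me")
puts buffer.bytesize                   # six UTF-8 bytes plus the terminator
puts from_c_string(buffer + "junk".b)  # bytes after the NUL are ignored
```

A consequence of this format is that a string with an embedded NUL byte would be cut short at that point when the receiving side reads it.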

On the Ruby side, FFI loads the compiled shared library and attaches the exported functions:

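require "ffi"

module NativeLib
  extend FFI::Library
  # Path to the published library; use the .so or .dll file on Linux or Windows.
  ffi_lib File.expand_path("../build/NativeLib.dylib", __dir__)

  attach_function :parameterize, [:pointer], :pointer
  attach_function :free_string,  [:pointer], :void

  def self.slug(string)
    # Copy the Ruby string into a null-terminated native buffer.
    input_ptr  = FFI::MemoryPointer.from_string(string)
    result_ptr = parameterize(input_ptr)
    begin
      # Copy the bytes into a Ruby string before the native buffer is released.
      result_ptr.read_string.force_encoding("UTF-8")
    ensure
      # The result was allocated by .NET, so .NET has to free it, even on error.
      free_string(result_ptr)
    end
  end
end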

That’s it. NativeLib.slug("Crème Brûlée!") returns "creme-brulee", identical to ActiveSupport’s "Crème Brûlée!".parameterize.

Building

The .csproj is minimal. NativeAOT is a first-class feature in .NET 10 and only needs PublishAot to be set:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <PublishAot>true</PublishAot>
  </PropertyGroup>
</Project>

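Then dotnet publish -c Release -r osx-arm64 produces NativeLib.dylib. Change the RID to linux-x64 for a Linux .so, or win-x64 for a Windows .dll. No additional dependencies. One caveat: NativeAOT does not cross-compile between operating systems, so each target has to be published on a machine or container running that OS.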
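Since the library's file name differs per platform, the Ruby side needs to pick the matching one. A possible way to do that with RbConfig from the standard library is sketched below; the helper name native_lib_filename is hypothetical, and the file names simply follow the NativeLib.dylib, .so and .dll pattern described above.

```ruby
require "rbconfig"

# Hypothetical helper: map a host OS string to the published library's file name.
def native_lib_filename(host_os = RbConfig::CONFIG["host_os"])
  case host_os
  when /darwin/             then "NativeLib.dylib"
  when /mswin|mingw|cygwin/ then "NativeLib.dll"
  else                           "NativeLib.so" # assumption: Linux or another ELF platform
  end
end

puts native_lib_filename
```

The result could then be passed to File.expand_path in place of a hard-coded file name.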

The project also comes with a Dockerfile that uses a multi-stage build: the first stage compiles the native library using Microsoft’s dotnet/sdk:10.0-aot image, and the second stage copies just the .so into a plain Ruby image. The .NET SDK doesn’t end up in the final image.

The benchmark

I tested with 200 realistic product titles containing Unicode characters (French, German, Spanish accents, trademark symbols, em-dashes) processed per iteration. The results on an Apple M-series chip with .NET 10 and Ruby 4.0:

Comparison (Ruby as baseline):
  ActiveSupport parameterize                    631.5 i/s
  C# NativeAOT FFI (batch)                     7723.2 i/s - 12.23x faster
  C# NativeAOT FFI (per-string)                4580.9 i/s - 7.25x faster

The batch variant sends all 200 strings to C# in a single FFI call — one boundary crossing, pure processing speed. The per-string variant calls across the FFI boundary once per string, which is the realistic usage pattern and is still over 7x faster. The benchmark also verifies that the C# output matches ActiveSupport’s output for every test string.
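Because every iteration processes 200 strings, the iterations-per-second figures can be converted into a cost per string. The short Ruby calculation below does that with the numbers reported above. The names are mine, and reading the difference between the two FFI variants as per-call overhead is a rough estimate rather than something the benchmark measures directly.

```ruby
STRINGS_PER_ITERATION = 200

# Iterations per second, copied from the benchmark output above.
RESULTS = {
  active_support: 631.5,
  ffi_batch:      7723.2,
  ffi_per_string: 4580.9
}.freeze

# Microseconds spent on one string for a given iterations-per-second figure.
def microseconds_per_string(ips)
  1_000_000.0 / (ips * STRINGS_PER_ITERATION)
end

RESULTS.each do |name, ips|
  speedup = ips / RESULTS[:active_support]
  puts format("%-15s %5.2f us/string  %5.2fx", name, microseconds_per_string(ips), speedup)
end

# Rough per-call cost of crossing the boundary: per-string time minus batch time.
overhead = microseconds_per_string(RESULTS[:ffi_per_string]) -
           microseconds_per_string(RESULTS[:ffi_batch])
puts format("approx. per-call overhead: %.2f us", overhead)
```

This works out to roughly 7.9 µs per string for ActiveSupport, 1.1 µs for per-string FFI calls and 0.65 µs in the batch, which puts each boundary crossing on the order of 0.4 µs on this machine.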

Tradeoffs

This isn’t free. NativeAOT links the .NET runtime statically, so even this trivial library produces a ~1MB binary. Every FFI call has overhead: Ruby allocates a pointer, copies the string, crosses the boundary, the native code runs, and Ruby reads the result back. For a single short string that overhead might not be worth it. For batch processing it fades into the background.
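One way to amortize that per-call overhead is to frame many strings into a single payload, which is what the batch variant of the benchmark does in spirit. How the project actually packs its batch is not shown here, so the sketch below is an assumption: it joins inputs with newlines (assuming no input contains one) and uses a plain Ruby lambda, STAND_IN_NATIVE, in place of the native function so that it runs without the compiled library. The stand-in only approximates the C# logic.

```ruby
# Stand-in for a native batch function (hypothetical): takes newline-separated
# input and returns newline-separated slugs. A real one would live in the C# library.
STAND_IN_NATIVE = lambda do |payload|
  payload.split("\n", -1).map do |s|
    s.unicode_normalize(:nfd)
     .gsub(/\p{Mn}/, "")       # drop combining accents
     .downcase
     .gsub(/[^a-z0-9]+/, "-")  # collapse everything else into one separator
     .gsub(/\A-|-\z/, "")      # trim separators at both ends
  end.join("\n")
end

# Send all strings in one call instead of one call per string.
def slug_batch(strings, native_batch)
  return [] if strings.empty?

  slugs = native_batch.call(strings.join("\n")).split("\n", -1)
  # A lone empty slug splits to [], so pad back up to the input count.
  slugs << "" while slugs.length < strings.length
  slugs
end

p slug_batch(["Cr\u00E8me Br\u00FBl\u00E9e!", "\u00DCber Caf\u00E9"], STAND_IN_NATIVE)
# => ["creme-brulee", "uber-cafe"]
```

The framing cost (one join and one split) is paid once per batch, while the pointer allocation and boundary crossing no longer scale with the number of strings.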

NativeAOT is also not the fastest .NET. The regular JIT runtime can do dynamic profile-guided optimization and tiered recompilation of hot paths, things an ahead-of-time compiler can’t. But for a native library export, that doesn’t matter. We’re not running a long-lived server where the JIT can warm up. We need a standalone binary with a C ABI, and NativeAOT delivers that. Even without JIT tricks, the code runs about 12x faster than the Ruby implementation in the batch benchmark, and about 7x faster when called once per string.

And of course there are now two languages in your stack. The FFI boundary is a clean seam, but it’s still a seam.

Why C#?

C# is not unique here: you could do this with Rust, Zig, Go, or plain C. Any language that compiles to a shared library with a C ABI works with Ruby FFI. I reached for C# because I know it and I like the .NET standard library for its features and performance. The entire parameterize implementation is five .NET API calls covering Unicode normalization, character classification, case conversion, and a string builder. No regex, no third-party dependencies. The FFI-specific unmanaged glue code is a few lines of admittedly slightly obscure C# that marshal the parameters and return values between Ruby and C#.

Try it

The full source code is on GitHub at github.com/martijn/cs-ruby-ffi. You can run the benchmark in Docker without installing anything:

$ docker build -t cs-ruby-ffi .
$ docker run --rm cs-ruby-ffi

This post was co-authored with Claude Code, which also built the project and benchmark with me.