Header menu logo F# Compiler Guide

Compiling Equality

This spec covers how equality is compiled and executed by the F# compiler and library, based mainly on the types involved in the equality operation after all inlining, type specialization and other optimizations have been applied.

What do we mean by an equality operation?

This spec is about the semantics and performance of the following coding constructs

It is also about the semantics and performance of uses of the following FSharp.Core constructs which, after inlining, generate code that contains an equality check at the specific EQTYPE * HashIdentity.Structural<'T> * {Array,Seq,List}.contains * {Array,Seq,List}.countBy * {Array,Seq,List}.groupBy * {Array,Seq,List}.distinct * {Array,Seq,List}.distinctBy * {Array,Seq,List}.except

All of which have implied equality checks. Some of these operations are inlined, see below, which in turn affects the semantics and performance of the overall operation.

ER vs PER equality

In math, a (binary) relation is a way to describe a relationship between the elements of sets. "Greater than" is a relation for numbers, "Subset of" is a relation for sets.

Here we talk about 3 particular relations: 1) Reflexivity - every element is related to itself - For integers, = is reflexive (a = a is always true) and > is not (a > a is never true) 2) Symmetry - if a is related to b, then b is related to a - For integers, = is symmetric (a = b -> b = a) and > is not (if a > b then b > a is false) 3) Transitivity - if a is related to b, and b is related to c, then a is also related c - For integers, > is transitive (a > b && b > c -> a > c) and is not (a = √b && b = √c doesn't mean a = √c)

If a relation has 1, 2, and 3, we talk about Equivalence Relation (ER). If a relation only has 2 and 3, we talk about Partial Equivalence Relation (PER).

This matters in comparing floats since they include NaN. Depending on if we consider NaN = NaN true or false, we talk about ER or PER comparison respectively.

What is the type known to the compiler and library for an equality operation?

The static type known to the F# compiler is crucial to determining the performance of the operation. The runtime type of the equality check is also significant in some situations.

Here we define the relevant static type EQTYPE for the different constructs above:

Basics

Inlined constructs

These only result in naked generic equality if themselves used from a non-inlined generic context.

Non-inlined constructs always resulting in naked generic equality

These always result in naked generic equality checks.

Example 1:

let x = HashIdentity.Structural<byte>  // EQTYPE known to compiler is `byte`

Example 2 (a non-inlined "naked" generic context):

let f2<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `'T`
   // RUNTIME-EQTYPE known to the library is `byte`
   let x = HashIdentity.Structural<'T>
   ... some long code

f2<byte>() // performance of this is determined by EQTYPE<'T> and RUNTIME-EQTYPE<byte>

Example 3 (an inlined generic context):

let f3<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `byte`
   // RUNTIME-EQTYPE known to the library is `byte`
   let x = HashIdentity.Structural<'T>
   ... some long code

f3<byte>() // performance of this is determined by EQTYPE<byte> and RUNTIME-EQTYPE<byte>

Example 4 (a generic struct type in a non-inline generic context):

let f4<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `SomeStructType<'T>`
   // RUNTIME-EQTYPE known to the library is `SomeStructType<byte>`
   let x = HashIdentity.Structural<SomeStructType<'T>>
   ... some long code

f4<byte>() // performance of this determined by EQTYPE<SomeStructType<'T>> and RUNTIME-EQTYPE<SomeStructType<byte>>

How we compile equality "a = b"

This very much depends on the EQTYPE involved in the equality as known by the compiler

Aim here is to flesh these all out with: * Semantics: what semantics the user expects, and what the semantics actually is * Perf expectation: what perf the user expects * Compilation today: How we actually compile today * Perf today: What is the perf we achieve today * (Optional) sharplab.io link to how things are in whatever version is selected in sharplab * (Optional) notes

primitive integer types (int32, int64, ...)

let f (x: int) (y: int) = (x = y)

primitive floating point types (float32, float64)

let f (x: float32) (y: float32) = (x = y)

primitive string, decimal

reference tuple type (size <= 5)

reference tuple type (size > 5)

struct tuple type

C# or F# enum type

C# struct type

F# struct type (records, tuples - with compiler-generated structural equality)

array type (byte[], int[], some-struct-type[], ...)

F# large reference record/union type

Here "large" means the compiler-generated structural equality is NOT inlined.

F# tiny reference (anonymous) record or union type

Here "tiny" means the compiler-generated structural equality IS inlined.

Generic 'T in non-inlined generic code

Generic 'T in recursive position in structural comparison

This case happens in structural equality for tuple types and other structural types

Techniques available to us

  1. Flatten and inline
  2. RCG: Use reflective code generation internally in FSharp.Core
  3. KFS: Rely on known semantics of F# structural types and treat those as special
  4. TS: Hand-code type-specializations using static optimization conditions in FSharp.Core
  5. TT: Type-indexed tables of baked (poss by reflection) equality comparers and functions, where some pre-computation is done
  6. DV: De-virtualization
  7. DEQ: Use EqualityComparer<'T>.Default where possible

Notes on previous attempts to improve things

#5112

Note: this included changes to the optimizer to reduce GenericEqualityIntrinsic down to a type-indexed table lookup fetching an IEqualityComparer and calling it. These hand-coded reductions appear unnecessary as the reduction doesn't open up any further optimizations.

val x: System.Collections.Generic.IEqualityComparer<byte>
module HashIdentity from Microsoft.FSharp.Collections
val Structural<'T (requires equality)> : System.Collections.Generic.IEqualityComparer<'T> (requires equality)
Multiple items
val byte: value: 'T -> byte (requires member op_Explicit)

--------------------
type byte = System.Byte

--------------------
type byte<'Measure> = byte
val f2<'T> : unit -> obj
'T
Multiple items
val int: value: 'T -> int (requires member op_Explicit)

--------------------
type int = int32

--------------------
type int<'Measure> = int
Multiple items
val float32: value: 'T -> float32 (requires member op_Explicit)

--------------------
type float32 = System.Single

--------------------
type float32<'Measure> = float32

Type something to start searching.