Compiling Equality

This spec covers how equality is compiled and executed by the F# compiler and library, based mainly on the types involved in the equality operation after all inlining, type specialization and other optimizations have been applied.

What do we mean by an equality operation?

This spec is about the semantics and performance of the following coding constructs

a = b
a <> b

It is also about the semantics and performance of uses of the following FSharp.Core constructs which, after inlining, generate code that contains an equality check at the specific EQTYPE * HashIdentity.Structural<'T> * {Array,Seq,List}.contains * {Array,Seq,List}.countBy * {Array,Seq,List}.groupBy * {Array,Seq,List}.distinct * {Array,Seq,List}.distinctBy * {Array,Seq,List}.except

All of which have implied equality checks. Some of these operations are inlined, see below, which in turn affects the semantics and performance of the overall operation.

ER vs PER equality

In math, a (binary) relation is a way to describe a relationship between the elements of sets. "Greater than" is a relation for numbers, "Subset of" is a relation for sets.

Here we talk about 3 particular relations: 1) Reflexivity - every element is related to itself - For integers, = is reflexive (a = a is always true) and > is not (a > a is never true) 2) Symmetry - if a is related to b, then b is related to a - For integers, = is symmetric (a = b -> b = a) and > is not (if a > b then b > a is false) 3) Transitivity - if a is related to b, and b is related to c, then a is also related c - For integers, > is transitive (a > b && b > c -> a > c) and √ is not (a = √b && b = √c doesn't mean a = √c)

If a relation has 1, 2, and 3, we talk about Equivalence Relation (ER). If a relation only has 2 and 3, we talk about Partial Equivalence Relation (PER).

This matters in comparing floats since they include NaN. Depending on if we consider NaN = NaN true or false, we talk about ER or PER comparison respectively.

What is the type known to the compiler and library for an equality operation?

The static type known to the F# compiler is crucial to determining the performance of the operation. The runtime type of the equality check is also significant in some situations.

Here we define the relevant static type EQTYPE for the different constructs above:

Basics

a = b: EQTYPE is the statically known type of a or b
a <> b: EQTYPE is the statically known type of a or b

Inlined constructs

HashIdentity.Structural<'T>, EQTYPE is the inlined 'T (results in specialized equality)
Array.contains<'T>, EQTYPE is the inlined 'T (results in specialized equality)
List.contains<T> likewise
Seq.contains<T> likewise

These only result in naked generic equality if themselves used from a non-inlined generic context.

Non-inlined constructs always resulting in naked generic equality

Array.groupBy<'Key, 'T> f array, EQTYPE is non-inlined 'Key, results in naked generic equality
Array.countBy array likewise for 'T
Array.distinct<'T> array likewise
Array.distinctBy array likewise
Array.except array likewise
List.groupBy likewise
List.countBy likewise
List.distinct likewise
List.distinctBy likewise
List.except likewise
Seq.groupBy likewise
Seq.countBy likewise
Seq.distinct likewise
Seq.distinctBy likewise
Seq.except likewise

These always result in naked generic equality checks.

Example 1:

let x = HashIdentity.Structural<byte>  // EQTYPE known to compiler is `byte`

Example 2 (a non-inlined "naked" generic context):

let f2<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `'T`
   // RUNTIME-EQTYPE known to the library is `byte`
   let x = HashIdentity.Structural<'T>
   ... some long code

f2<byte>() // performance of this is determined by EQTYPE<'T> and RUNTIME-EQTYPE<byte>

Example 3 (an inlined generic context):

let f3<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `byte`
   // RUNTIME-EQTYPE known to the library is `byte`
   let x = HashIdentity.Structural<'T>
   ... some long code

f3<byte>() // performance of this is determined by EQTYPE<byte> and RUNTIME-EQTYPE<byte>

Example 4 (a generic struct type in a non-inline generic context):

let f4<'T> () =
   ... some long code
   // EQTYPE known to the compiler is `SomeStructType<'T>`
   // RUNTIME-EQTYPE known to the library is `SomeStructType<byte>`
   let x = HashIdentity.Structural<SomeStructType<'T>>
   ... some long code

f4<byte>() // performance of this determined by EQTYPE<SomeStructType<'T>> and RUNTIME-EQTYPE<SomeStructType<byte>>

How we compile equality "a = b"

This very much depends on the EQTYPE involved in the equality as known by the compiler

Aim here is to flesh these all out with: * Semantics: what semantics the user expects, and what the semantics actually is * Perf expectation: what perf the user expects * Compilation today: How we actually compile today * Perf today: What is the perf we achieve today * (Optional) sharplab.io link to how things are in whatever version is selected in sharplab * (Optional) notes

primitive integer types (`int32`, `int64`, ...)

let f (x: int) (y: int) = (x = y)

Semantics: equality on primitive
Perf: User expects full performance down to native
Compilation today: compiles to IL instruction ✅
Perf today: good ✅
sharplab int32

primitive floating point types (`float32`, `float64`)

let f (x: float32) (y: float32) = (x = y)

Semantics: IEEE floating point equality (respecting NaN etc.)
Perf: User expects full performance down to native
Compilation today: compiles to IL instruction ✅
Perf today: good ✅
sharplab float32

primitive `string`, `decimal`

Semantics: .NET equivalent equality, non-localized for strings
Perf: User expects full performance down to native
Compilation today: compiles to String.Equals or Decimal.op_Equality call ✅
Perf today: good ✅
sharplab decimal
sharplab string

reference tuple type (size <= 5)

Semantics: User expects structural
Perf: User expects flattening to constituent checks
Compilation today: tuple equality is flattened to constituent checks ✅
Perf today: good ✅
sharplab (int * double * 'T), with example reductions/optimizations noted

reference tuple type (size > 5)

Semantics: User expects structural
Perf: User expects flattening to constituent checks
Compilation today: not flattened, compiled to GenericEqualityIntrinsic
Perf today: the check does type tests, does virtual calls via IStructuralEqualityComparer, boxes etc. ❌(Problem1)
sharplab for size 6

struct tuple type

Semantics: User expects structural
Perf: User expects flattening to constituent checks or at least the same optimizations as tuples
Compilation today: compiled to GenericEqualityIntrinsic
Perf today: boxes, does type tests, does virtual calls via IStructuralEqualityComparer etc. ❌(Problem2)
sharplab for size 3

C# or F# enum type

Semantics: User expects identical to equality on the underlying type
Perf: User expects same perf as flattening to underlying type
Compilation today: flattens to underlying type
Perf today: good ✅
sharplab for C# enum int
sharplab for F# enum int

C# struct type

Semantics: User expects call to IEquatable<T> if present, but F# spec says call this.Equals(box that), in practice these are the same
Perf expected: no boxing
Compilation today: EqualityComparer.Default
Perf today: good ✅
sharplab

F# struct type (records, tuples - with compiler-generated structural equality)

Semantics: User expects field-by-field structural equality
Perf expected: no boxing
Compilation today: GenericEqualityIntrinsic<SomeStructType>
Perf today: good ✅
sharplab

array type (byte[], int[], some-struct-type[], ...)

Semantics: User expects structural
Perf expected: User expects perf is sum of constituent parts
Compilation today: either FSharpEqualityComparer_PER`1<uint8[]>::get_EqualityComparer().Equals(...) or FSharpEqualityComparer_PER`1<T[]>::get_EqualityComparer().Equals(...)
Perf today: good ✅
sharplab for byte[]

F# large reference record/union type

Here "large" means the compiler-generated structural equality is NOT inlined.

Semantics: User expects structural by default
Perf expected: User expects perf is sum of constituent parts, type-specialized if generic
Compilation today: direct call to Equals(T)
Perf today: the call to Equals(T) has specialized code but boxes fields if struct or generic, see Problem3 ❌, Problem4 ❌

F# tiny reference (anonymous) record or union type

Here "tiny" means the compiler-generated structural equality IS inlined.

Semantics: User expects structural by default
Perf expected: User expects perf is sum of constituent parts, type-specialized if generic
Compilation today: FSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
Perf today: good ✅

Generic `'T` in non-inlined generic code

Semantics: User expects the PER equality semantics of whatever 'T actually is
Perf expected: User expects no boxing
Compilation today: FSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
Perf today: good ✅

Generic `'T` in recursive position in structural comparison

This case happens in structural equality for tuple types and other structural types

Semantics: User expects the PER equality semantics of whatever 'T actually is
Perf: User expects no boxing
Compilation today: FSharpEqualityComparer_ER`1<!a>::get_EqualityComparer().Equals(...)
Perf today: good ✅
Sharplab

Techniques available to us

Flatten and inline
RCG: Use reflective code generation internally in FSharp.Core
KFS: Rely on known semantics of F# structural types and treat those as special
TS: Hand-code type-specializations using static optimization conditions in FSharp.Core
TT: Type-indexed tables of baked (poss by reflection) equality comparers and functions, where some pre-computation is done
DV: De-virtualization
DEQ: Use EqualityComparer<'T>.Default where possible

Notes on previous attempts to improve things

#5112

Uses TT, DEQ, KFS, DV
Focuses on solving Problem4
99% not breaking, apart from the case of value types with custom equality implemented differently than the EqualityComparer.Default - the change would lead to the usage of the custom implementation which is reasonable

Note: this included changes to the optimizer to reduce GenericEqualityIntrinsic down to a type-indexed table lookup fetching an IEqualityComparer and calling it. These hand-coded reductions appear unnecessary as the reduction doesn't open up any further optimizations.

val x: System.Collections.Generic.IEqualityComparer<byte>

module HashIdentity from Microsoft.FSharp.Collections

val Structural<'T (requires equality)> : System.Collections.Generic.IEqualityComparer<'T> (requires equality)

Multiple items
val byte: value: 'T -> byte (requires member op_Explicit)

--------------------
type byte = System.Byte

--------------------
type byte<'Measure> = byte

val f2<'T> : unit -> obj

Multiple items
val int: value: 'T -> int (requires member op_Explicit)

--------------------
type int = int32

--------------------
type int<'Measure> = int

Multiple items
val float32: value: 'T -> float32 (requires member op_Explicit)

--------------------
type float32 = System.Single

--------------------
type float32<'Measure> = float32