F# Compiler Guide


Debug emit

The F# compiler code base emits debug information and attributes. This article documents what we do, how it is implemented and the problem areas in our implementation.

There are mistakes and missing pieces to our debug information. Small improvements can make a major difference. Please help us fix mistakes and get things right.

The file tests\walkthroughs\DebugStepping\TheBigFileOfDebugStepping.fsx is crucial for testing the stepping experience for a range of constructs.

User experiences

Debugging information affects numerous user experiences:

Some experiences are un-implemented by F# including:

Emitted information

Emitted debug information includes:

We almost always now emit the Portable PDB format.

Design-time services

IDE tooling performs queries into the F# language service, notably:

Debugging and optimization

Nearly all optimizations are off when debug code is being generated.

Otherwise, what comes out of the type checker is pretty much what goes into IlxGen.fs.

Debug points

Terminology

We use the terms "sequence point" and "debug point" interchangeably. The word "sequence" has too many meanings in the F# compiler so in the actual code you'll see "DebugPoint" more often, though for abbreviations you may see spFoo or mFoo.

How breakpoints work (high level)

Breakpoints have two existences which must give matching behavior:

This means there is an invariant that ValidateBreakpointLocation and the emitted IL debug points correspond.

NOTE: The IL code can and does contain extra debug points that don't pass ValidateBreakpointLocation. It won't be possible to set a breakpoint for these, but they will appear in stepping.
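As a toy illustration of this invariant and of the NOTE above, the following sketch models both sides as sets of source ranges. Everything in it is hypothetical: the Range record, the range helper and the sample data are made up for illustration and are not the compiler's types or the real ValidateBreakpointLocation API.

```fsharp
// Toy model only: Range and the two sets below are hypothetical stand-ins,
// not types or data from the F# compiler.
type Range = { StartLine: int; StartCol: int; EndLine: int; EndCol: int }

let range sl sc el ec =
    { StartLine = sl; StartCol = sc; EndLine = el; EndCol = ec }

// Ranges the language service would accept as breakpoint locations.
let validated = set [ range 1 4 1 13; range 2 4 2 20 ]

// Ranges attached to IL instructions, including one extra point.
let emitted = set [ range 1 4 1 13; range 2 4 2 20; range 3 4 3 9 ]

// Invariant: every valid breakpoint location has a matching emitted debug point.
let invariantHolds = Set.isSubset validated emitted

// Extra emitted points: seen when stepping, but no breakpoint can be set on them.
let steppingOnly = Set.difference emitted validated
```

In this model the extra point on line 3 ends up in steppingOnly, which mirrors the case described in the NOTE.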

Intended debug points based on syntax

The intended debug points for constructs are determined by syntax as follows. Processing depends on whether a construct is being processed as "control-flow" or not. This means at least one debug point will be placed, either over the whole expression or some of its parts.

| Construct | Debug points |
|---|---|
| let x = leaf-expr in BODY-EXPR | Debug point over let x = leaf-expr. |
| let x = NON-LEAF-EXPR in BODY-EXPR | |
| let f x = BODY-EXPR in BODY-EXPR | |
| let rec f x = BODY-EXPR and g x = BODY-EXPR in BODY-EXPR | |
| if guard-expr then THEN-EXPR | Debug point over if guard-expr then |
| if guard-expr then THEN-EXPR else ELSE-EXPR | Debug point over if .. then |
| match .. with ... | Debug point over match .. with |
| ... -> TARGET-EXPR | |
| ... when WHEN-EXPR -> TARGET-EXPR | |
| while .. do BODY-EXPR | Debug point over while .. do |
| for .. in collection-expr do BODY-EXPR | Debug points over for, in and collection-expr |
| try TRY-EXPR with .. -> HANDLER-EXPR | Debug points over try and with |
| try TRY-EXPR finally .. -> FINALLY-EXPR | Debug points over try and finally |
| use x = leaf-expr in BODY-EXPR | Debug point over use x = leaf-expr. |
| use x = NON-LEAF-EXPR in BODY-EXPR | |
| EXPR; EXPR | |
| (fun .. -> BODY-EXPR) | Not a leaf: no debug point on the outer expression, but debug points are included on BODY-EXPR |
| { new C(args) with member ... = BODY-EXPR } | |
| Pipe EXPR1 && EXPR2 | |
| Pipe EXPR1 \|\| EXPR2 | |
| Pipe EXPR1 \|> EXPR2 | |
| Pipe (EXPR1, EXPR2) \|\|> EXPR3 | |
| Pipe (EXPR1, EXPR2, EXPR3) \|\|\|> EXPR4 | |
| yield leaf-expr | Debug point over 'yield expr' |
| yield! leaf-expr | Debug point over 'yield! expr' |
| return leaf-expr | Debug point over 'return expr' |
| return! leaf-expr | Debug point over 'return! expr' |
| [ BODY ] | See notes below. If a computed list expression with yields (explicit or implicit) then process as control-flow. Otherwise treat as leaf |
| [\| BODY \|] | See notes below. If a computed array expression with yields (explicit or implicit) then process as control-flow. Otherwise treat as leaf |
| seq { BODY } | See notes below |
| builder { BODY } | See notes below |
| f expr, new C(args), constants or other leaf | Debug point when being processed as control-flow. The sub-expressions are processed as non-control-flow. |

Intended debug points for let-bindings

Simple let bindings get debug points that extend over the let (if the thing is not a function and the implementation is a leaf expression):

let f () =
    let x = 1 // debug point for whole of `let x = 1`
    let f x = 1 // no debug point on `let f x =`, debug point on `1`
    let x = if today then 1 else tomorrow // no debug point on `let x =`, debug point on `if today then` and `1` and `tomorrow`
    let x = let y = 1 in y + y // no debug point on `let x =`, debug point on `let y = 1` and `y + y`
    ...

Intended debug points for nested control-flow

Debug points are not generally emitted for constituent parts of non-leaf constructs, in particular function applications, e.g. consider:

let h1 x = g (f x)
let h2 x = x |> f |> g

Here g (f x) gets one debug point covering the whole expression. The corresponding pipelining gets three debug points.

If however a nested expression is control-flow, then debug points start being emitted again e.g.

let h3 x = f (if today then 1 else 2)

Here the debug points are at if today then, at 1, at 2, and over all of f (if today then 1 else 2).

NOTE: these debug points are overlapping. That's life.

Intended debug points for [...], [| ... |] code

The intended debug points for computed list and array expressions are the same as for the expressions inside the constructs. For example

let x = [ for i in 1 .. 10 do yield 1 ]

This will have debug points on for i in 1 .. 10 do and yield 1.

Intended debug points for seq { .. } and task { .. } code

The intended debug points for tasks are the same as for the expressions inside the constructs. For example

let f() = task { for i in 1 .. 10 do printfn "hello" }

This will have debug points on for i in 1 .. 10 do and printfn "hello".

NOTE: there are glitches, see further below

Intended debug points for other computation expressions

Other computation expressions such as async { .. } or builder { ... } get debug points as follows:

The computations are often "cold-start" anyway, leading to a two-phase debug problem.

The "step-into" and "step-over" behaviour for computation expressions is often buggy because it is performed with respect to the de-sugaring and inlining rather than the original source. For example, a "step over" on a "while" with a non-inlined builder.While will step over the whole call, when the user expects it to step through the loop. One approach is to inline the builder.While method, and apply [<InlineIfLambda>] to the body function. This however has only limited success, as at some points inlining fails to fully flatten. Builders implemented with resumable code tend to be much better in this regard, as more complete inlining and code-flattening is applied.
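The inlining approach mentioned above can be sketched with a tiny builder. This is a minimal sketch, assuming F# 6 or later for InlineIfLambda; LoopBuilder, loop and countTo are hypothetical names invented here and are not part of FSharp.Core. How far the optimizer actually flattens the result depends on the compiler and is not guaranteed by this code.

```fsharp
// Hypothetical minimal builder, for illustration only.
type LoopBuilder() =
    member inline _.Zero() = ()
    member inline _.Delay([<InlineIfLambda>] f: unit -> unit) = f
    member inline _.Run([<InlineIfLambda>] f: unit -> unit) = f ()

    // Both the guard and the body are marked InlineIfLambda so that, when
    // lambdas are passed, the call can be reduced to a plain while loop.
    member inline _.While([<InlineIfLambda>] guard: unit -> bool,
                          [<InlineIfLambda>] body: unit -> unit) =
        while guard () do
            body ()

let loop = LoopBuilder()

let countTo (n: int) =
    // A ref cell is used because mutable locals cannot be captured by the
    // lambdas that the computation expression de-sugars to.
    let i = ref 0
    loop {
        while i.Value < n do
            i.Value <- i.Value + 1
    }
    i.Value
```

When the flattening succeeds, the loop inside countTo is an ordinary while loop in the caller, so stepping happens in user code rather than inside a call to While.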

Intended debug points for implicit constructors

e.g.

type C(args) =        
    let x = 1+1         // debug point over `let x = 1+1` as the only side effect
    let f x = x + 1
    member _.P = x + f 4

type C(args) =        
    do printfn "hello"         // debug point over `printfn "hello"` as side effect
    static do printfn "hello"         // debug point over `printfn "hello"` as side effect for static init
    let f x = x + 1
    member _.P = f 4

type C(args) =        // debug point over `(args)` since there's no other place to stop on object construction
    let f x = x + 1
    member _.P = 4

Internal implementation of debug points in the compiler

Most (but not all) debug points are noted by the parser by adding DebugPointAtTry, DebugPointAtWith, DebugPointAtFinally, DebugPointAtFor, DebugPointAtWhile, DebugPointAtBinding or DebugPointAtLeaf.

These are then used by ValidateBreakpointLocation. These same values are also propagated unchanged all the way through to IlxGen.fs for actual code generation, and used for IL emit, e.g. a simple case like this:

    match spTry with
    | DebugPointAtTry.Yes m -> CG.EmitDebugPoint cgbuf m ... 
    | DebugPointAtTry.No -> ...
    ...

For many constructs this is adequate. However, in practice the situation is far more complicated.

Internals: Debug points for [...], [| ... |]

The internal implementation of debug points for list and array expressions is conceptually simple but a little complex in practice.

Conceptually the task is easy, e.g. [ while check() do yield x + x ] is lowered to code like this:

let $collector = ListCollector<int>()
while check() do
    $collector.Add(x+x)
$collector.Close()

Note the while loop is still a while loop - no magic here - and the debug points for the while loop can also apply to the actual generated while loop.
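The equivalence between the source form and the lowered loop can be tried out with a runnable approximation. In this sketch ResizeArray stands in for the collector type shown above, and makeCheck is a hypothetical helper that returns true a fixed number of times; neither is what the compiler actually emits.

```fsharp
// Hypothetical guard: returns true `n` times, then false.
let makeCheck n =
    let remaining = ref n
    fun () ->
        remaining.Value <- remaining.Value - 1
        remaining.Value >= 0

let x = 21

// Source form: a computed list expression containing a while loop.
let viaListExpr =
    let check = makeCheck 3
    [ while check () do yield x + x ]

// Approximate lowered form: the same while loop, adding to a collector.
// ResizeArray is a stand-in for the collector used by the real lowering.
let viaLoweredLoop =
    let check = makeCheck 3
    let collector = ResizeArray<int>()
    while check () do
        collector.Add(x + x)
    List.ofSeq collector
```

Both values hold the same list, and in both forms the loop header and the loop body are distinct places where a debugger could stop.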

However, the actual implementation is more complicated because there is a TypedTree representation of the code in-between that at first seems to bear little resemblance to what comes in.

SyntaxTree --[CheckComputationExpressions.fs]--> TypedTree --> IlxGen -->[LowerComputedListOrArrayExpr.fs]--> IlxGen

The TypedTree is a functional encoding into Seq.toList, Seq.singleton and so on. How do the debug points get propagated?

This then gives accurate debug points for these constructs.

Internals: debug points for seq { .. } code

Debug points for seq { .. } compiling to state machines pose similar problems.

Internals: debug points for task { .. } code

Debug points for task { .. } pose much harder problems. We use "while" loops as an example:

Internals: debug points for other computation expressions

As mentioned above, other computation expressions such as async { .. } have significant problems with their debug points.

The main problem is stepping: even after inlining the code for computation expressions is rarely "flattened" enough, so, for example, a "step-into" is required to get into the second part of an expr1; expr2 construct (i.e. an async.Combine(..., async.Delay(fun () -> ...))) where the user expects to press "step-over".
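To make the shape of the problem concrete, the sketch below writes a small async block twice: once with the usual syntax and once as an approximate hand de-sugaring into builder calls. The first part is a for loop so that the builder's Combine is involved. The functions sugared and desugared are names invented here, and the real compiler output differs in detail.

```fsharp
// Source form: a loop followed by a second statement.
let sugared (log: ResizeArray<string>) =
    async {
        for i in 1 .. 2 do
            log.Add(sprintf "loop %d" i)
        log.Add("after loop")
    }

// Approximate hand de-sugaring of the same block (an illustration, not the
// exact compiler output). The second part lives inside its own function,
// passed to async.Delay, which is why "step over" can skip it.
let desugared (log: ResizeArray<string>) =
    let loopBody (i: int) : Async<unit> =
        log.Add(sprintf "loop %d" i)
        async.Zero()
    let rest () : Async<unit> =
        log.Add("after loop")
        async.Zero()
    async.Delay(fun () ->
        async.Combine(async.For(seq { 1 .. 2 }, loopBody), async.Delay(rest)))
```

In the de-sugared form each piece of user code is a separate function invoked by library code, so the debugger has no single flat method body to step through.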

Breakpoints tend to be less problematic.

NOTE: A systematic solution for quality debugging of computation expression code is still elusive, especially for async { ... }. Extensive use of inlining and InlineIfLambda can succeed in flattening most simple computation expression code. This is, however, not yet fully applied to async programming.

NOTE: The use of library code to implement "async" and similar computation expressions also interacts badly with "Just My Code" debugging, see https://github.com/dotnet/fsharp/issues/5539 for example.

NOTE: As mentioned, the use of many functions to implement "async" and friends interacts badly with "Step Into" and "Step Over" and related attributes, see for example https://github.com/dotnet/fsharp/issues/3359

FeeFee and F00F00 debug points (Hidden and JustMyCodeWithNoSource)

Some fragments of code use constructs that generate calls and other IL code that should not have debug points and should not participate in "Step Into", for example. These are generated in IlxGen as "FeeFee" debug points. See the Portable PDB spec.

TODO: There is also the future prospect of generating JustMyCodeWithNoSource (0xF00F00) debug points but these are not yet emitted by F#. We should check what this is and when the C# compiler emits these.

NOTE: We always make space for a debug point at the head of each method by emitting a FeeFee debug sequence point. This may be immediately replaced by a "real" debug point.

Generated code

The F# compiler generates entire IL classes and methods for constructs such as records, closures, state machines and so on. Each time code is generated we must carefully consider what attributes and debug points are generated.

Generated "augment" methods for records, unions and structs

Generated methods for equality, hash and comparison on records, unions and structs do not get debug points at all.

NOTE: Methods without debug points (or with only 0xFEEFEE debug points) are shown as "no code available" in Visual Studio - or in Just My Code they are hidden altogether - and are removed from profiling traces (in profiling, their costs are added to the cost of the calling method).

TODO: we should also consider emitting ExcludeFromCodeCoverageAttribute, being assessed at time of writing, however the absence of debug points should be sufficient to exclude these.

Generated "New", "Is", "Tag" etc. for unions

Discriminated unions generate NewXYZ, IsXYZ, Tag etc. members. These do not get debug points at all.

These methods also get CompilerGeneratedAttribute, and DebuggerNonUserCodeAttribute.

TODO: we should also consider emitting ExcludeFromCodeCoverageAttribute, being assessed at time of writing, however the absence of debug points should be sufficient to exclude these.

TODO: the NewABC methods are missing CompilerGeneratedAttribute, and DebuggerNonUserCodeAttribute. However, the absence of debug points should be sufficient to exclude these from code coverage and profiling.
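The generated union members described in this section can be observed from F# with reflection. This is a small sketch under the assumption that the union has more than one case and that the cases carry fields; Shape is a made-up example type, and the sketch only checks member names, not attributes or debug points.

```fsharp
// Made-up example union with two non-nullary cases.
type Shape =
    | Circle of radius: float
    | Square of side: float

// Names of all public members on the compiled union type.
let memberNames =
    typeof<Shape>.GetMembers()
    |> Array.map (fun m -> m.Name)
    |> Set.ofArray

// Members of the NewXYZ / IsXYZ / Tag family for this union.
let hasNewCircle = memberNames.Contains "NewCircle"
let hasIsCircle = memberNames.Contains "IsCircle"
let hasTag = memberNames.Contains "Tag"
```

None of these members appear in the F# source, which is why the debug points and attributes they receive have to be decided by the compiler.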

Generated closures for lambdas

The debug codegen involved in closures is as follows:

| Source | Construct | Debug Points | Attributes |
|---|---|---|---|
| (fun x -> ...) | Closure class | | |
| | .ctor method | none | CompilerGenerated, DebuggerNonUserCode |
| | Invoke method | from body of closure | |
| generic local defn | Closure class | | |
| | .ctor method | none | CompilerGenerated, DebuggerNonUserCode |
| | Specialize method | from body of closure | |
| Intermediate closure classes | For long curried closures fun a b c d e f -> .... | | CompilerGenerated, DebuggerNonUserCode |

Generated intermediate closure methods do not get debug points, and are labelled CompilerGenerated and DebuggerNonUserCode.

TODO: we should also consider emitting ExcludeFromCodeCoverageAttribute, being assessed at time of writing

Generated state machines for seq { .. }

Sequence expressions generate class implementations which resemble closures.

The debug points recovered for the generated state machine code for seq { ... } are covered above. The other codegen is as follows:

| Source | Construct | Debug Points | Attributes |
|---|---|---|---|
| seq { ... } | State machine class | | "Closure" |
| | .ctor method | none | none |
| | GetFreshEnumerator | none | CompilerGenerated, DebuggerNonUserCode |
| | LastGenerated | none | CompilerGenerated, DebuggerNonUserCode |
| | Close | none | none |
| | get_CheckClose | none | none |
| | GenerateNext | from desugaring | none |

NOTE: it appears from the code that extraneous debug points are not being generated, which is good, though this should be checked.

TODO: we should likely be generating CompilerGeneratedAttribute and DebuggerNonUserCodeAttribute attributes for the Close and get_CheckClose and .ctor methods.

TODO: we should also consider emitting ExcludeFromCodeCoverageAttribute, being assessed at time of writing.

Generated state machines for task { .. }

Resumable state machines used for task { .. } also generate struct implementations which resemble closures.

The debug points recovered for the generated state machine code for task { ... } are covered above. The other codegen is as follows:

| Source | Construct | Debug Points | Attributes | Notes |
|---|---|---|---|---|
| task { ... } | State machine struct | | "Closure" | |
| | .ctor method | none | none | |
| | TBD | | | |

TODO: we should be generating attributes for some of these.

TODO: we should assess that only the "MoveNext" method gets any debug points at all.

TODO: Currently stepping into a task-returning method needs a second step-into to get into the MoveNext method of the state machine. We should emit the StateMachineMethod and StateMachineHoistedLocalScopes tables into the PDB to get better debugging into task methods. See https://github.com/dotnet/fsharp/issues/12000.

Generated code for delegate constructions Func<int,int,int>(fun x y -> x + y)

A closure class is generated. Consider the code

open System
let d = Func<int,int,int>(fun x y -> x + y)

There is one debug point over all of Func<int,int,int>(fun x y -> x + y) and one over x+y.

Generated code for constant-sized array and list expressions

These are not generally problematic for debug.

Generated code for large constant arrays

These are not generally problematic for debug.

Generated code for pattern matching

The implementation is a little gnarly and complicated and has historically had glitches.

Generated code for conditionals and boolean logic

Generally straight-forward. See for example this proposed feature improvement

Capture and closures

Captured locals are available via the this pointer of the immediate closure. Un-captured locals are not available as things stand. See for example this proposed feature improvement.

Consider this code:

let F() =
    let x = 1
    let y = 2
    (fun () -> x + y)

Here x and y become closure fields of the closure class generated for the final lambda. When inspecting locals in the inner closure, the C# expression evaluator we rely on for Visual Studio takes local names like x and y and is happy to look them up via this. This means hovering over x correctly produces the value stored in this.x.

For nested closures, values are implicitly re-captured, and again the captured locals will be available.
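A hand-written approximation of the closure class may help picture why looking names up via this works. FClosure is a made-up name, and the real generated class differs in naming, base type and member visibility; the sketch only shows that the captured locals become members with the same names.

```fsharp
// The function from the text.
let F () =
    let x = 1
    let y = 2
    (fun () -> x + y)

// Hand-written approximation of the closure class (hypothetical shape).
// The captured locals become members named `x` and `y`, so an expression
// evaluator can resolve `x` as `this.x`.
type FClosure(x: int, y: int) =
    member _.x = x
    member _.y = y
    member this.Invoke() = this.x + this.y

let fromLambda = (F ()) ()
let fromHandWritten = FClosure(1, 2).Invoke()
```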

However this doesn't work with "capture" from a class-defined "let" context. Consider the following variation:

type C() =
    let x = 1
    member _.M() = 
        let y = 2
        (fun () -> x + y)

Here the implicitly captured local is y, but x is not captured, instead it is implicitly rewritten by the F# compiler to c.x where c is the captured outer "this" pointer of the invocation of M(). This means that hovering over x does not produce a value. See issue 3759.
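The difference can again be pictured with a hand-written approximation. MClosure and the extra X member are hypothetical and exist only so that this sketch compiles; the point is that the closure holds c and y, and has nothing named x for a debugger to look up.

```fsharp
type C() =
    let x = 1
    // Exposed only so the hand-written closure below can read the value.
    member _.X = x
    member _.M() =
        let y = 2
        (fun () -> x + y)

// Hand-written approximation of the closure for the lambda in M (hypothetical).
// It captures the outer object `c` and the local `y`; `x` is reached through
// `c`, so the closure itself has no member called `x`.
type MClosure(c: C, y: int) =
    member _.c = c
    member _.y = y
    member this.Invoke() = this.c.X + this.y

let fromMember =
    let f = C().M()
    f ()

let fromHandWrittenClosure = MClosure(C(), 2).Invoke()
```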

Provided code

Code provided by erasing type providers has all debugging points removed. It isn't possible to step into such code; if there are implicit debug points, they have the same range as the construct that was macro-expanded by the code erasure.

For example, a provided if/then/else expression has no debug point.

Added code generation for better debugging

We do some "extra" code gen to improve debugging. It is likely much of this could be removed if we had an expression evaluator for F#.

'this' value

For member x.Foo() = ... the implementation of the member adds a local variable x containing the this pointer from ldarg.0. This means hovering over x in the method produces the right value, as does x.Property etc.

Pipeline debugging

For pipeline debugging we emit extra locals for each stage of a pipe and debug points at each stage.

See pipeline debugging mini-spec.
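The effect is roughly as if each pipeline stage had been bound to its own local. The following sketch shows that reading; the functions f and g and the local names stage0, stage1 and stage2 are illustrative only and are not the names the compiler emits.

```fsharp
let f x = x + 1
let g x = x * 2

// Source form: a two-stage pipeline.
let piped x = x |> f |> g

// Approximate shape of the debug codegen: one local per stage, so each
// intermediate value can be inspected and each stage can carry a debug point.
let staged x =
    let stage0 = x
    let stage1 = f stage0
    let stage2 = g stage1
    stage2
```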

Shadowed locals

For shadowed locals we change the name of a local for the scope for which it is shadowed.

See shadowed locals mini-spec.
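A small example of the situation this addresses is shown below. The function shadowDemo is made up for illustration, and the sketch does not show the names the compiler chooses; it only shows two locals that share a source name in overlapping scopes.

```fsharp
let shadowDemo () =
    let x = 1          // first `x`
    let y = x + 1      // refers to the first `x`
    let x = "hello"    // second `x` shadows the first from here on
    (y, x)             // refers to the second `x`
```

Without renaming, a debugger would see two locals called x in the same method and could show the wrong one when hovering.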

Discriminated union debug display text

For discriminated union types and all implied subtypes we emit a DebuggerDisplayAttribute and a private __DebugDisplay() method that uses sprintf "%+0.8A" obj to format the object.

Missing debug emit

Missing debug emit for PDBs

Our PDB emit is missing considerable information:

These are major holes in the F# experience. Some are required for things like hot-reload.

Missing design-time services

Some design-time services are un-implemented by F#:

These are major holes in the F# experience and should be implemented.
