There are several artifacts involved in the development of F#:
FSharp.Compiler.Service (docs, source). Contains all logic for F# compilation - including parsing, syntax tree processing, typechecking, constraint solving, optimizations, IL importing, IL writing, pretty printing of F# constructs, and F# metadata format processing - and the F# compiler APIs for tooling.
The F# compiler executable, called fsc, which runs as a console app. It sets the .NET GC into batch mode and then invokes FSharp.Compiler.Service with command-line arguments.
The FSharp.Core Library, called FSharp.Core. Contains all primitive F# types and logic for how they interact, core data structures and library functions for operating on them, structured printing logic, units of measure for scientific programming, core numeric functionality, F# quotations, F# type reflection logic, and asynchronous programming types and logic.
The F# Interactive tool, called fsi. A REPL for F# that supports execution and pretty-printing of F# code and results, loading F# script files, referencing assemblies, and referencing packages from NuGet.
FSharp.Compiler.Service is by far the largest of these components and contains nearly all logic that fsc and fsi use. It is the primary subject of this guide.
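As a rough illustration of the fsc description above, the following F# sketch switches the .NET GC to batch latency mode and then forwards the command-line arguments. The function compileWithArgs is a hypothetical stand-in, not the actual FSharp.Compiler.Service entry point; GCSettings.LatencyMode and GCLatencyMode.Batch are standard .NET APIs.

```fsharp
open System.Runtime

// Hypothetical stand-in for the real compiler entry point.
// It only reports how many arguments it received and returns an exit code.
let compileWithArgs (argv: string[]) : int =
    printfn "would compile with %d argument(s)" argv.Length
    0

// Favor throughput over responsiveness for a short-lived batch process,
// then hand the command-line arguments to the (stand-in) compiler.
let runDriver (argv: string[]) : int =
    GCSettings.LatencyMode <- GCLatencyMode.Batch
    compileWithArgs argv

// Example invocation with made-up arguments.
let exitCode = runDriver [| "--nologo"; "Program.fs" |]
```

Batch mode suits a process that does a large amount of allocation and then exits, since pause times matter less than total running time.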
The following are the key data formats and internal data representations of the F# compiler code in its various configurations:
Input source files. Read as Unicode text, or binary for referenced assemblies.
Input command-line arguments. See CompilerOptions.fs for the full code implementing the arguments table. Command-line arguments are also accepted by the F# Compiler Service API in project specifications, and as optional input to F# Interactive.
Abstract Syntax Tree (AST), see SyntaxTree.fs, the untyped syntax tree resulting from parsing.
Typed Abstract Syntax Tree (Typed Tree), see TypedTree.fs, TypedTreeBasics.fs, TypedTreeOps.fs, and related files. The typed, bound syntax tree including both type/module definitions and their backing expressions, resulting from type checking and the subject of successive phases of optimization and representation change.
Type checking context/state, see for example TcState in ParseAndCheckInputs.fsi and its constituent parts, particularly TcEnv in CheckExpressions.fsi and NameResolutionEnv in NameResolution.fsi. A set of tables representing the available names, assemblies etc. in scope during type checking, plus associated information.
Abstract IL, the output of code generation, then used for binary generation, and the input format when reading .NET assemblies, see ILModuleDef in il.fsi.
The .NET Binary format (with added "pickled" F# Metadata resource), the final output of fsc.exe, see the ECMA 335 specification and the ilread.fs and ilwrite.fs binary reader/generator implementations. The added F# metadata is stored in a binary resource, see TypedTreePickle.fs.
The incrementally emitted .NET reflection assembly, the incremental output of fsi.exe. See ilreflect.fs.
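The difference between the untyped syntax tree, the Typed Tree, and the type checking environment described above can be sketched with miniature types. Everything here (UntypedExpr, TypedExpr, Ty, check, checkInt) is hypothetical and invented for illustration; the real definitions in SyntaxTree.fs and TypedTree.fs are far richer.

```fsharp
// Hypothetical miniature types, not the compiler's own definitions.
type Ty =
    | TInt
    | TBool

// Untyped tree: the shape a parser produces, with no type information.
type UntypedExpr =
    | Const of int
    | Ident of string
    | Add of UntypedExpr * UntypedExpr
    | Less of UntypedExpr * UntypedExpr

// Typed tree: names are resolved and every node has a known type.
type TypedExpr =
    | TConst of int
    | TVal of string * Ty
    | TAdd of TypedExpr * TypedExpr
    | TLess of TypedExpr * TypedExpr

let typeOf (e: TypedExpr) : Ty =
    match e with
    | TConst _ -> TInt
    | TVal (_, ty) -> ty
    | TAdd _ -> TInt
    | TLess _ -> TBool

// The environment maps each name in scope to its type.
let rec check (env: Map<string, Ty>) (e: UntypedExpr) : TypedExpr =
    match e with
    | Const n -> TConst n
    | Ident name ->
        match Map.tryFind name env with
        | Some ty -> TVal (name, ty)
        | None -> failwithf "The value '%s' is not defined" name
    | Add (a, b) -> TAdd (checkInt env a, checkInt env b)
    | Less (a, b) -> TLess (checkInt env a, checkInt env b)

// Check an operand and require it to be an int.
and checkInt (env: Map<string, Ty>) (e: UntypedExpr) : TypedExpr =
    let typed = check env e
    if typeOf typed = TInt then typed
    else failwithf "Expected an int operand but got %A" (typeOf typed)

// Check "x + 1 < 10" in an environment where x is an int.
let env = Map.ofList [ "x", TInt ]
let typed = check env (Less (Add (Ident "x", Const 1), Const 10))
```

The untyped tree only records what was written, while the typed tree records what each name refers to and what type each node has, which is what later phases need.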
The following are the most relevant parts of the F# compiler tooling, making up the "engine" and API surface area of
The F# Compiler Service Caches, the various caches maintained by an instance of an
The following is a diagram of how the different phases of the F# compiler work:
The following are the key phases and high-level logical operations of the F# compiler code in its various configurations:
Basic lexing. Produces a token stream from input source file text. F# uses the FsLex tool to process a declarative specification of the tokenizer in lex.fsl. This compiles the tokenizer specification to a number of tables which are then interpreted by the code in prim-lexing.fs (see also prim-lexing.fsi).
White-space sensitive lexing. Accepts and produces a token stream, augmenting per the F# Language Specification.
Parsing. Accepts a token stream and produces an AST per the grammar in the F# Language Specification. F# uses the FsYacc tool to process a declarative specification of the parser in pars.fsy. This compiles the grammar to a number of tables which are then interpreted by the code in prim-parsing.fs (see also prim-parsing.fsi).
Resolving references. For the .NET SDK, references are generally resolved explicitly by external tooling. There is a legacy aspect to this if references use old .NET Framework references, including for scripting. See ReferenceResolver.fs for the abstract definition of compiler reference resolution. See LegacyMSBuildReferenceResolver.fs for reference resolution used by the .NET Framework F# compiler when running on .NET Framework. See SimulatedMSBuildReferenceResolver.fs when not using the .NET Framework F# compiler.
See DependencyManager for reference resolution and package management used in
Importing referenced .NET binaries, see import.fsi/import.fs. Accepts file references and produces a Typed Tree node for each referenced assembly, including information about its type definitions (and type forwarders if any).
Importing referenced F# binaries and optimization information as Typed Tree data structures, see TypedTreePickle.fs. Accepts binary data and produces Typed Tree nodes for each referenced assembly, including information about its type/module/function/member definitions.
Sequentially type checking files, see CheckDeclarations.fsi/CheckDeclarations.fs. Accepts an AST plus a type checking context/state and produces new Typed Tree nodes incorporated into an updated type checking state, plus additional Typed Tree Expression nodes used during code generation. A key part of this is checking syntactic types and expressions, see CheckExpressions.fsi/CheckExpressions.fs including the state held across the checking of a file (see TcFileState) and the environment active as we traverse declarations and expressions (see
Pattern match compilation, see PatternMatchCompilation.fsi/PatternMatchCompilation.fs. Accepts a subset of checked Typed Tree nodes representing F# pattern matching and produces Typed Tree expressions implementing the pattern matching. Called during type checking as each construct involving pattern matching is processed.
Constraint solving, see ConstraintSolver.fsi/ConstraintSolver.fs. A constraint solver state is maintained during type checking of a single file, and constraints are progressively asserted (i.e. added to this state). Fresh inference variables are generated and variables are eliminated (solved). Variables are also generalized at various language constructs, or explicitly declared, making them "rigid". Called during type checking as each construct is processed.
Post-inference type checks, see PostInferenceChecks.fsi/PostInferenceChecks.fs. Called at the end of type checking/inference for each file. A range of checks that can only be enforced after type checking on a file is complete, such as analysis when using byref<'T> or other
Quotation translation, see QuotationTranslator.fsi/QuotationTranslator.fs/QuotationPickler.fsi/QuotationPickler.fs. Generates the stored information for F# quotation nodes, generated from the Typed Tree expression structures of the F# compiler. Quotations are ultimately stored as binary data plus some added type references. "ReflectedDefinition" quotations are collected and stored in a single blob.
Optimization phases, primarily the "Optimize" (peephole/inlining) and "Top Level Representation" (lambda lifting) phases, see Optimizer.fsi/Optimizer.fs, InnerLambdasToTopLevelFuncs.fsi/InnerLambdasToTopLevelFuncs.fs, and LowerCalls.fs. Each of these takes Typed Tree nodes for types and expressions and either modifies the nodes in place or produces new Typed Tree nodes. These phases are orchestrated in CompilerOptions.fs.
Reflection-Emit, see ilreflect.fs.
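The white-space sensitive lexing step in the list above accepts a token stream and produces an augmented one. The sketch below is a heavily simplified, hand-written illustration of that idea: it inserts explicit block markers wherever indentation increases or decreases. The Token type, the BlockBegin and BlockEnd markers, and the augment function are hypothetical names; the actual rules are those of the F# Language Specification and are considerably more involved.

```fsharp
type Token =
    | Line of string   // stands in for the ordinary tokens of one source line
    | BlockBegin       // hypothetical inserted marker: a more-indented block starts
    | BlockEnd         // hypothetical inserted marker: an open block ends

// Number of leading spaces on a line.
let indentOf (s: string) : int = s.Length - s.TrimStart(' ').Length

let augment (lines: string list) : Token list =
    // 'stack' holds the indentation columns of the currently open blocks;
    // the bottom entry (column 0) represents the file itself.
    let rec loop (stack: int list) (remaining: string list) (acc: Token list) =
        match remaining with
        | [] ->
            // At end of input, close every block that is still open.
            let closers = List.replicate (List.length stack - 1) BlockEnd
            List.rev acc @ closers
        | line :: rest ->
            let col = indentOf line
            match stack with
            | top :: _ when col > top ->
                // Deeper indentation opens a new block.
                loop (col :: stack) rest (Line (line.Trim()) :: BlockBegin :: acc)
            | _ ->
                // Shallower indentation closes blocks until the column fits.
                let rec pop st ac =
                    match st with
                    | top :: (_ :: _ as below) when col < top -> pop below (BlockEnd :: ac)
                    | _ -> st, ac
                let stack', acc' = pop stack acc
                loop stack' rest (Line (line.Trim()) :: acc')
    loop [ 0 ] lines []

let source =
    [ "let f x ="
      "    let y = x + 1"
      "    y * 2"
      "let z = f 3" ]

let tokens = augment source
```

Making block structure explicit in the token stream lets the parser that follows work from an ordinary grammar without tracking columns itself.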
These phases and transformations are used to build the following:
The F# Interactive Shell, see fsi.fs.
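The constraint solving phase listed among the phases above asserts constraints progressively and solves inference variables as it goes. A toy version of that idea is first-order unification over a tiny type language, sketched below. The names Ty, Solution, resolve, occurs, and unify are hypothetical; this is not the algorithm or the data structures used in ConstraintSolver.fs, and it omits generalization and rigid variables entirely.

```fsharp
type Ty =
    | TInt
    | TBool
    | TFun of Ty * Ty
    | TVar of int      // an inference variable, identified by a number

// Solved variables: variable id -> the type it was solved to.
type Solution = Map<int, Ty>

// Replace solved variables by their solutions, recursively.
let rec resolve (sol: Solution) (ty: Ty) : Ty =
    match ty with
    | TVar v ->
        match Map.tryFind v sol with
        | Some t -> resolve sol t
        | None -> ty
    | TFun (a, b) -> TFun (resolve sol a, resolve sol b)
    | _ -> ty

// Does variable v occur inside ty? Used to reject infinite types.
let rec occurs (v: int) (ty: Ty) : bool =
    match ty with
    | TVar w -> v = w
    | TFun (a, b) -> occurs v a || occurs v b
    | _ -> false

// Assert the constraint "a = b": extend the solution, or fail.
let rec unify (sol: Solution) (a: Ty) (b: Ty) : Solution =
    match resolve sol a, resolve sol b with
    | TVar v, TVar w when v = w -> sol
    | TVar v, t
    | t, TVar v ->
        if occurs v t then failwith "occurs check failed: infinite type"
        else Map.add v t sol
    | TFun (a1, r1), TFun (a2, r2) ->
        let sol1 = unify sol a1 a2
        unify sol1 r1 r2
    | x, y when x = y -> sol
    | x, y -> failwithf "type mismatch: %A vs %A" x y

// Solve ('a -> 'b) = (int -> 'a), where 'a is TVar 0 and 'b is TVar 1.
let solution = unify Map.empty (TFun (TVar 0, TVar 1)) (TFun (TInt, TVar 0))
let solvedB = resolve solution (TVar 1)
```

Each call to unify corresponds to asserting one constraint against the current solver state, and each Map.add corresponds to eliminating (solving) one variable.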
The F# compiler is bootstrapped. That is, an existing F# compiler is used to build a "proto" compiler from the current source code. That "proto" compiler is then used to compile itself, producing a "final" compiler. This ensures the final compiler is compiled with all relevant optimizations and fixes.
Microsoft.FSharp.targets give MSBuild support for F# projects (.fsproj) and contain the targets. Although not strictly part of the F# compiler, they are essential for using F# in all contexts for .NET, aside from some more targeted scripting scenarios. The targets expose things like the Fsc tasks called by MSBuild.