Overall memory usage is a primary determinant of the usability of the F# compiler and instances of the F# compiler service.
Overly high memory usage results in poor throughput (particularly due to increased GC times) and low user interface responsiveness in tools such as Visual Studio or other editing environments. In extreme cases, it can lead to Visual Studio crashing or another IDE becoming unusable due to constant paging caused by excessive memory usage. Luckily, these extreme cases are very rare.
When you do a single compilation to produce a binary, memory usage typically doesn't matter much. It's often fine to allocate a lot of memory because it will just be reclaimed after compilation is over.
However, the F# compiler is not simply a batch process that accepts source code as input and produces an assembly as output. To serve editor and project tooling in IDEs, it must also act as a long-lived service: it answers many queries over the lifetime of an editing session and keeps large amounts of syntactic and semantic data alive in memory while doing so.

Thinking about the F# compiler in this way makes performance far more complicated than just the throughput of a batch compilation process.
In general, the F# compiler allocates a lot of memory, more than it needs to. However, most of the "easy" sources of allocation have already been squashed, and what remains is a large number of smaller allocation sites. The remaining "big" pieces allocate as a result of their current architecture, so addressing them isn't straightforward.
Some allocations matter much more than others:

* Large Object Heap (LOH) allocations (> ~80 KB) are rarely collected and should only be used for long-lived items.
* Ephemeral allocations that never escape Gen0 seem to not matter that much, though they should of course still be considered.
* Don't try to remove all allocations, and don't assume copying a large struct is better than allocating a reference type. Measure instead (see the sketch below).
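When in doubt, measure with a tool such as BenchmarkDotNet's memory diagnoser, which reports GC collections and allocated bytes per benchmark. A minimal sketch follows; the types and benchmark names are hypothetical, not taken from the compiler:

```fsharp
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

// Hypothetical 16-byte payload, once as a struct record and once as a
// reference record, to compare copying by value against heap allocation.
[<Struct>]
type SpanPos = { Line: int; Col: int; EndLine: int; EndCol: int }

type SpanPosRef = { RLine: int; RCol: int; REndLine: int; REndCol: int }

[<MemoryDiagnoser>]   // adds Gen0 collections and allocated bytes to the report
type CopyVsAlloc() =

    [<Benchmark>]
    member _.CopyStruct() =
        let mutable acc = 0
        for i in 0 .. 999 do
            // The struct is copied by value; no heap allocation occurs.
            let p = { Line = i; Col = i; EndLine = i; EndCol = i }
            acc <- acc + p.EndCol
        acc

    [<Benchmark>]
    member _.AllocRef() =
        let mutable acc = 0
        for i in 0 .. 999 do
            // The reference record is a fresh heap allocation on every iteration.
            let p = { RLine = i; RCol = i; REndLine = i; REndCol = i }
            acc <- acc + p.REndCol
        acc

[<EntryPoint>]
let main _ =
    BenchmarkRunner.Run<CopyVsAlloc>() |> ignore
    0
```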
To analyze memory usage of F# tooling, you have two primary avenues: attach a memory profiler to a live tooling process, or capture a process dump and analyze it offline.
To analyze memory usage of the F# compiler itself:
* extract the compilation arguments from the MSBuild output (or from the Output pane of Visual Studio)
* put this content in a "response file" (a text file listing compiler arguments, one per line)
* use the memory profiler tool of your choice, invoking the compiler (either `fsc.exe`, or via `dotnet path/to/fsc.dll`), giving it the argument `@name-of-response-file`, and setting the directory of the project being compiled as the working directory
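For example, here is a minimal sketch of such an invocation; the response-file name and all paths are hypothetical:

```
# args.rsp is a hypothetical response file containing one compiler argument
# per line, copied from the MSBuild log, e.g.:
#   --target:library
#   -o:obj/Debug/net8.0/MyProject.dll
#   Library.fs

cd path/to/MyProject              # the directory of the project being compiled
dotnet path/to/fsc.dll @args.rsp  # launch this command under your memory profiler
```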
Process dump files are extremely information-rich data files that can be used to see the distribution of memory usage across various types. Tools like dotMemory will show these distributions and intelligently group things to help identify the biggest areas worth improving. Additionally, they will notice things like duplicate strings and sparse arrays, which often point to easy wins, since they indicate that more memory is being used than necessary.
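As an illustration, if a dump shows many duplicate strings, one common fix is to pool them so that equal strings share a single instance. A minimal sketch, where the `dedupe` helper and its table are hypothetical rather than compiler APIs:

```fsharp
open System.Collections.Concurrent

// Hypothetical string pool: equal strings share one instance, so the
// duplicates become collectible. Unlike String.Intern, this table can
// itself be dropped once the owning component no longer needs it.
let private stringTable = ConcurrentDictionary<string, string>()

let dedupe (s: string) : string =
    // GetOrAdd returns the existing instance if an equal string was seen before.
    stringTable.GetOrAdd(s, s)
```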
When analyzing a trace, there are a few things to look out for:

* Is `LargeObject` showing up anywhere prominently? If so, that's a problem!
* Which objects show up highest on the list? Does their presence that high make sense?
* For a type such as `System.String`, which caller allocates it the most? Can that be improved?
After analyzing a trace, you should have a good idea of places that could see improvement. Oftentimes a tuple can be made into a struct tuple, or some convenient string processing can be adjusted to use a `ReadOnlySpan<'T>` or turned into a more verbose loop that avoids allocations.
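For instance, here is a sketch of both kinds of change; the function names are hypothetical, not taken from the compiler:

```fsharp
open System

// A reference tuple allocates a System.Tuple`2 on the heap per call;
// a struct tuple is returned by value with no heap allocation.
let divMod a b = a / b, a % b                 // allocates
let divModStruct a b = struct (a / b, a % b)  // does not allocate

let struct (q, r) = divModStruct 10 3

// String-based trimming allocates a new string just to read its length:
let trimmedLengthAlloc (s: string) = s.Trim().Length

// Span-based trimming inspects the same memory without allocating:
let trimmedLengthNoAlloc (s: string) = s.AsSpan().Trim().Length
```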