Memory Management and Garbage Collection

In this section we discuss the general concepts of memory management and garbage collection.

Memory Management

One key characteristic of our store-based interpreter so far is that the store (heap) keeps growing in size, never releasing unnecessary memory locations. This will cause us to run out of memory sooner or later. We need a mechanism by which we can release no longer needed “locations”. In other words we need a way to “free up” memory.

Freeing up memory comes in two pieces: The one is actually removing unneeded entries from the store. The other is to reuse memory locations that are no longer in use.

There are broadly speaking two kinds of approaches: manual memory management and automatic memory management, affectionately called garbage collection.

In either case, the two main tools for the task are some forms of malloc and free. malloc allocates a given amount of memory, while free deallocates the memory pointed to by a reference. The difference between the two systems is whether the programmer uses those or whether the system uses those and does not expose them to the programmer.

In programming languages like C with manual memory management the programmer is responsible for calling malloc and free.

malloc and free

Let us discuss what needs to happen for memory allocation and freeing.

In practice we actually maintain multiple lists of freed locations, each with different standard sizes, so that when a new memory location is needed we can choose one of suitable size (and if none is available we can split a larger location in two smaller ones.

An interesting question arises as to how we keep track of what memory locations are available, literally where we store that information? It turns out that we do it in the freed locations themselves! We turn them into a linked list, each location containing as its “value” a pointer to the next location. When a location is freed, its contents are replaced with a link to the first free location of that size, and the newly freed location becomes the start of the linked list. When a location is to be allocated, it is taken from the front of the list.

A simplistic implementation of all this, using our locations infrastructure, could be as follows:

(define free-list null)
(define (free loc)
  (set! free-list (cons loc free-list)))
(define (malloc)
  (if (null? free-list)
      (new-loc)
      (let ([loc (car free-list)])
        (begin (set! free-list (cdr free-list))
               loc))))

Reference counts

Manual memory allocation puts an undue burden on the programmer. A common question that programmers need to think about in C and C++ is the question of whether they are the ones responsible for freeing a particular memory location, or whether some library or framework is responsible instead. This becomes part of the documentation of that library/framework, but it is still a considerable amount of mental overhead.

One approach to help with this is reference counting. The idea is that each reference/location has a count associated with it. Each time that a new link to that location is formed, the count is incremented. Each time a link is broken, the count is decremented. If the count reaches 0, that means there are no links left to that memory location and that therefore the location can be freed.

Let us imagine for example the following OCAML code snippet:

let r1 = ref 0 in
let r2 = r1
in r2

One tricky bit that reference counting isn’t handling very well is cycles and self references. For instance consider the following code:

(let ([b (box 0)]
      [c (box b)])
  (begin (set-box! b c)  ;; boxes b and c refer to each other
         10))

Here the box b contains a link to the box c and vice versa. Now let us think of the reference count. Box b is created and contains 0 to begin with. The corresponding location has a reference count of 1. Similarly box c is created and the corresponding location has a reference count of 1. The value contained in this box c is a reference to the location stored in box b, so that location must now have a reference count of 2, as we just established a new reference to it. Lastly, we put the value of c in box b, thus incrementing the reference count of the location in box b to 2.

Then we return. At this point the b and c variables go out of scope, hence the corresponding reference counts are decremented by 1. But this still leaves the reference counts at 1, non-zero, because the two locations still have references to each other! What is worse, we no longer have access to them and have no way to influence them. So this is a cycle in memory that will never be cleared by reference counting.

The inability to recover from reference cycles is one of the main drawbacks of the reference-counting approach.

Another somewhat related approach, popular with C++, is called RAII. You can learn a lot about it, and its benefits and limitations, in many online resources.

Garbage Collection

Ideal programmers would do just fine with manual memory management. But ideal programmers, that never make mistakes, are rare. Also the mental overhead required to maintain proper memory management is unnecessary. Many systems therefore, including Java as well as most functional programming languages, have transitioned to various automated memory management systems, affectionately called “garbage collection”. We will discuss some of these systems in this section.

The general idea in garbage collection is simple:

In some instances, the garbage collector can run on a separate thread, without interfering with the normal program execution, but these cases are the exception. In most languages, normal program execution must stall while the garbage collection process is underway.

This is a main drawback of garbage-collected systems: The program execution stops at unpredictable times, and for an unpredictable amount of time. In certain domains with real-time constraints, this is simply not acceptable.

Finding non-free locations

The first difficulty with garbage collection is how to determine which memory is free and which memory is not free. We have here a similar classification as for type-checking:

As before, we have a familiar problem: It is not possible to have an automated memory management system that is guaranteed to terminate, and that is both sound and complete. And as before, soundness is much more important than completeness: An unsound system is likely to corrupt a memory location that will be used, and when that location is used then unpredictable behavior will happen. In other words the system will on occasion choose not to free up a memory location that ends up never being used, because it could not determine that it will not end up being used. A garbage collection system is therefore very conservative.

The main issue is how the system determines what memory locations may be used and what memory locations are definitely not going to be used. So we want to determine which memory locations are reachable. The main idea is the following:

The function closures show us why the general problem of determining precisely which memory locations to free is undecidable. There are locations in the function closure, and they will only be used if the function is called, and possibly not even then. We cannot determine if the function will be called at some time in the future, therefore we have to be conservative about those memory locations.

The above process though describes how we find which locations are reachable. The question still remains how precisely we will “free” the rest. There are lots of different approaches we could take, and we will discuss those in the subsequent sections.

Mark and Sweep

The most straightforward algorithm is called “mark and sweep”:

In terms of our implementation, we could imagine all memory locations carrying a flag, so they would be represented as a mutable cons cell where one entry is a boolean while the other entry is the integer that we increment. Then the mark and sweep algorithm simply uses the first entry to mark the locations that should not be removed, then frees all others.

The mark and sweep algorithm is probably the simplest one to implement. It has however one main drawback: Its runtime is proportional to the size of the memory that has been allocated, and not to the size of the reachable part of the memory. If most of the allocated memory is to be freed, then this algorithm is going to have to do a whole lot of work that some of the algorithms we will describe next can avoid. For instance if there have been a total of 2GB of memory allocated so far, and only 10MB are still reachable with the remaining 1990MB primed for freeing, the mark and sweep algorithm still has to scan the entire 2GB, rather than only the 10MB. This can be particularly bad for languages that create a lot of “intermediate” values, values that were used along the way but can then be released.Functional programming languages for instance belong to this category.

Copying algorithms

The copying algorithm aims to reduce the time it takes to release memory when only a small part of it is actually reachable. It works as follows:

This way we never have to manually go through every spot in the active part to see if it is available or not. Once we have copied all the reachable locations, everything in the active part is effectively free. When that part becomes active again (it will go through an inactive phase first) all these locations will be available.

There is a variation of this algorithm, often called “mark and compact”, that tries to do the copying in-place, essentially attempting to defragment the memory.

Generational Garbage Collection

Generational garbage collection algorithms are based on what is known as the generational hypothesis, that essentially says the following:

The most recently created objects are those most likely to become unreachable. Objects that have survived some GC cycles are likely to remain reachable.

This is especially the case for functional programming languages, where the heavy use of higher-order functions like map and filter results in numerous ephemeral results.

This suggests a garbage collection technique that is similar to the copying algorithms, but further divides the objects in 2 or more “generations”.

The idea is that if an object has been reachable for some time now, then we should expect it to be reachable in the future, so there’s no reason to keep checking. We check most frequently on objects that are more recent, for which we still don’t know if they’re “here to stay” or not. Older objects don’t need our attention. This way GC cycles can run a lot faster, as they only work on a smaller part of the memory.