JavaScript: Closures and the Call Stack

I wanted to understand more about closures — their identity. Not what they did but how they worked and in retrospect their mechanics are obvious.

[λx.x(closure)] identity of a closure (sorry lambda calculus fun)

Review of Closures

Closures, if you don’t know, are one of the true empowering features of JavaScript. In simple terms a closure encompasses a pointer to a function implementation as well as pointers to variables. Those variables are defined external to the function in question but referenced within the body / implementation of the function. Consider this code snippet as an example

// :: int -> int -> int
const incrementBy = function(n){
// :: int -> int
return function increment(x){
return n + x

and in usage we might see something like:

const main = function() {     const incrementByOne = incrementBy(1)
// => int -> int
const eleven = incrementByOne(10)
// => 11

when we invoke the function incrementBy and pass it the value 1, it effectively returns a function (labeled increment) which is stored as incrementByOne. When we invoke incrementByOne and pass it 10 it remembers that n is bound to the value 1.

How does this work — what do the compilers and interpreters do to make this happen? John C Mitchell’s, Concepts in Computer Languages, provided the basis for my understanding. To be clear I am not stating this is how JavaScript implements closures, just how it might be implemented in the language.

Scoping Models

Scoping models determine what values are bound to which variables. There are two scoping models leveraged by different languages: Dynamic and Static (Lexical). Static scoping is the more common model and is the model employed by JavaScript.

In languages with static scope (also called lexical scope), name resolution depends on the location in the source code and the lexical context, which is defined by where the named variable or function is defined. In contrast, in languages with dynamic scope the name resolution depends upon the program state when the name is encountered which is determined by the execution context or calling context. In practice, with lexical scope a variable’s definition is resolved by searching its containing block or function, then if that fails searching the outer containing block, and so on, whereas with dynamic scope the calling function is searched, then the function which called that calling function, and so on, progressing up the call stack. [2]

Dynamic scoping binds the value to the variable by the most recent parent call in the call stack. This is the execution context. Consider this c style psuedo code [4]:

// global declaration of x set to an initial value 10 
x = 10;
// Called by g()
int f() {
return x;
// g() has its own variable named as x and calls f()
int g() {
int x = 20;
return f();
main() {
// => 10 or 20? ...well we need to know the scoping model

Many would anticipate the output as the value 10 (which would be the result in a statically scoped language). However in dynamic scoping g is the predecessor of f on the call stack and in this execution context sets the value of x to 20 prior to invoking f.

A statically scoped language would output the value 10. In static (lexical) scoping the value of a variable is defined by it’s nearest occurrence in an enclosing block (curly braces) — in this case the global declaration of x to its initial value 10.

This seemingly trivial detail impacts the data structures required to implement a closure.

Activation Records (Stack Frames)

Activation Records (also called stack frames) are records that are pushed onto the call stack when a function is invoked. They contain the information necessary to execute the function, namely:

  • parameters,
  • locally defined variables,
  • pointer the result of the invocation and
  • the point in the program to return to

This image zooms into the components of the activation record for subroutine (function) C. The other subroutines share the same structure.

Example of an activation record (stack frame)

Authors and languages will differ on the order of the components and the names of the components. For example many authors call the return address the control link. And as it turns out, depending on the language, there could be the need for additional components, which is the case with languages that feature closures.

Finding the value of non-local variables

The approach to identifying the value of variables is determined by the nearest containing block. In the example where incrementBy returns a function increment (which might be viewed as a nested subprogram) which needs a way of determining the value of the variable n which is not local to that block (meaning defined outside of that block).

In statically-scoped languages that allow nested sub-programs, a variable can be declared in one sub-program and accessed in another, nested sub-program. Such variables are “non-local” in the nested sub-program that accesses them. These variables are stored in the activation record of the sub-program that declares them. When they are used as non-local variables, that activation record is found at runtime either using access links or a display [1]

One method (the access link method) of determining the value of variable not local to the defining block/function is to modify the activation record (stack frame) to have another section , the access link, which is a pointer to the containing block. In the example the access link for increment would point to the block incrementBy. During the execution of increment the value n is determined by a recursive function that might look like this:

// :: (String, ActivationRecord) -> Object 
const getValue = function(variableName, activationRecord) {
if(activationRecord.variables[variableName]) { return activationRecord.variables[variableName] ; } return getValue(
) ;
  • look locally first, if the value is there return it
  • otherwise recurse looking

This is the type of code that one might see in a compiler or interpreter not an explicit example from a particular language. Also traversing the linked-list of activation records has some overhead, particularly if there is a need to traverse multiple links [1].

A second method that may be employed is the display method. The display method trades speed for space [1] in that it builds an array (display array) with pointers to each of the activation records. Variables could be augmented to store the depth from which they are displayed (different than but similar to the concept of De Bruijn indexing which I wrote about in my Pilgrimage to Lambda Calculus). Finding the value of a variable would be as simple as leveraging the depth of the variable and indexing the display array to get the proper activation record. However:

[With the display link method] the compiler code itself may be rather complicated (and error-prone). Sub-programs usually are not very deeply nested, so the runtime considerations may not be very important. [1]

The access link and display methods drive to the same result — find the nearest activation record containing the referenced identifier. The access link method requires the runtime to walk the linked list of access links and the display link is array access, but they both result in the right value.

There you have it. While this might not make a practical impact on my use of closures in my code, the journey served as a reminder of the many decisions faced in language design and compiler and interpreter writing. I strongly recommend John Mitchell’s stuff — “Concepts” is a history of decisions made in language design and implementation. Much of which is quite accessible. Cheers!

A 25 year software industry veteran with a passion for functional programming, architecture, mentoring / team development, xp/agile and doing the right thing.