The Tradeoffs of ByRef and ByVal

As they say, there is no free lunch. Passing by value or by reference is the sort of distinction you can mostly ignore...until one day you can't.

The Tradeoffs of ByRef and ByVal

There are two modes of passing parameters in VBA: by reference or by value.  If you don't specify a mode, VBA's default behavior is to pass parameters by reference.  For a quick primer on the differences between the two modes, check out my article, One if ByRef, Two if ByVal.*  

(* On a side note, I just changed the title of the linked article–I flipped ByRef and ByVal–to better reflect that passing by reference involves only a single copy of the variable's value in memory, while passing by value creates a second copy of the value in memory.)

The Case for Passing By Value

The Rubberduck VBA project provides several "code inspections" that you can run against your code to find and identify potential problems.  One such code inspection is ParameterCanBeByVal:

ParameterCanBeByVal

Flags parameters that are passed by reference (ByRef), but could be passed by value (ByVal).

Reasoning

Explicitly specifying a ByVal modifier on a parameter makes the intent explicit: this parameter is not meant to be assigned. In contrast, a parameter that is passed by reference (implicitly, or explicitly ByRef) makes it ambiguous from the calling code's standpoint, whether the procedure might re-assign these ByRef values and introduce a bug.

This is a good bit of defensive programming, but there's no free lunch here.  There is a memory hit to this approach.  When you pass an argument by value, VBA has to create an entire copy of the variable in memory.  If you will only ever be reading this variable, then the additional memory allocation is a waste of resources.

The Case for Passing By Reference

There is a reason the VBA language passes parameters by reference as its default behavior; it's more efficient (emphasis added):

Passing Arguments Efficiently

All arguments are passed to procedures by reference, unless you specify otherwise. This is efficient because all arguments passed by reference take the same amount of time to pass and the same amount of space (4 bytes) within a procedure regardless of the argument's data type.

You can pass an argument by value if you include the ByVal keyword in the procedure's declaration. Arguments passed by value consume from 2–16 bytes within the procedure, depending on the argument's data type. Larger data types take slightly longer to pass by value than smaller ones. Because of this, String and Variant data types generally should not be passed by value.

Passing by value can impact performance in two distinct ways:

  • Sluggish program execution
  • Excessive memory usage

Sluggish program execution

To demonstrate the potential danger of passing Strings and Variants by value, let's do some quick and dirty performance profiling.  First, we'll use the Space() function to create an (admittedly enormous) string.  Then, we'll pass that string by reference to Thing1 and by value to Thing2.

Here's the code:

Sub TimeThings()
Const NumThings = 2
Const Iter = 10
Dim t As Integer, i As Integer, s(1 To NumThings) As Long

    Dim Dummy As String
    Dummy = Space(900000000)
    For t = 1 To NumThings
        s(t) = GetTickCount
        For i = 1 To Iter
            Call Run("Thing" & t, Dummy)
        Next i
        Debug.Print "Thing "; t, GetTickCount - s(t); " ms elapsed"
    Next t
End Sub

Function Thing1(Optional ByRef Val As Variant)
End Function

Function Thing2(Optional ByVal Val As Variant)
End Function

Here's the output:

Passing a huge string by value takes much longer than passing it by reference.

Note that the two Thing functions do not do anything.  Allocating, populating, and deallocating memory is the sole source of the delay when executing Thing 2.

Excessive memory usage

I also stepped through the above code using VMMap, a SysInternals utility from Mark Russinovich.  The dummy string appears in the Heap memory type.  Here's an excerpt from the VMMap Help:

Heap

Heaps represent private memory managed by the user-mode heap manager and, like the Private memory type, is charged against the system commit limit and contains application data. Application memory allocations using the C runtime malloc library, HeapAlloc and LocalAlloc, use Heap memory.

I set breakpoints throughout the code and checked the size of the Heap at each breakpoint.  Here is a summary of the results:

  • 139,468 K: Starting heap size
  • 1,897,328 K: Size after initializing Dummy (+1,757,860 K)
  • 1,897,328 K: Size when calling by reference (No change)
  • 3,655,160 K: Size when calling by value (+1,757,832 K)
Calling by value requires allocating memory to hold our enormous string twice.

Here's a look at the VMMap at each breakpoint:

64-bit vs. 32-bit Access

I ran the above profiling code on a copy of 64-bit Access.  If I had tried running this in 32-bit VBA, it likely would have raised an error for excessive memory usage.  As newer versions of Access use more memory to run the application itself, excessive memory errors are an increasingly frequent problem.  Thus, ByVal code that worked in earlier versions of Access might begin to cause problems in newer versions.

To make matters worse, 32-bit Access is further handicapped to using a maximum of 2 GB of memory.  Microsoft has promised to make 32-bit Access Large Address Aware (LAA) at some point in the future, but there is no planned release date for that feature.  

Philipp Stiefel wrote about a technique to make 32-bit Access Large Address Aware until Microsoft provides official support for it.  Needless to say, his approach is for advanced users who know their application is bumping up against this limit.

Tradeoffs Revisited

In the title of this article, I noted that there are tradeoffs between passing arguments by value versus by reference.

Passing By Value:

  • PRO: Safer from subtle bugs*
  • PRO: More explicit since the ByVal keyword is required
  • CON: Reduced performance (with very large strings)
  • CON: Code is more verbose (since the ByVal keyword is required)
  • *CON: False sense of security could lead to different bugs (see below)

Passing By Reference:

  • PRO: Best performance
  • PRO: Code can be less verbose (since the ByRef keyword is optional)
  • CON: Prone to subtle bugs (especially among novice developers)
  • CON: Default behavior is not obvious to new developers

The downsides of passing by value are generally negligible

The reason VBA passes arguments by reference as its default behavior may have more to do with the state of computing when the language was conceived.  In those early days, memory was measured in megabytes or even kilobytes.  It was critically important to watch every byte.

Running out of memory is still a concern with 32-bit versions of Access, but you'll burn through a lot more memory adding nested subforms and subreports than you will with your run-of-the-mill routine call.  Even if you do start getting excessive memory usage errors, arbitrarily switching a bunch of parameters from ByVal to ByRef is unlikely to fix the issue.

In 99.9(999)% of cases today, the actual performance impact of passing parameters by value is negligible.  However, in certain specific edge cases it absolutely causes performance issues.  In real world usage, the most likely time you would run into such a case is if you load the contents of a text file into a single string for processing.  Thus, the main downside to passing by value can be ignored.  In those situations where performance problems do appear as a result of passing by value, it's easy enough to modify the code and switch to passing by reference instead.

The downsides of passing by reference are easily mitigated

That said, I think the main downside to passing by reference is easily mitigated.  The important thing is to simply understand the differences between passing by reference and passing by value.  

In my own code, I rarely assign values to variables passed by reference.   And when I do, I always include the optional ByRef keyword to make my intent explicit.

For all intents and purposes, I treat variables passed by reference as if they were read only.  If I need to modify the value within my routine, I assign the value to a local variable first and make any needed modifications to the value of the local variable.

* Objects are always passed by reference

In my list of CONs for passing by value, I noted that passing by value could lead to a false sense of security when it comes to passing objects.  The reason for this is that when you pass an object to a routine–as opposed to a value data type, such as Int, Date, Currency, etc.–the object is always passed by reference, whether you use the ByVal or ByRef keyword.

That's because when you pass an object instance to a routine, what you are actually passing is a pointer to where that object is held in memory.  So, if you use the ByVal keyword to pass an object instance, the thing being passed by value is the pointer and not the value of the object.  

If you modify any of the object's property values, you will be modifying the property values of the same object that exists in the calling procedure.  Creating a full copy of an object in memory is not a trivial task.  In fact, if you had the need to do such a thing, you would have to create a method of the object to perform the copy.  Even then, you might want multiple methods, such as .ShallowCopy and .DeepCopy, depending on the particulars of the class.

The seedy underbelly of passing by value

This is the reason I say passing by value can lead to its own subtle bugs.  Developers with a certain level of understanding will know that passing by value creates a copy of the argument being passed.  If those developers don't appreciate the differences between value and reference types, though, they may be lulled into a false sense of security by using the ByVal keyword when passing object references.

The following function declaration should throw up major red flags to an experienced developer:

Public Function SortCollection(ByVal Items As Collection) As Collection

The inclusion of ByVal–and the fact that this is a function that returns a Collection object–would lead me to infer the developer is intending to sort a copy of the passed collection without affecting the collection object in the calling procedure.  Unfortunately, though, that's not going to happen.

Perhaps a MisleadingByValParameter code inspection is in order?


UPDATE (4/5/2021): Changed the URL pointing to the RubberDuckVBA code inspection to use the preferred format: https://rubberduckvba.com/inspections/details/ParameterCanBeByVal

All original code samples by Mike Wolfe are licensed under CC BY 4.0