Ruby: Pass By Value or Pass By Reference?

Recently I wanted to build a method that takes an array as an argument and mutates the original array without needing to return a new one. I wanted to quickly find out online if Ruby treats arrays as a value or reference when you pass them into a method (see this wonderful illustration if you are unfamiliar with the concept).

Most of the answers involved long winded technical descriptions, but I just wanted a quick, practical answer, so here it is:

How it works in practice:

Everything in Ruby is an object, and when that object is passed into a method it is treated, for most practical purposes, as you would expect a pass-by-reference language to behave. In other words, if you pass an array into a method and do an operation on it that mutates its values, it will alter that array everywhere. If you perform an operation on it that returns a new object (as many array methods do), it will not alter the original. So, whether or not the original object is altered will depend on what method you call.

The following uses the “<<“ method inside a function, which mutates the value of an existing array:

def do_something_to_array(arr)
  arr << rand(1..10)
end

mutate_me = []
do_something_to_array(mutate_me)
puts mutate_me
=> [4]

do_something_to_array(mutate_me)
puts mutate_me
=> [4, 6]

do_something_to_array(mutate_me)
puts mutate_me
=> [4, 6, 3]

On the other hand, if you perform an operation that returns a new object, the original object will not be changed. Here is an example with the “+” method, which always returns a new array:

def do_something_to_array(arr)
  arr + [rand(1..10)]
end

i_wont_mutate = []
do_something_to_array(i_wont_mutate)
=> [5]

i_wont_mutate
=> []

Strings work in much the same way. Many string methods will mutate the original. If you mutate the original inside a method, it will be changed everywhere:

def do_something_to_string(str)
  str << "Append me!"
end

stringy = "I will be changed!"

do_something_to_string(stringy)
 => "I will be changed!Append me!"

stringy
 => "I will be changed!Append me!"

Now we call a method that always returns a new string, therefore the original string is not altered:

def do_something_to_string(str)
  str += "Append me!"
end

stringy = "I won't be changed!"

do_something_to_string(stringy)
 => "I won't be changed!Append me!"

stringy
=> "I won't be changed!"

Numeric values, for all practical purposes, are immutable. So they tend to act more like “pass by value” because the methods you call on a numeric value will always return a brand new object, leaving the original unaltered.

Now for a little more computer-sciencey nuance…

Is the following output what you would expect if Ruby were truly “pass by reference”?

def do_something_to_string(str)
  str << "Append me! "
  str << "Append me too! "
  str << "Append me three!"
  str = "What object am I?"
end

stringy = "What will become of me? "

do_something_to_string(stringy)
 => "What object am I?"

stringy
=> "What will become of me? Append me! Append me too! Append me three!"

What is the explanation for these confusing results? The reason this happens is because Ruby isn’t truly pass by reference. It just often acts in a similar way. At a technical level, Ruby is entirely pass-by-value. The value that gets passed into do_something_to_string is a reference to the same object that stringy points at. This is not the same as pass-by-reference. str is an entirely new object, with a reference to the same object that stringy points at. When we use the append operator (<<), we are altering that underlying object, and therefore we alter both str and stringy. However, when we use the assignment operator (=), a new string object is created and str‘s reference is updated to point to that new object.

If ruby were truly “pass by reference” altering what str points to would actually alter what stringy points to as well.

Theoretically confusing? Yes. But practically speaking it is very simple. Immutable objects like those of type Numeric will always act like “pass by value”. For other object types passed as arguments, if you call a method on that argument that mutates the object, it will mutate that object everywhere, even outside of the method. If you call a method that returns a new object, it will not affect the original object and its effects will be localized inside the method.

Ruby: Pass By Value or Pass By Reference?

Dynamic Typing Vs Static Typing Vs Weak Typing Vs Strong Typing

A common misconception is that dynamic typing (as in variable types… not as in your keyboard…) is synonymous with weak typing and that static typing is synonymous with strong typing. This is an untrue assumption. Dynamic typing is the opposite of static typing and weak typing is the opposite of strong typing and both of these sets represent entirely different concepts.

Dynamic Vs Static Typing

A statically typed language is one in which you must declare all variable types before compilation (and in a few rare cases, interpretation). Types are checked at compilation thus allowing the compiler to detect any errors with your variable types. A dynamically typed language, on the other hand, will check your variable types at run time. Dynamically typed languages are thus able to dynamically assign variable types depending on the input. This often saves programmer time and makes for simpler code.

While a statically typed language can save system resources by compiling the code ahead of time thus eliminating the need for type-checking and resource allocation at run time, there are some memory advantages to a dynamically typed language. A dynamically typed language can (in some cases) save some memory as the interpreter is able to assign only what is necessary to fit the data entered. In a statically typed language, on the other hand, the programmer must predict ahead of time the maximum amount of space that some variable input (for example, from a file or database entry) will consume and reserve that space ahead of time. If a smaller amount of data enters the program, any extra memory that has been reserved for this data is wasted. Overall, however, a program written in a statically typed language is typically more efficient than a program written in a dynamically typed language. The advantages that a dynamically typed language offers primarily have to do with making the programmer’s job faster and easier (at the expense of some extra system resources). In many circumstances in our modern world, this trade off is well worth it.

Weak Typing Vs. Strong Typing

The first thing that should be noted when talking about a weakly typed vs strongly typed language is that there is not really a rigorous technical definition for either term; therefore, using such terms should be avoided altogether. However, I include them here for the sake of contrasting the common usage of these terms with that of dynamic/static typing.

A strongly typed language is generally regarded as a language in which variables must be explicitly converted when comparing two variables of a different type. In other words 1 + “1” is an impossible operation and 1==”1″ is an impossible comparison. In general, in a strongly typed language both of these statements would return an error. In a strongly typed language variables of different types must be explicitly converted in order to perform operations and/or comparisons.

A weakly typed language, on the other hand, will usually implicitly convert variables of different types when the programmer attempts a comparison or operation on them. Usually weakly typed languages have elaborate rules (though often consistent with many other weakly typed languages) stating how conversions occur implicitly (or automatically, without the programmer needing to explicitly state that a variable is to be converted). For example, the operation 1.0 + 1 will likely return a float value in a weakly typed language, whereas it would likely return an error in a language that is considered to be strongly typed. This is because technically this statement is comparing a floating point number with an integer.

Dynamically Strong and Statically Weak

It is also a misconception that a dynamically typed language must also be weakly typed and that a statically typed language must be strongly typed. This is also untrue. Python is a good counterexample to the first scenario. Python is a dynamically typed language; however, it is also generally considered to be strongly typed. On the other hand, while both C and C++ are statically typed, they are also considered to be weakly typed; that is, there are many ways in which variable conversions occur where the programmer does not have to explicitly direct the program to make such a conversion (1.0/1, for example, will return a floating point number).

Dynamically Compiled and Statically Interpreted

Finally, I should also note that, while dynamically typed languages are generally interpreted and statically typed languages are generally compiled, this is certainly not a requirement either. Many dynamically typed languages can be compiled, for example, Python and Ruby. However, when these languages are compiled they are still not typically as efficient as their statically typed counterparts due to the extra overhead of runtime type checking.

While knowing the differences in these terms will not make or break you as a programmer, it can be useful to understand the differences so you properly understand what is going on “under the hood” with your language. It is also important to understand these definitions so you can properly understand and possibly even partake in discussions with your peers and coworkers and speak intelligently about such things to those around you.

Dynamic Typing Vs Static Typing Vs Weak Typing Vs Strong Typing