This article explains what data types are, what the terms “static”, “dynamic”, “strong” and “weak” mean when we talk about data types, and why we should even care.
What Are Data Types Anyway?
If you’ve done any programming, you’ve surely seen variables, parameters and values returned from functions. They’re everywhere in the programming world. Many programmers begin using them without really knowing what they’re telling the computer under the hood. When defining one of these values, a programmer specifies not only what it will be named but also what type of data it is. Is it an integer? Is it a string of characters? Is it a single character or a complex type like a Point? To understand data types, we may find it easier to flip the term around and think of it as “the type of data” we’re dealing with.
If you’ve sifted through information on the Web, you might have read conflicting information about “static” vs “dynamic” data types, as well as “strong” vs “weak” data types. They’re not the same thing. As we go through the different terms below, please keep in mind that a language can include a combination of static/dynamic and strong/weak data types. They’re not exclusive. For instance, a language can be static and strong or static and weak. But before we go too far into defining these terms, why should we even care?
Why Should We Care About Data Types?
Every programming language has a data type system. Without a type system, computers wouldn’t know how to represent the data in our programs. They wouldn’t know how to take this type of data and add it to this other type of data or even how to store the data. By specifying a variable as an integer, the computer knows how many bytes it needs to represent the value and, more importantly, how it can perform operations with it. Adding two integers together is a different operation from adding together two strings. The only way the computer knows how to handle the data is by knowing the types of data it’s dealing with.
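As a minimal sketch of that last point, the Java snippet below applies the same + operator to two integers and to two strings. The class name PlusDemo and its method names are made up for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class PlusDemo {
    // With two ints, + performs arithmetic addition.
    static int addNumbers(int a, int b) {
        return a + b;
    }

    // With two Strings, + performs concatenation.
    static String addStrings(String a, String b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(addNumbers(2, 3));     // prints 5
        System.out.println(addStrings("2", "3")); // prints 23
    }
}
```

The operator is written the same way in both methods; only the declared types tell the compiler which operation to perform.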
You can begin programming without knowing much about the underlying type systems at play. That’s part of the beauty of these higher-level languages. But understanding the types of data you’re dealing with, and how best to represent the data you’re working with, brings huge advantages, such as those listed below.
You’ll have more efficient programs that make better use of memory and storage. Knowing that an array of integers typically occupies less space than an array of doubles can lead to kilobytes or even megabytes of savings when dealing with very large arrays in memory or on the hard disk.
It will help you decipher debugging messages and gain a better understanding of problems related to data types. When the program complains it can’t add two numbers because one of the numbers is a string, you understand why. Learning this fact will help you avoid making the mistake of trying to add a number to a string to begin with.
Once you know how types work, you can be like Neo from the movie The Matrix and know how to bend the rules. You’ll know when adding an integer to a character how you can get another character and why that works.
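The first and last of these points can be sketched in Java, where an int occupies 4 bytes and a double 8. The class TypeSizeDemo and its helper methods are hypothetical names used only for this example, and the byte counts cover the array elements alone, not the JVM’s per-object overhead.

```java
// Hypothetical demo class; the names are illustrative only.
class TypeSizeDemo {
    // Bytes used by the elements of an int[] (ignores the JVM object header).
    static long intArrayBytes(int length) {
        return (long) length * Integer.BYTES;
    }

    // Bytes used by the elements of a double[] (ignores the JVM object header).
    static long doubleArrayBytes(int length) {
        return (long) length * Double.BYTES;
    }

    // A char is a 16-bit number under the hood, so adding an int to it
    // and casting back yields another character.
    static char shift(char c, int offset) {
        return (char) (c + offset);
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(intArrayBytes(n));    // prints 4000000
        System.out.println(doubleArrayBytes(n)); // prints 8000000
        System.out.println(shift('a', 1));       // prints b
    }
}
```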
Static vs Dynamic Data Typing Systems
So, data types are how we tell the computer the type of data we’re dealing with. However, when a programmer says a language’s type system is static or dynamic, what are they referring to?
Statically typed languages are those that typically require the programmer to explicitly declare a data type when they create a piece of data (be it a variable, parameter, return value, and so on). These types are usually fixed for the lifetime of the program: a variable declared as one type doesn’t change to another. Let’s take a look at an example:
int myNumber = 42;            // integer data type
String name = "Rocky Balboa"; // string data type
final double PI = 3.141592;   // double data type

// Function takes two integer data types, returns an integer data type
public int add(int a, int b) {
    return a + b;
}
The example above defines a few variables and a function that adds two numbers together. As you can see, we’re explicitly telling the language (in this case, Java) that we’re dealing with integers, strings and doubles. Without these hints, the compiler would not know how to treat myNumber. This is just a name that makes sense to us, not the computer.
Some languages that are statically typed include Java, C#, C++ and Go. But these are just a few of many.
Let’s contrast this with a dynamic data typed language. Below is an example:
$myNumber = 42;         // integer data type
$name = "Rocky Balboa"; // string data type
$PI = 3.141592;         // float data type

// Function takes two integer data types, returns an integer data type
function add($a, $b) {
    return $a + $b;
}
So, where did all the types go? In the example above, we see that our variables in PHP, a dynamically typed language, don’t explicitly say what types of data they are. If we’re not telling PHP the data types, how does it know? Well, it guesses based on the value being assigned. It can correctly guess that $myNumber is an integer because the value 42 is an integer.
What about the function? It too makes a guess based on the values passed to it. This means that add() could take two integers and return an integer, or it could take two floats and return a float. The types are inferred and can even change during runtime. We could later, in the same program, set $myNumber to "Tom" and PHP will let us switch the variable to a string.
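For contrast, Java fixes a variable’s declared type at compile time. The closest analogue, sketched below under that assumption, is a variable declared as Object: the reference can point at values of different runtime types, and the program has to inspect the type itself. RuntimeTypeDemo and describe are hypothetical names, and this is only an approximation of what a dynamic language does for you automatically.

```java
// Hypothetical demo class; the names are illustrative only.
class RuntimeTypeDemo {
    // Reports the runtime type of whatever value the reference holds.
    static String describe(Object value) {
        if (value instanceof Integer) {
            return "integer";
        }
        if (value instanceof String) {
            return "string";
        }
        return "something else";
    }

    public static void main(String[] args) {
        Object myNumber = 42;                   // autoboxed to an Integer
        System.out.println(describe(myNumber)); // prints integer

        myNumber = "Tom";                       // same variable, now a String
        System.out.println(describe(myNumber)); // prints string
    }
}
```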
A few of the many languages that are dynamic include JavaScript, PHP, Python and Ruby.
Why Would You Prefer Static Over Dynamic or Vice Versa?
In the case of statically typed languages, by explicitly telling the compiler up front the types of data you’re dealing with, you let it catch common mistakes and errors in code long before deployment. If you define one value to be an integer and another to be a string, the compiler can catch an invalid addition at compile time and won’t let you finish building the program. This is good in that the earlier you catch an error, the more robust your code will be, and the less it will cost you and your clients to fix it. It’s much easier to fix things early than after deployment.
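A small Java sketch of this kind of check follows. The class CompileTimeCheckDemo and the method addCountToLabel are hypothetical; the commented-out line is the sort of mistake the compiler rejects with an incompatible-types error, and the line after it shows one way to make the conversion explicit.

```java
// Hypothetical demo class; the names are illustrative only.
class CompileTimeCheckDemo {
    static int addCountToLabel(int count, String label) {
        // int broken = count + label; // rejected at compile time: a String is not an int

        // The conversion has to be spelled out. Integer.parseInt throws
        // NumberFormatException at runtime if label is not a valid integer.
        return count + Integer.parseInt(label);
    }

    public static void main(String[] args) {
        System.out.println(addCountToLabel(42, "8")); // prints 50
    }
}
```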
So static is the way to go, right? Well, the tradeoff is that you have to explicitly define everything before you use it. You have to type more code, you have to know the type of data you’re dealing with beforehand (not always something you know) and you have to know what’s going to happen in your operations. You have to know that 1 / 3 is going to give you 0 and not .33333, and things like that.
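The division case can be checked with a short Java sketch. DivisionDemo and its two methods are hypothetical names chosen for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class DivisionDemo {
    // Both operands are ints, so the fractional part is discarded.
    static int intDivide(int a, int b) {
        return a / b;
    }

    // Both operands are doubles, so the result keeps its fractional part.
    static double doubleDivide(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(intDivide(1, 3));    // prints 0
        System.out.println(doubleDivide(1, 3)); // prints 0.3333333333333333
    }
}
```

The expression is the same in both methods; the declared operand types decide which kind of division is performed.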
Dynamic languages give you added flexibility in this area. They’re often described by programmers as being “more expressive”. In PHP, for instance, 1 / 3 gives you 0.3333... as you might expect. However, if the interpreter makes a wrong guess about a data type, you have to know about it; otherwise the error can sneak by. Since we can’t catch everything, code in dynamic languages tends to be a bit more error-prone and brittle. Data types in these languages are typically determined at runtime, which makes many errors hard to catch until they reach a production environment. Code may work fine on your local development machine, but the production runtime environment could be slightly different, yielding different guesswork by the interpreter.
JavaScript is one language that is considered dynamic. TypeScript, a superset of JavaScript, introduced the idea of explicitly declaring data types for variables in order to make the language more static. As JavaScript’s popularity has grown, even beyond the browser and into server-side applications with tools like Node.js, programmers wanted the benefits of static data typing to take out some of the wrong guesswork JavaScript is notorious for when dealing with data and its types. This is an example of making a dynamic language into something more like a statically typed language to catch errors early and lead to healthier code.
In short, static data types give you a strict environment and typically yield more robust code. Dynamic languages give you flexibility and the ability to write code faster, but can lead to more error-prone code if you’re not careful in checking your types.
Continue reading Introduction to Static, Dynamic, Strong and Weak Data Types on SitePoint.