Understanding and Working with Submodules in Git

Most modern software projects depend on the work of others. It would be a waste of time to reinvent the wheel in your own code when someone else has already written a wonderful solution. That’s why so many projects use third-party code in the form of libraries or modules.

Git, the world’s most popular version control system, offers an elegant, robust way to manage these dependencies. Its “submodule” concept allows us to include and manage third-party libraries while keeping them cleanly separated from our own code.

In this article, you’ll learn why submodules in Git are so useful, what they actually are, and how they work.

Keeping Code Separate

To make clear why Git’s submodules are indeed an invaluable structure, let’s look at a case without submodules. When you need to include third-party code (such as an open-source library) you can of course go the easy way: just download the code from GitHub and dump it somewhere into your project. While certainly quick, this approach is definitely dirty for a couple of reasons:

  • By brute force copying third-party code into your project, you’re effectively mixing multiple projects into one. The line between your own project and that of someone else (the library) starts to get blurry.
  • Whenever you need to update the library code (because its maintainer delivered a great new feature or fixed a nasty bug) you again have to download, copy, and paste. This quickly becomes a tedious process.

The general rule in software development to “keep separate things separate” exists for a reason. And it’s certainly true for managing third-party code in your own projects. Luckily, Git’s submodule concept was made for exactly these situations.

But of course, submodules aren’t the only available solution for this kind of problem. You could also use one of the various “package manager” systems that many modern languages and frameworks provide. And there’s nothing wrong with that!

However, you could argue that Git’s submodule architecture comes with a couple of advantages:

  • Submodules provide a consistent, reliable interface — no matter what language or framework you’re using. Especially if you’re working with multiple technologies, each one might have its own package manager with its own set of rules and commands. Submodules, on the other hand, always work the same.
  • Not every piece of code might be available over a package manager. Maybe you just want to share your own code between two projects — a situation where submodules might offer the simplest possible workflow.

What Git Submodules Really Are

Submodules in Git are really just standard Git repositories. No fancy innovation, just the same Git repositories that we all know so well by now. This is also part of the power of submodules: they’re so robust and straightforward because they are so “boring” (from a technological point of view) and field-tested.

The only thing that makes a Git repository a submodule is that it’s placed inside another, parent Git repository, which registers it as a submodule.

Other than that, a Git submodule remains a fully functional repository: you can perform all the actions that you already know from your “normal” Git work — from modifying files, all the way to committing, pulling and pushing. Everything’s possible in a submodule.

Adding a Submodule

Let’s take the classic example and say we’d like to add a third-party library to our project. Before we go get any code, it makes sense to create a separate folder where things like these can have a home:

$ mkdir lib
$ cd lib

Now we’re ready to pump some third-party code into our project — but in an orderly fashion, using submodules. Let’s say we need a little “timezone converter” JavaScript library:

$ git submodule add https://github.com/spencermountain/spacetime.git 

When we run this command, Git starts cloning the repository into our project, as a submodule:

Cloning into 'carparts-website/lib/spacetime'...
remote: Enumerating objects: 7768, done.
remote: Counting objects: 100% (1066/1066), done.
remote: Compressing objects: 100% (445/445), done.
remote: Total 7768 (delta 615), reused 975 (delta 588), pack-reused 6702
Receiving objects: 100% (7768/7768), 4.02 MiB | 7.78 MiB/s, done.
Resolving deltas: 100% (5159/5159), done.

And if we take a look at our working copy folder, we can see that the library files have in fact arrived in our project.

[Figure: Our library files are here, included in a submodule]

“So what’s the difference?” you might ask. After all, the third-party library’s files are here, just like they would be if we had copy-pasted them. The crucial difference is indeed that they are contained in their own Git repository! Had we just downloaded some files, thrown them into our project and then committed them — like the other files in our project — they would have been part of the same Git repository. The submodule, however, makes sure that the library files don’t “leak” into our main project’s repository.
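To see this separation for yourself, here is a minimal sketch you can run in a throwaway folder. It does not use the real spacetime repository: the `spacetime-upstream` repository, its `index.js` file, and the `demo` folder are hypothetical stand-ins created locally so that the example works without network access. The `protocol.file.allow` setting is only needed because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -eu
# Hypothetical sandbox: a temporary folder with a local stand-in for the library.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in "upstream" library with a single commit.
git init -q spacetime-upstream
echo "// placeholder library code" > spacetime-upstream/index.js
git -C spacetime-upstream add index.js
git -C spacetime-upstream commit -q -m "Initial library commit"

# Main project with one commit of its own.
git init -q carparts-website
cd carparts-website
echo "Carparts website" > README.md
git add README.md
git commit -q -m "Initial project commit"

# Add the stand-in library as a submodule (local path, demo only).
git -c protocol.file.allow=always submodule --quiet add "$demo/spacetime-upstream" lib/spacetime

# The main project's index holds ONE entry for the whole submodule
# (mode 160000), not one entry per library file.
git ls-files --stage lib/spacetime

# Asking Git for the repository root gives different answers
# inside and outside the submodule folder.
parent_root=$(git rev-parse --show-toplevel)
sub_root=$(git -C lib/spacetime rev-parse --show-toplevel)
echo "main project root: $parent_root"
echo "submodule root:    $sub_root"
```

The single `160000` entry is how the main project refers to the submodule: it points at the library as a whole, while the library's files and history stay in the submodule's own repository.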

Let’s see what else has happened: a new .gitmodules file has been created in the root folder of our main project. Here’s what it contains:

[submodule "lib/spacetime"]
    path = lib/spacetime
    url = https://github.com/spencermountain/spacetime.git

This .gitmodules file is one of multiple places where Git keeps track of the submodules in our project. Another one is .git/config, which now ends like this:

[submodule "lib/spacetime"]
    url = https://github.com/spencermountain/spacetime.git
    active = true

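And finally, Git also stores each submodule’s actual repository data (what would normally be its .git folder) in an internal .git/modules folder of the main project. The submodule’s own folder only contains a small .git file that points there.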
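If you're curious, you can look at these three places without editing anything. The following is a read-only sketch under the same assumptions as a local sandbox: `spacetime-upstream` and the `demo` folder are hypothetical stand-ins for the real library and project, and `protocol.file.allow` is set only because the stand-in is cloned from a local path.

```shell
set -eu
# Hypothetical sandbox with a local stand-in library (not the real spacetime repo).
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q spacetime-upstream
echo "// placeholder library code" > spacetime-upstream/index.js
git -C spacetime-upstream add index.js
git -C spacetime-upstream commit -q -m "Initial library commit"

git init -q carparts-website
cd carparts-website
echo "Carparts website" > README.md
git add README.md
git commit -q -m "Initial project commit"
git -c protocol.file.allow=always submodule --quiet add "$demo/spacetime-upstream" lib/spacetime

# 1) .gitmodules: versioned with the project and shared with collaborators.
git config -f .gitmodules --get submodule.lib/spacetime.path
git config -f .gitmodules --get submodule.lib/spacetime.url

# 2) .git/config: the local registration of the submodule.
git config --get submodule.lib/spacetime.url

# 3) .git/modules: where the submodule's repository data lives.
#    The submodule folder itself only has a .git FILE pointing there.
cat lib/spacetime/.git
ls .git/modules/lib
```

Reading these values through `git config` rather than opening the files in an editor keeps you on the safe side: nothing here modifies the configuration.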

All of these are technical details you don’t have to remember. However, it probably helps you to understand that the internal maintenance of Git submodules is quite complex. That’s why it’s important to take one thing away: don’t mess with Git submodule configuration by hand! If you want to move, delete, or otherwise manipulate a submodule, please do yourself a favor and do not try this manually. Either use the proper Git commands or a desktop GUI for Git like “Tower”, which takes care of these details for you.
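As a rough illustration of what "the proper Git commands" can look like, here is a sketch that moves a submodule and then removes it. It again runs in a hypothetical local sandbox (`spacetime-upstream`, `demo`, and the `vendor` target folder are all made up for this example), and it shows one common sequence rather than the only possible one.

```shell
set -eu
# Hypothetical sandbox with a local stand-in library (not the real spacetime repo).
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q spacetime-upstream
echo "// placeholder library code" > spacetime-upstream/index.js
git -C spacetime-upstream add index.js
git -C spacetime-upstream commit -q -m "Initial library commit"

git init -q carparts-website
cd carparts-website
echo "Carparts website" > README.md
git add README.md
git commit -q -m "Initial project commit"
git -c protocol.file.allow=always submodule --quiet add "$demo/spacetime-upstream" lib/spacetime
git commit -q -m "Add library as a submodule"

# MOVE: let "git mv" relocate the folder and update .gitmodules for us.
mkdir vendor
git mv lib/spacetime vendor/spacetime
git commit -q -m "Move submodule to vendor/"
moved_path=$(git config -f .gitmodules --get submodule.lib/spacetime.path)
echo "path recorded in .gitmodules: $moved_path"

# REMOVE: unregister the submodule, then delete it from the project.
git submodule --quiet deinit -f vendor/spacetime
git rm -q -f vendor/spacetime
git commit -q -m "Remove submodule"
```

Note that the submodule keeps its original name (`lib/spacetime`) in the configuration after the move; only its recorded path changes. Details like this are one reason why editing the files by hand is easy to get wrong.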

[Figure: Git desktop GUIs like Tower make handling Git submodules easier]

Let’s have a look at the status of our main project, now that we’ve added the submodule:

$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   .gitmodules
        new file:   lib/spacetime

As you can see, Git regards adding a submodule as a change like any other. Accordingly, we have to commit this change like any other:

$ git commit -m "Add timezone converter library as a submodule" 
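It can be instructive to check what this commit actually stores for the submodule. The sketch below assumes a hypothetical local sandbox again (`spacetime-upstream` and `demo` are stand-ins, not the real library) and compares the entry recorded in the main project with the commit that is checked out inside the submodule.

```shell
set -eu
# Hypothetical sandbox with a local stand-in library (not the real spacetime repo).
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q spacetime-upstream
echo "// placeholder library code" > spacetime-upstream/index.js
git -C spacetime-upstream add index.js
git -C spacetime-upstream commit -q -m "Initial library commit"

git init -q carparts-website
cd carparts-website
echo "Carparts website" > README.md
git add README.md
git commit -q -m "Initial project commit"
git -c protocol.file.allow=always submodule --quiet add "$demo/spacetime-upstream" lib/spacetime
git commit -q -m "Add timezone converter library as a submodule"

# The commit contains a single "commit" entry for lib/spacetime,
# not the library's individual files.
git ls-tree HEAD lib/spacetime

# The recorded hash is the commit currently checked out in the submodule.
recorded=$(git rev-parse HEAD:lib/spacetime)
checked_out=$(git -C lib/spacetime rev-parse HEAD)
echo "recorded in main project: $recorded"
echo "checked out in submodule: $checked_out"

# Summary of all submodules and the commits they are on.
git submodule status
```

In other words, the main project commits a pointer to one specific revision of the library, which is what keeps the two histories separate.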

Continue reading Understanding and Working with Submodules in Git on SitePoint.
