If you are familiar with Babel, you may have seen the various plugins, utilities, and core packages that are available to install. To give you a better sense of how many packages are out there, Babel has published nearly 150 packages to NPM.
Our team also publishes packages to NPM. We have a corresponding GitHub repository for each package. While we were only at 10% of Babel’s total package count, it was already difficult for us to maintain the growing number of repos.
For example, a problem that we faced was keeping our local dev environment up to date. If a package was updated, we had to pull it down and rebuild with the latest changes. Since 14 or so developers had contributed to the project, multiple packages may have been updated in a single day. As a result, we had to spend a lot of time switching repos, checking out the right branch, pulling down the code, npm installing, etc.
I investigated Babel's GitHub repository to gain some insight into what they were doing differently and noticed this in their FAQ.
I learned that Babel is organized as a monorepo. In this design, all ~150 packages maintained by Babel are stored in the same repository. It may sound like a monolithic design, but I would actually consider it to be a hybrid between that and a multi-repo design. To clarify some terminology before going further:
- Monolith: All of the dependent code is stored in one repository and is deployed together.
- Multi-repo: The codebase is spread across multiple repositories. Packages can be deployed independently of one another.
- Monorepo: Multiple packages are stored in one repository. Packages can be deployed independently of one another.
Our team initially started off with a multi-repo design. Another issue that we ran into with this pattern was that if we wanted to automate our deploys, we had to create a 1:1 relationship between our repositories and our Jenkins jobs. Not only did we have to maintain the growing number of repos, we also had to maintain the growing number of jobs and deploy keys.
We can use Lerna to help with some of the troubles listed above. To start with, we only had to create a single GitHub repository and a single Jenkins job, which removed a ton of setup overhead. Since all of the packages are now under one git tree, we are one git pull away from keeping all of the packages up to date.
We technically haven’t used Lerna yet; those are just inherent advantages of having a single GitHub repo. What we need Lerna for is managing the packages within that single repo, and specifically deploying only the packages that have changed.
Lerna provides a way to run commands on either all or just a subset of the packages. It comes with commands such as lerna bootstrap and lerna publish, which handle installing and publishing packages respectively.
Custom scripts can be run with either the lerna run <script> command, which runs the named npm script in each package that defines it, or the lerna exec -- <command> command, which runs an arbitrary command in each package.
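The difference between the two commands comes down to which packages they visit. The following is a minimal sketch of that selection, not Lerna's actual implementation; the function names and the sample packages are invented for illustration.

```javascript
// Simplified model: each package is { name, scripts }, as read from its package.json.
// This only illustrates which packages each command visits; it is not Lerna's code.

// `lerna run <script>` skips packages that do not define the npm script.
function targetsForRun(packages, scriptName) {
  return packages
    .filter(
      (pkg) =>
        pkg.scripts &&
        Object.prototype.hasOwnProperty.call(pkg.scripts, scriptName)
    )
    .map((pkg) => pkg.name);
}

// `lerna exec -- <command>` visits every package, whatever scripts it defines.
function targetsForExec(packages) {
  return packages.map((pkg) => pkg.name);
}

// Hypothetical packages, invented for this example.
const samplePackages = [
  { name: 'pkg-a', scripts: { lint: 'eslint .', test: 'jest' } },
  { name: 'pkg-b', scripts: { test: 'jest' } },
  { name: 'pkg-c' }, // no scripts at all
];

console.log(targetsForRun(samplePackages, 'lint')); // only pkg-a defines "lint"
console.log(targetsForExec(samplePackages)); // all three packages
```

In practice this means a package without a lint script is silently skipped by lerna run lint rather than failing the build.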
All of these commands can be further customized by passing in additional flags. Since some of the commands run on all of the packages by default, we can use the flags --since and --include-merged-tags together to target only the packages that have changed. Additional commands and flags can be found in the official Lerna documentation.
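Conceptually, filtering by changes means comparing the current tree against a git ref (the most recent tag by default) and keeping the packages that contain changed files. Here is a rough sketch of that mapping, assuming packages live under a packages/ directory and that the list of changed files has already been obtained (for example from git diff --name-only). It is a simplification: Lerna also takes dependents of changed packages into account, which this sketch ignores. The function name and file paths are hypothetical.

```javascript
// Map a list of changed file paths to the package directories they belong to.
// Assumes a layout of <packagesDir>/<package-name>/...; this is an illustration,
// not how Lerna is implemented internally.
function changedPackages(changedFiles, packagesDir = 'packages') {
  const prefix = packagesDir + '/';
  const names = new Set();
  for (const file of changedFiles) {
    if (!file.startsWith(prefix)) continue; // root-level file, not part of a package
    const parts = file.slice(prefix.length).split('/');
    if (parts.length < 2) continue; // a file directly under packages/, not in a package
    names.add(parts[0]);
  }
  return [...names].sort();
}

// Hypothetical diff output, invented for this example.
const exampleDiff = [
  'packages/button/src/index.js',
  'packages/button/package.json',
  'packages/modal/README.md',
  'README.md',
];

console.log(changedPackages(exampleDiff)); // button and modal
```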
Putting it all together
These are the scripts that are executed by our Jenkins job.
- npm install
- lerna bootstrap --since --include-merged-tags
- lerna run lint --since --include-merged-tags
- lerna run test --since --include-merged-tags
- git reset --hard
- lerna publish patch --include-merged-tags --yes
- lerna exec --since --include-merged-tags -- node '../../saveConfig.js'
First, we install Lerna and other root-level dependencies via npm install. Then, we install the dependencies only for the packages that have changed. We validate that those packages pass both our lint and test checks. We also make sure the working directory is clean before publishing. Once the packages are published, we execute the custom script saveConfig.js located at the project root. The custom script saves our package metadata to our database.
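The article does not show saveConfig.js itself, so the following is only a guess at its general shape. It relies on the fact that lerna exec runs the command inside each package's directory, which is why the script can read package.json from the current working directory. The metadata fields and the saveMetadata placeholder are assumptions; the real script writes to a database that is not described here.

```javascript
const fs = require('fs');
const path = require('path');

// Build the metadata record for one package from its parsed package.json.
// The choice of fields is an assumption made for this sketch.
function buildMetadata(manifest, packageDir) {
  return {
    name: manifest.name,
    version: manifest.version,
    directory: path.basename(packageDir),
  };
}

// Placeholder for the real database write, which the article does not describe.
function saveMetadata(record) {
  console.log(JSON.stringify(record));
}

// `lerna exec` runs this script once per package, with the package directory
// as the current working directory.
const manifestPath = path.join(process.cwd(), 'package.json');
if (fs.existsSync(manifestPath)) {
  const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  saveMetadata(buildMetadata(manifest, process.cwd()));
}
```

Keeping the record-building step separate from the write makes it easy to swap the placeholder for a real database client.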
The first time I migrated a project to use Lerna, I moved the packages to the new repo by copying and pasting everything over. It turns out there is a command, lerna import, that does just that and more. There is a learning curve in setting up and using the commands for the first time. Lerna will do a lot of the heavy lifting for you; it is just a matter of learning what is available.
With Lerna, more than one package may be published from a single build script. This might make recovering from failed builds tricky depending on which step failed. Didn’t pass the lint or test step? No problem, fix those and try again. Some packages succeeded but some failed while publishing to NPM? A little bit trickier.
lerna publish will attempt to push tags to GitHub and publish to NPM. If connectivity issues occur, it may lead to the situation described above. To recover from this, you can use the command lerna publish from-package to republish only the failed packages.
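The idea behind from-package is to compare each package's local version against what the registry already has and publish only the versions that are missing. A minimal sketch of that comparison follows, with the registry contents passed in as plain data instead of being fetched. The function name and sample data are hypothetical, and this is not Lerna's actual code.

```javascript
// localPackages: [{ name, version }] as read from each package.json.
// publishedVersions: { [name]: string[] } listing versions already in the registry.
// Returns the packages whose current version has not been published yet.
function packagesToRepublish(localPackages, publishedVersions) {
  return localPackages.filter((pkg) => {
    const published = publishedVersions[pkg.name] || [];
    return !published.includes(pkg.version);
  });
}

// Hypothetical state after a partially failed publish: button made it, modal did not.
const localState = [
  { name: 'button', version: '1.0.1' },
  { name: 'modal', version: '2.3.1' },
];
const registryState = {
  button: ['1.0.0', '1.0.1'],
  modal: ['2.3.0'],
};

console.log(packagesToRepublish(localState, registryState)); // only modal remains
```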
Switching over to a Lerna monorepo has saved our team a great deal of time and effort. We have since extended its use to manage and deploy our lambda functions as well. There are various other neat things you can do with it that are not covered in this article.
A monorepo design is probably not the best design for most projects. I would suggest switching over to a monorepo design if there are multiple packages and managing them all becomes too difficult. Existing projects can be migrated to a monorepo at any time; Lerna provides a simple way of creating the repo and migrating existing packages via the lerna init and lerna import commands.
Here is what Babel thinks of it:
Popular projects that are organized as monorepos include React, Angular, Jest, and many others.
For a more in-depth explanation of the actual structure and the various lerna commands, the Lerna documentation is a great resource.
tl;dr: By using Lerna, we get the advantages of a single repo while still being able to deploy and publish only the packages that have changed.