Mono- or Multi-Repository: A Dilemma in the Serverless World

Aug 13, 2020

Mono-repo or multi-repo is a decision every organization must make at a certain point if it has a growing number of services. As your organization grows, speed and performance become critical, and you’ll need to decide whether to structure your services in a single repository or use a separate one for each service.

Developer productivity and fast delivery of business value are necessities if you want to take on your competitors. Seemingly simple decisions, like how you organize and structure your services, have a significant impact on developer productivity, collaboration, and communication.

Although serverless functions scale automatically, engineers frequently break large projects into independent services that each solve a business problem, with each service containing one or more functions. Serverless deployment frameworks let you deploy functions grouped into services, but the decision on how to organize or structure those services in version control is yours.

Google, Dropbox, Facebook, and Twitter are well known for using a mono-repo pattern, while Amazon and Netflix are famous for using the multi-repo approach. Organizations have various reasons for structuring their projects in a certain way, and in this post, we'll compare the two approaches and learn how you can make the right choice for your organization.

What Is Mono-repo?

In a mono-repo approach, all services and their code are kept in a single repository.

When engineers think of mono-repo, they generally think of a single-tiered application that runs multiple components in the same process, on the same system, packaged and deployed as a monolith. Yet a mono-repository app does not mean a monolithic app. You could house multiple services in a single repository and build and deploy each service independently. We can look at mono-repo as a way of structuring your project so that:

  • multiple independent services live in the same code repository;
  • the services may share common code or libraries;
  • a change to one service does not necessarily rebuild the entire project, but only the service that is affected by the change.
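The last point can be sketched in a few lines of Python. This is only an illustration, assuming each service lives in its own directory; the directory names and the `affected_services` helper are hypothetical and not part of any particular framework.

```python
from pathlib import PurePosixPath


def affected_services(changed_files, service_roots):
    """Return the service directories that contain at least one changed file."""
    affected = set()
    for changed in changed_files:
        path = PurePosixPath(changed)
        for root in service_roots:
            # A file belongs to a service if the service root is one of its parent directories.
            if PurePosixPath(root) in path.parents:
                affected.add(root)
    return sorted(affected)


# Hypothetical layout: two services in one repository.
roots = ["services/orders", "services/billing"]
changed = ["services/orders/handler.py", "README.md"]
print(affected_services(changed, roots))  # ['services/orders']
```

In practice, the list of changed files would come from your version control system, and only the services returned here would need to be rebuilt and redeployed.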

A number of organizations have switched from multi-repo to mono-repo, and there are good reasons for this.

Why Use Mono-repo?

Below we’ll examine some advantages of using mono-repo for your serverless app.

Easy Onboarding of New Engineers

Onboarding is a critical process for both new hires and the hiring company. The best way to equip engineers with the knowledge and tools they need to do their job is to get them up and running in their dev environment as quickly as possible.

The wrong way to start the onboarding process is to have new hires clone ten different repositories in their first week before running your application. The more repositories you have, the more difficult it is to understand the big picture and how each service relates to the others. Mono-repo is great for helping engineers get up and running in no time.

Fostering Collaboration and Communication among Developers

You can't build quality software without effective collaboration. With a single place to version your serverless app, your engineers have a centralized place to collaborate, track features, and work together on a shared infrastructure.

Microsoft, which maintains one of the largest mono-repos in the world, observed that the transition from multi-repo to mono-repo helped break the siloed culture that often comes with multiple repositories and provided a better developer experience. If collaboration among your engineers is not good, going multi-repo won't make it any better.

Simplifying Dependency Management

A serverless deployment package includes functions and dependencies (usually external libraries). Serverless platforms like AWS Lambda typically have a size limit on deployment packages. It's best practice to package your app dependencies in layers that can be reused by other services. This ensures that your deployment package does not get too large. Coordinating and managing dependencies is a lot easier in mono-repo than in multi-repo.
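As a rough illustration of why package size matters, the sketch below checks whether a function and its layers fit within a size budget. The 250 MB figure is AWS Lambda's documented unzipped limit (function code plus layers) at the time of writing, so verify it against the current AWS documentation; the helper names are made up for this example.

```python
import os

# Assumption: AWS Lambda's unzipped limit for function code plus all layers.
# Check the current AWS quotas before relying on this number.
UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size(path):
    """Total size in bytes of all files under a build directory."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def fits_unzipped_limit(function_bytes, layer_bytes, limit=UNZIPPED_LIMIT_BYTES):
    """True if the function code plus every attached layer stays within the limit."""
    return function_bytes + sum(layer_bytes) <= limit


mb = 1024 * 1024
print(fits_unzipped_limit(10 * mb, [50 * mb]))        # True
print(fits_unzipped_limit(200 * mb, [40 * mb, 20 * mb]))  # False
```

A check like this could run in CI for every service, with `directory_size` measuring each service's build output and its shared layers.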

Easier Global Refactoring or Bug Fixing

For a mid-size project residing in a single repository, refactoring is way more natural. With the IDE’s refactor command, you can quickly issue bug fixes that impact multiple services or refactor methods, functions, and classes. With multiple repositories, complexity sets in, because the same change has to be made, reviewed, and released separately in each repository.

Mono-repo Has Some Drawbacks

Mono-repo is excellent in many ways, but it does have some drawbacks.

Performance

Mono-repo is great for a small or medium project. For a large project, performance problems begin to creep in at every phase, and the repository becomes too slow to check out, clone, or pull. Standard git commands like git status can take seconds to run. File searches become too slow due to a large number of files to search.

CI System Is Complicated for Mono-repo

A traditional CI system works per project, so building a CI pipeline for a mono-repository that holds multiple services can be very challenging: builds are triggered constantly as multiple people commit to the same repository. If your CI system is not well architected, it could do unnecessary work by rebuilding all services on each push.

Without a granular way of configuring your CI system to build services affected by a change, the cost of the system could skyrocket due to the number of builds taking place during development by a large team.
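One way to get that granularity is to describe which components depend on which and build only what a change can reach. The sketch below is a minimal version of that idea, assuming you maintain such a dependency map yourself; the component names and the `services_to_build` function are hypothetical.

```python
from collections import deque


def services_to_build(changed_components, depends_on):
    """Return the changed components plus everything that depends on them.

    depends_on maps each component to the list of components it uses.
    """
    # Invert the graph so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    to_build = set(changed_components)
    queue = deque(changed_components)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in to_build:
                to_build.add(dependent)
                queue.append(dependent)
    return sorted(to_build)


# Hypothetical repository: two services share one library, one stands alone.
graph = {
    "orders": ["shared-utils"],
    "billing": ["shared-utils"],
    "notifications": [],
    "shared-utils": [],
}
print(services_to_build(["shared-utils"], graph))   # ['billing', 'orders', 'shared-utils']
print(services_to_build(["notifications"], graph))  # ['notifications']
```

A change to the shared library triggers builds for every service that uses it, while a change to an isolated service triggers only that service's build.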

What Is Multi-Repo and Why Use It?

Multi-repo is a way of organizing your services into separate repositories. It has several advantages:

Each Service Can Be Versioned Separately

As your services grow in a mono-repo (especially with binary dependencies), you'll eventually get to the point where code checkouts and clones become too big for engineers' IDEs to handle. This could slow your team down.

In multi-repo, services are small and versioned separately, with small code checkouts devoid of the performance issues that often come with a mono-repo approach.

Teams Can Have a Separate Repository for Different Areas of Responsibility

A high-performing engineering team usually consists of smaller teams, with each service owned and maintained by a team. A team could consist of five to seven engineers, a so-called two-pizza team.

The multi-repo approach allows each microservice team to have a separate, isolated repository for its area of responsibility. A two-pizza team could own a codebase end-to-end, independently develop features, and deploy them.

Drawbacks of the Multi-Repo Approach

Multi-repo is not a silver bullet, and it has both pros and cons. Below are some of the drawbacks of multi-repo.

Managing Versions and Tracking Dependencies Globally

A significant drawback of multi-repo in serverless apps is that it’s difficult to keep track of versions and dependencies globally. You have to keep track of the versions and configuration variables each service uses and update them continually. The problem is compounded when different teams own different services. Coordinating fixes or features that cut across multiple teams can be daunting.
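To make the tracking problem concrete, here is a small sketch that reports dependencies pinned to different versions across services. It assumes you have already collected each repository's pinned dependencies into a dictionary; the service names, package names, and version numbers are invented for illustration.

```python
from collections import defaultdict


def find_version_drift(manifests):
    """Return dependencies that are pinned to more than one version.

    manifests maps a service name to a {dependency: pinned_version} dict.
    The result maps each drifting dependency to {version: [services using it]}.
    """
    versions = defaultdict(lambda: defaultdict(list))
    for service, deps in manifests.items():
        for dep, version in deps.items():
            versions[dep][version].append(service)
    return {
        dep: {version: sorted(services) for version, services in by_version.items()}
        for dep, by_version in versions.items()
        if len(by_version) > 1
    }


# Hypothetical manifests gathered from three separate repositories.
manifests = {
    "orders": {"shared-auth": "1.2.0", "shared-logging": "3.0.1"},
    "billing": {"shared-auth": "1.0.4", "shared-logging": "3.0.1"},
    "notifications": {"shared-auth": "1.2.0"},
}
print(find_version_drift(manifests))
# {'shared-auth': {'1.2.0': ['notifications', 'orders'], '1.0.4': ['billing']}}
```

With multiple repositories, someone has to gather these manifests from every repository before a report like this is possible, which is part of the overhead described above.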

Code and Dependency Duplication

There is often code duplication in a multi-repo approach, since multi-repo tends to encourage a siloed culture in which each team does its own thing, making it hard to prevent teams from solving the same problem repeatedly.

This problem tends to go away when there is better communication across teams and shared libraries are released that can be installed by teams. It's also important to note that managing shared libraries has complexities as well and that you have to be careful not to break public APIs of libraries that multiple teams are already using in production.

Enforcing Patterns and Best Practices Is Hard

In a big organization, multiple repositories are often owned and maintained by different engineering teams. This makes it difficult to enforce common patterns and best practices, and there’s always a possibility that one team will do things differently. It's difficult for developers to fully understand the big picture since each team is confined to its own repository, so code reviews are sometimes less effective than they could be.

Which Approach Should You Use?

Each approach has its pros and cons. If you have a small development team, or you're just starting a new project with a small number of services, you should consider using a mono-repo for the project.

If you’re a development team that does not have access to the tools and resources required to manage the complexity of a sizable mono-repo project, breaking up your codebase might just give you the productivity boost you need.

It's essential to bear in mind that going multi-repo comes with trade-offs, as Adam Jacob says in this blog post: "The default behavior of a multi-repo is isolation, and the default behavior of a monorepo is shared responsibility and visibility."

If communication and visibility within your engineering teams are already bad, going multi-repo will not make them better.

On the other hand, when you have services that need to be released together, you should consider having them in a single repository to avoid the pain of coordinating services in different repositories that must be released together.

Mono or Multi? There Is No Silver Bullet

We've now seen both sides of mono-repo and multi-repo, and we’ve discussed when mono-repo might make sense and when it might not. To recap the main points of this post:

  • Mono-repo makes it easier for engineers to understand the big picture of their project.
  • Mono-repo fosters collaboration among engineers.
  • Mono-repo simplifies dependency management.
  • In a large codebase, performance problems creep into a mono-repo.
  • Multi-repo allows you to separate teams’ areas of responsibility.
  • Since each service resides in its own repository, multi-repo often helps you avoid the performance problems you may experience in mono-repo.
  • Managing dependencies in multi-repo is complex.
  • It's more difficult to enforce common best practices and patterns in multi-repo.