Back to the Trunk: Cutting Heavy Feature Branches

Image by Suket Dedhia from Pixabay

I wrote this article after evaluating trunk-based development and feature toggles for my project. It is opinionated, but I have tried to explain holistically why you might want to move from feature branching to trunk-based development, and how you can do that using a set of techniques and best practices.

Feature Branching

Feature branching is an approach where you create a branch in your repository for every task or feature. Depending on the size of your features, implementing them may take weeks or even months. Maintaining such feature branches means repeatedly resolving merge conflicts with the main branch, which is costly and error-prone.

If you work in a large team, then a siloed process of working on the feature code means the other developers don’t share a significant part of the implementation until you drop a heavy chunk of changes to the main branch at once, causing cognitive overload and adaptation overhead for your teammates.

If you adhere to the infrastructure-as-code approach, then the deployment of a new feature includes not only new code but also infrastructure changes, configuration changes, and database migrations. Merging everything at once may also incur an uneven load for other teams, like Infrastructure, QA, or Operations.

Avoiding long-living branches

Why you would like to avoid long-living feature branches:

  • avoid support costs due to merging of the main branch into feature branches;
  • have a faster and lighter turnaround of code between the developers and teams;
  • move towards releasing and deploying small steps instead of “big bangs” for safety and confidence reasons;
  • make smaller changes available for smoke testing or even partial feature testing.

Some definitions

By main branch, or trunk, I mean the branch where all developers integrate their work. This is usually the branch you use for releases and/or deployment. You should have enough automated tests to make sure this branch is always green and deployable.

By partial task, I mean a ticket, issue, task or a small feature that delivers partial changes and is part of a bigger feature. It might not be testable or usable, but it should not break the main branch: all tests, including your regression test cases, should pass as before.

What is a long-living branch, and how long is “long”? There’s no clear answer. A branch becomes long-living once you need to tend to it constantly: merging the upstream or main branch into it takes a significant chunk of your time, because you have to resolve merge conflicts and adapt the upstream code to your changes.

Trunk-Based Development

If you’re tired of long-living feature branches, you might want to move in the direction of trunk-based development, shortening the lifetime of the branches and integrating them into the main branch sooner rather than later.

The obvious solution is to break down large features into smaller tasks that can be merged into the main branch without breaking the existing functionality. For instance, you could split a large feature into smaller ones that don’t themselves provide any useful functionality. Or you could even merge the feature task-by-task into the main branch.

The solution has an obvious drawback: there’s additional overhead to make sure the parts of the feature indeed don’t break the existing functionality. At first glance, this overhead may vary from zero to prohibitively high, which can make such a split impractical.

There is, however, an additional benefit to this approach, beyond the ones already described above. Introducing gradual changes, using techniques like branching by abstraction, requires deeper code and architecture analysis, improvement of test coverage, cleaner decomposition, better decoupling, and separation of concerns. I would argue that this improves architecture and code in general.

Some of the approaches that you can use:

  • merging partial tasks as soon as they are ready if they don’t break anything;
  • running the new implementation alongside the old one, in idle mode, under the same load or partial load;
  • using the “branch by abstraction” pattern;
  • using feature toggles to hide unfinished functionality.

Merging partial tasks if they don’t break anything

This is the simplest and most straightforward solution. For example, if you only need to add new data structures to the database, then merging the migration early won’t break anything. No special handling is needed for such partial tasks.

Running new implementation in parallel

Suppose you’ve merged a new partially implemented service that already has business event processing logic but no UI. It starts working in parallel with the old implementation. It will crunch the data or persist it, but it won’t be actively used, as there will be no UI or API to access it, which will come later. However, production load issues might already be discovered at this stage, while not causing any system-wide outage.

If you need to implement an incompatible data structure, it makes sense to hide both the old and the new data structure behind some abstraction or interface and let them be used in parallel, i.e. all writes should go to both data structures, and reads should be served from the old one. Some additional checks might be implemented to ensure the structures stay consistent.
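The dual-write idea can be sketched roughly as follows. This is only an illustration under assumptions: the PriceStore interface, both store classes and the isConsistent check are hypothetical names invented for the example, and in-memory maps stand in for the real storage.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical abstraction that hides both data structures from callers.
interface PriceStore {
    void save(String productId, long priceInCents);
    Long find(String productId);
}

// Stand-in for the existing structure: prices kept as whole cents.
class OldPriceStore implements PriceStore {
    private final Map<String, Long> prices = new HashMap<>();

    public void save(String productId, long priceInCents) {
        prices.put(productId, priceInCents);
    }

    public Long find(String productId) {
        return prices.get(productId);
    }
}

// Stand-in for the new, incompatible structure: prices kept as decimals.
class NewPriceStore implements PriceStore {
    private final Map<String, BigDecimal> prices = new HashMap<>();

    public void save(String productId, long priceInCents) {
        prices.put(productId, BigDecimal.valueOf(priceInCents, 2));
    }

    public Long find(String productId) {
        BigDecimal stored = prices.get(productId);
        if (stored == null) {
            return null;
        }
        return stored.movePointRight(2).longValueExact();
    }
}

// Writes go to both structures; reads are served from the old one only.
class DualWritePriceStore implements PriceStore {
    private final PriceStore oldStore;
    private final PriceStore newStore;

    DualWritePriceStore(PriceStore oldStore, PriceStore newStore) {
        this.oldStore = oldStore;
        this.newStore = newStore;
    }

    public void save(String productId, long priceInCents) {
        oldStore.save(productId, priceInCents);
        newStore.save(productId, priceInCents);
    }

    public Long find(String productId) {
        return oldStore.find(productId);
    }

    // Optional consistency check: do both structures agree for this key?
    boolean isConsistent(String productId) {
        return Objects.equals(oldStore.find(productId), newStore.find(productId));
    }
}
```

Callers only ever see PriceStore, so switching reads over to the new structure later would be a change inside DualWritePriceStore rather than in every client.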

Branch by Abstraction

Branch by abstraction is described very well by Martin Fowler, so I won’t repeat the details here. The idea is to introduce an abstraction over the changing functionality that covers both the old and the new implementation, providing decoupling and decomposition where necessary, and additional tests if needed. When the new implementation is ready, you migrate the clients to it, either gradually or in a single sweep.

I would propose the branch by abstraction pattern as the first option to consider in every case before resorting to feature toggles.
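As a rough illustration of the pattern, here is a minimal sketch. All names in it (ReportRenderer, the two renderers, InvoiceService) are hypothetical and exist only to show the shape of the steps; they are not taken from Fowler's description.

```java
// Step 1: introduce an abstraction over the functionality that is about to change.
interface ReportRenderer {
    String render(String title);
}

// Step 2: the existing code moves behind the abstraction with unchanged behavior.
class LegacyReportRenderer implements ReportRenderer {
    public String render(String title) {
        return "== " + title + " ==";
    }
}

// Step 3: the new implementation is developed on the main branch, unused at first.
class MarkdownReportRenderer implements ReportRenderer {
    public String render(String title) {
        return "# " + title;
    }
}

// Step 4: clients depend only on the abstraction, so each one can be
// switched to the new implementation independently of the others.
class InvoiceService {
    private final ReportRenderer renderer;

    InvoiceService(ReportRenderer renderer) {
        this.renderer = renderer;
    }

    String invoiceHeader() {
        return renderer.render("Invoice");
    }
}
```

Migrating a client then means changing what is passed to its constructor, for example from new LegacyReportRenderer() to new MarkdownReportRenderer(). Once no client uses the legacy class, it can be deleted.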

Feature Toggles

When I talk about feature toggles I mean specifically release toggles — flags that are used to facilitate trunk-based development and continuous delivery while preventing leaking of untested or unfinished functionality into production.

I leave permission, experiment and ops toggles out of scope, as they have different motivations and requirements and might not be useful for you at all. In particular, I leave out of scope such things as A/B testing, canary releases, gradual rollout of features to different user cohorts, etc. If this sounds like your case, you may read about them in this great article by Pete Hodgson.

Feature toggles should be your last resort if other approaches fail since they incur the cost of additional testing and maintenance.

The point of a feature toggle is to keep the new code idle and re-route the execution down the old path, while the toggle is OFF. As soon as the toggle is switched to ON, the new code execution path is activated.

How dynamic should feature toggles be?

Dynamic toggles are more complex to implement. Just adding a parameter to the service configuration is not enough: you need to be able to change it at runtime and observe the expected behavior. In a microservice world, you also need to be able to switch a toggle consistently across multiple microservices.

If a toggle involves changing the message processing, in-flight messages between the services might be processed incorrectly after the switch is toggled: imagine they were sent by a pre-toggle service A, but consumed by a post-toggle service B. Solving those problems and corner cases is hard. Moreover, dynamic toggles might go against the “infrastructure-as-code” approach.

On the other hand, if your application infrastructure and deployment are not that dynamic, then dynamic toggles might not be that relevant for you. The only case where they could be beneficial is when comparing the behavior of the system before and after the switch.

How to configure feature toggles?

Feature toggles can be specified in the global system configuration and switched on or off at deployment time only. A toggle may be added as an environment variable that is accessible to all services, with a name like FT_1234.
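Reading such a variable could look roughly like the sketch below. The FeatureToggles class and its method names are hypothetical, and the rule that only the value "true" switches a toggle on is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for deployment-time toggles, read once at startup.
class FeatureToggles {
    private final Map<String, String> environment;

    FeatureToggles(Map<String, String> environment) {
        // Defensive copy: the toggles must not change after startup.
        this.environment = new HashMap<>(environment);
    }

    // Reads the toggles from the real process environment.
    static FeatureToggles fromSystemEnvironment() {
        return new FeatureToggles(System.getenv());
    }

    // Assumption: a toggle is ON only when its variable is set to "true"
    // (case-insensitive); a missing variable or any other value means OFF.
    boolean isOn(String toggleName) {
        return Boolean.parseBoolean(environment.get(toggleName));
    }
}
```

Passing the environment in as a map keeps the class easy to test, while fromSystemEnvironment() covers the production case.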

How can a feature toggle be implemented?

  • if/else flag;
  • visibility flag in the UI;
  • using the strategy pattern to abstract particular algorithm implementations.

If you use Spring, then its facilities may be used to inject a proper instance depending on a feature flag, e.g., with profiles or the @ConditionalOnProperty annotation:

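import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Exactly one Algorithm bean is registered when FT_1234 is unset, "true" or "false";
// for any other value neither condition matches and no Algorithm bean is created.
@Configuration
public class CalculationAlgorithmConfig {

    // Old path: active while FT_1234 is missing or explicitly set to "false".
    @Bean
    @ConditionalOnProperty(
            name = "FT_1234",
            havingValue = "false",
            matchIfMissing = true
    )
    public Algorithm oldAlgorithm() {
        return new OldAlgorithm();
    }

    // New path: active only when FT_1234 is set to "true".
    @Bean
    @ConditionalOnProperty(name = "FT_1234", havingValue = "true")
    public Algorithm newAlgorithm() {
        return new NewAlgorithm();
    }
}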

If you need to change the behavior at runtime by switching a feature ON, you could wrap it in a proxy that selects a particular implementation depending on the condition.
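A minimal sketch of such a proxy is shown here. The calculate method on Algorithm and the bodies of the two implementations are assumptions made for illustration, as is the ToggledAlgorithm name; the toggle state is supplied as a BooleanSupplier so that it can be re-read on every call.

```java
import java.util.function.BooleanSupplier;

// Assumed shape of the Algorithm type; the real interface may differ.
interface Algorithm {
    int calculate(int input);
}

// Placeholder implementations with deliberately different results.
class OldAlgorithm implements Algorithm {
    public int calculate(int input) {
        return input + 1;
    }
}

class NewAlgorithm implements Algorithm {
    public int calculate(int input) {
        return input * 2;
    }
}

// Proxy that chooses the implementation on every call, so flipping the
// toggle at runtime takes effect without re-creating any objects.
class ToggledAlgorithm implements Algorithm {
    private final Algorithm oldAlgorithm;
    private final Algorithm newAlgorithm;
    private final BooleanSupplier toggleIsOn;

    ToggledAlgorithm(Algorithm oldAlgorithm, Algorithm newAlgorithm, BooleanSupplier toggleIsOn) {
        this.oldAlgorithm = oldAlgorithm;
        this.newAlgorithm = newAlgorithm;
        this.toggleIsOn = toggleIsOn;
    }

    public int calculate(int input) {
        Algorithm delegate = toggleIsOn.getAsBoolean() ? newAlgorithm : oldAlgorithm;
        return delegate.calculate(input);
    }
}
```

Clients receive the ToggledAlgorithm wherever they expect an Algorithm, so the toggle check stays in one place instead of being scattered across call sites.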

Should Feature Toggles be revertible?

Should you allow switching a feature toggle off again? In the general case, this would not work. For instance, you might have already populated the database with data that is inconsistent from the point of view of the pre-toggle implementation. Making sure the feature can be switched back off is another branching point that adds significant complexity.

How to test feature toggles?

Automated testing of feature-toggled code can be done as follows:

  • for mocked unit tests, you can mock the feature toggle configuration class;
  • for integration tests, you can either mock the feature toggle configuration class or override the configuration properties; e.g. if you use Spring, you might do this with the @TestPropertySource annotation;
  • for system tests, follow the practice of testing only the “all off” and “all on” states: extract all tests for feature-toggled functionality into a separate test suite and run it as a separate step of the test pipeline. After the main suite has run, shut the system down, apply the “all on” setting, and start it again; with this setting, your feature configuration class should return true for every feature flag.
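The “all on” setting from the last item could be handled by a configuration class along these lines. This is a sketch under assumptions: the FeatureConfiguration class, its constructor parameters and the isEnabled method are hypothetical names, not part of any framework.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical feature configuration class with an "all on" override.
class FeatureConfiguration {
    private final Set<String> enabledToggles;
    private final boolean allOn;

    FeatureConfiguration(Set<String> enabledToggles, boolean allOn) {
        this.enabledToggles = new HashSet<>(enabledToggles);
        this.allOn = allOn;
    }

    // With "all on" applied, every flag reports true, including flags
    // that are not listed in the configuration at all.
    boolean isEnabled(String toggleName) {
        return allOn || enabledToggles.contains(toggleName);
    }
}
```

In unit tests the same class can be mocked or constructed directly with the state a test needs, which keeps the toggle setup out of the code under test.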

Do you need a framework, or should you write your own solution?

Feature toggles vary greatly in implementation, but they are usually quite trivial, and most of the functionality that libraries offer might not be relevant for you. I would suggest implementing feature toggles by hand first and then checking whether any patterns or duplication have emerged that are worth extracting.

Feature Toggle Best Practices

The number of toggle points should be reduced to an absolute minimum. The toggle points should be localized; in the best case, they should be placed at the edge (the place where the request enters your system), where you have the most context as to how to process the request. Toggle points in the core should be used sparingly.

A good practice is de-coupling decision points from decision logic: instead of directly checking for a feature flag, abstract it under business-relevant methods, and use a specific method for each toggling decision, e.g., implement a FeatureDecisions class:

if (featureDecisions.useNewCalculationAlgorithm()) {
    doThis();
} else {
    doThat();
}
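The FeatureDecisions class itself might look like the following sketch. Only the class name and useNewCalculationAlgorithm come from the snippet above; the Predicate-based constructor and the mapping of this decision to the FT_1234 flag are assumptions for illustration.

```java
import java.util.function.Predicate;

// Translates raw flag names into business-level decisions, so that
// toggle points never mention flag names such as FT_1234 directly.
class FeatureDecisions {
    // Assumption: the raw lookup is supplied from outside, e.g. backed by
    // environment variables or a configuration class.
    private final Predicate<String> toggleIsOn;

    FeatureDecisions(Predicate<String> toggleIsOn) {
        this.toggleIsOn = toggleIsOn;
    }

    // One method per toggling decision; the flag name lives only here.
    boolean useNewCalculationAlgorithm() {
        return toggleIsOn.test("FT_1234");
    }
}
```

If the decision later depends on more than one flag, or the flag is renamed, only this method changes and the toggle points stay untouched.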

Feature Debt

As soon as the feature is tested and rolled out to all environments, including production, the feature toggle code becomes technical debt, which is often called feature debt. It may include:

  • old data structures kept around just in case, to make sure you can easily fix migration errors if they arise after production release;
  • relaxed constraints on database and APIs to simplify migration;
  • feature flags in the code;
  • any dead code that is no longer used after a feature is toggled on all environments;
  • new system tests that still have to be moved into the main test pipeline.

Conclusion

This is an opinionated view of the feature toggles related to my work. You may have different requirements, and some other approach might work for you. In this article, I’ve tried to present a holistic approach to reducing the feature branch lifetime and moving towards trunk-based development using a set of approaches and best practices. Hopefully, this will help you on your developer journey.

Forketyfork

Software developer @ JetBrains Space. I mostly write about Java and microservice architecture. Occasional rants on software development in general.