Productivity Plow, Clear a Path to Success
Many organizations build out teams around 'Engineering Productivity.' Often those teams are repurposed from a QA organization after QA was folded into engineering. Those efforts aren't generally successful. In the end, they don't produce any impact on engineering productivity, because the organization's structure and management directly drive developer productivity. A group off to the side can produce tools that help with the effort to improve productivity, but it can't improve productivity itself.
You must address the engineering productivity problem directly. You can't spin up a separate group, say 'attack that problem,' and call it solved. To solve it, you have to attack the root cause of the problem with organizational changes. The root causes of productivity problems are Waiting, Rewriting, and a Lack of Focus. To solve those problems, you must assume that your engineers are skilled, mature professionals and build systems that help those skilled professionals avoid those pitfalls. Those systems need to remove artificial waits. They need to provide information so that the engineer knows what's available and what's not, and they need to help the team focus as much as possible.
Waiting
Organizations inadvertently bake little wait states into every activity. Each of those states saps productivity and motivation. When something outside of your control forces you to wait, you eventually lose interest in the thing you are trying to accomplish. Ultimately, your best engineers leave because they can't accomplish anything. What talented person with lots of opportunities wants to spend their time hobbled by things outside of their control? Fortunately, we can identify the common wait states, and there is a standard set of tools that you can implement to eliminate many of them.
Waiting on Other Team Members
Wait states built into teams often come from skill silos. When a front-end engineer needs something from a backend engineer, or an engineer needs a QA person to sign off on something, waits happen. Each of those people represents a skill silo. These skill silos cause queues of work to form between them. Those queues introduce latency. Remove those silos relentlessly. If an engineer needs to test something to complete a story, the next available engineer tests it. If a front-end engineer needs a change to a backend API and they can make that change, they should. In hiring, you should prefer generalists over specialists. You should encourage and support people in branching out into other areas that broaden their skills and reduce silos. Give them the time and reward them for it. Shift the team's mindset from 'my job is to write code' or 'my job is to test' to 'my job is to deliver the product.' Over time, your engineers will stop focusing on getting the work into the next queue and start focusing on getting the work done.
Another insidious source of latency is when an engineer becomes the owner of a part of the codebase. Inevitably, the team starts routing changes through that person. They become a bottleneck for getting anything done. When you hear 'We can't deliver that because Joe is on vacation' or 'We won't be able to deliver that for two weeks because Bhargavi is working on something else,' you know you have this problem. That person does know the codebase, and they can do work in that part of the codebase faster than anyone else. So it always feels quicker for that engineer to work on that area of the codebase. Unfortunately, it kills you in the end because it introduces a bottleneck and a new queue. Make sure that no single engineer owns any single part of the codebase. When work on a particular area comes up, other engineers on the team need to pick it up. In my experience, you have to push that change and foster the team's ownership mindset. Your manager is the driver here. They should drive by managing the flow of work such that engineers pick up work in unfamiliar areas of the codebase. On-call rotations can also help a lot with this.
Waiting on Other Teams
No matter how good you are, how great your teams are, or how minimal those dependencies are, hard dependencies between teams kill productivity. Teams are never fully aligned; they always have competing priorities. You are deluding yourself if you believe otherwise. Dig in, look at the data, and recognize the reality. The only fix is to remove those hard dependencies. Attack them as if they were spreaders of the plague and your survival depended on it because, in many ways, it does. Create hard boundaries between teams. Ensure your teams own full vertical slices from the infrastructure up through the UI. Where you can't do that, create well-defined, well-managed APIs with a robust schema between teams.
Do not create teams whose whole focus is on a horizontal slice of the system. Great examples of this are DevOps or QA. Where those teams exist, they should provide consultancy services and self-service tooling, rather than owning a piece of the process itself. In short, ensure that teams get out of each other's way.
Waiting On The Process
Finally, don't create a process that causes wait states. Your teams are struggling and not producing quality software? A blocking design review committee is not the answer. Are bugs being released into production and not being caught before they go out? A release approval committee is not the answer. You must endeavor to avoid building waits into your processes. Design processes that achieve the goal without blocking. For example, if you want better outcomes in design reviews, attach one of your Principal Engineers to the engineer doing the design and have that Principal Engineer work alongside them throughout the design work. That gives you better outcomes and an explicit opportunity for your Principals to mentor the more junior engineers, and it doesn't introduce any blocks. The core criterion for designing any process is that it must not introduce cross-team waits.
Rewriting
Engineers reinvent the wheel all the time. There is nothing quite as demoralizing as coming into a new organization and learning that there are multiple versions of the same product. There is nothing as time-wasting as building the same feature again for each version of the product. Equally insidious is rewriting functionality and services because the team didn't know they existed. Reinventing the wheel happens for two reasons: either the engineer doesn't trust the existing solution and wants to code their own (Not Invented Here), or the engineer isn't aware of the existing solution. There is a subset of this problem where the solution is known but not available in the platform/language/systems that the engineer uses.
To solve this, figure out a better way to share information. The thing that works tends to be different from organization to organization. The solution tends to be something that has grown organically within that organization. Backstage is an excellent example of this. It's a tool that grew up within Spotify. At one organization I was a part of, repo-level documentation was written in AsciiDoc and published automatically into a Confluence wiki using the Confluence Publisher. It worked rather well.
What is going to work for you will depend a lot on your culture. The path forward is to give it thought, decide on a solution, and drive it. One note: simply having a wiki doesn't cut it. The information needs to be tied tightly to the code, or it's going to fall into uselessness quickly.
Lack of Focus
A Lack of Focus ensures that you work a lot but produce very little. 80% of ten things is less useful than 100% of one thing. Lack of focus has two sources. The first is that the team isn't organizing its work well. The second is that the team is not managing its operational work in a way that lets it focus on the non-operational work. The solutions to these two problems are closely intertwined.
Organize The Work
I wrote briefly about this in 'Get Your Teams To Estimate Well,' and I will be writing more about it shortly. You must manage the team's work. The team has to know what the priorities are and focus on delivering those priorities. You need to make good use of whatever tools you have. Can your team see what the current goals are? Not the current story, the current goals. Can your engineers see what piece of work they should pick up next? Can your team see how they are doing against timelines? If the answer to any of these questions is no, then you have some work to do.
Finally, can you see what the team is doing? Can you focus the team on one major thing at a time? Can you shepherd individuals to other areas of the codebase? If the answer to these questions is no, you have work to do. How you do this work depends on the tooling and the team. I will walk you through how to do this in Jira with Advanced Roadmaps in a future article.
Remove Distraction
Your team is probably getting killed by operational load. It's not apparent, but each team member is probably talking about something operational they did for some internal customer every day at standup. You have to kill that. Those interruptions destroy focus. You must address your operational load intentionally, in a way that does not allow it to impact engineers working on committed work. You need an on-call rotation.
An on-call rotation means dedicating an engineer to handling operational load. The specific engineer changes on some cadence, usually every week to two weeks. This engineer works exclusively on operational work and does not pick up commitments. If there is no incoming operational work, then they work on fixing bugs. Resist the urge to give them committed work. It's going to screw you in the long run. Also, resist the urge to hire an engineer to focus on on-call so the other engineers can work. On-call is an excellent opportunity to spread knowledge. Exposing engineers to the problems in the system they work on helps them make better day-to-day decisions.
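As a rough illustration of the cadence described above, here is a minimal Python sketch that builds a round-robin schedule. The function name, the engineer names, the start date, and the one-week default shift are all assumptions made for the example. In practice your paging or scheduling tool would own this.

```python
from datetime import date, timedelta
from itertools import cycle, islice


def on_call_schedule(engineers, start, shifts, shift_days=7):
    """Return (shift_start, engineer) pairs, cycling through the team in order.

    Hypothetical helper: one dedicated engineer per shift, one week per shift
    by default.
    """
    if not engineers:
        raise ValueError("need at least one engineer")
    rotation = islice(cycle(engineers), shifts)
    return [
        (start + timedelta(days=i * shift_days), name)
        for i, name in enumerate(rotation)
    ]


# Hypothetical three-person team, weekly shifts starting on an arbitrary Monday.
schedule = on_call_schedule(["Ana", "Ben", "Chi"], date(2024, 1, 1), shifts=4)
for shift_start, engineer in schedule:
    print(shift_start.isoformat(), engineer)
```

Because the rotation cycles through the whole team, every engineer eventually takes a turn, which is what spreads the knowledge.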
This on-call approach also allows you to manage the bandwidth you are exposing to operational load. If you have a five-person team and dedicate one person to on-call, your spend is 20%. If that person is not enough to handle the on-call load, then add another one. Now you know your operational spend is 40% (at this point, you should be thinking about how to drive it down). Dedicating a person to on-call makes that operational work manageable. Spreading it throughout the team does not.
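The arithmetic above is simple enough to sketch in a few lines of Python. The function name is made up for the example, and it assumes each on-call engineer is dedicated full time to operational work.

```python
def operational_spend(team_size: int, on_call_engineers: int) -> float:
    """Fraction of team capacity dedicated to operational work.

    Assumes each on-call engineer spends all of their time on operational load.
    """
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    if not 0 <= on_call_engineers <= team_size:
        raise ValueError("on_call_engineers must be between 0 and team_size")
    return on_call_engineers / team_size


# The five-person team from the text, with one and then two people on call.
for on_call in (1, 2):
    print(f"{on_call} of 5 on call: {operational_spend(5, on_call):.0%}")
# prints "1 of 5 on call: 20%" and "2 of 5 on call: 40%"
```

The point is that the number is visible and under your control. When the load is spread across the whole team, there is no equivalent figure to compute.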
Conclusion
Increasing productivity is a journey, not a destination. It's not going to come from dedicating a team to Productivity Engineering, though that might indeed help. It's going to come from changing the way your organization operates.