A pleasant walk through computing

Comment for me? Send an email. I might even update the post!

Agile Development - Don't Restrict. Trust

Lots of organizations try to prevent developers from making mistakes, or catch them doing so, leading to more bureaucracy and over-control. Some examples include:

  • Requiring all teams to follow the exact same process.
  • Using permissions to keep developers from rebasing code with Git.
  • Managers external to the team running compliance reports on whether user stories are filled in "correctly."
  • Using the number of development issues/bugs as a metric to penalize developers.

Let's look at that last one more closely. One of the DORA group's four key metrics, change failure rate, captures how often releases fail. One way to think of that is "how many severe bugs are we getting after release?"

It's tempting, then, to also track how many issues/bugs are found during development, the notion being that reducing bugs during development naturally reduces them in release. But this is wrongheaded. There could be value in looking at issues found during development, but it's wrong to use this to evaluate, praise or punish developers. Why?

  • Finding more issues during development is neither good nor bad in itself.
  • Context is missing if you're only looking at numbers.
  • Whether and how development issues are reported is highly person-dependent.

In short, you can't tell whether a developer is doing a good job based on development issues. That isn't why you'd capture development issue metrics. You would capture them as one measure of the effect of changes to how the team develops. For example, a team commits to increasing unit test quality. What effect did that have on the number of issues found?
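As a rough illustration of that kind of comparison, here is a minimal PowerShell sketch. The sprint counts are invented for the example, and they are totals for the whole team, not per developer.

```powershell
# Made-up, team-level counts of issues found during development, one number per sprint.
$before = 14, 12, 13   # sprints before the unit-test push
$after  = 9, 10, 8     # sprints after it

# Average issues per sprint in each period
$avgBefore = ($before | Measure-Object -Average).Average
$avgAfter  = ($after  | Measure-Object -Average).Average

# Negative means fewer issues were found per sprint after the change
$change = $avgAfter - $avgBefore

"Average issues per sprint: {0} before, {1} after (change: {2})" -f $avgBefore, $avgAfter, $change
```

The number is only a prompt for a conversation. A drop could mean better tests, or it could mean issues are being reported differently, which is why it needs the team's context to interpret.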

What are you even tracking? Is it the number and severity of issues found by QA? That might be OK. But how do you know that Billy has the same standard for "severe" as Claudia? Maybe Rahim breaks down an issue into more bug reports than Luciana.

This is when bureaucratic management tries to get everyone working the same way with endless meetings on "what does severe mean?" and "what's the right way to report issues?" Some of this matters, sure. But thinking this way more often misses the point. Its focus is on treating people as replaceable resources, as machines.

The coercive style of management has these qualities:

Restrict. Prevent. Catch. Punish.

Those qualities can be summed up as "Developers, we don't trust you."

What if, instead, we use a reliable metric--change failure rate--to guide improvement? Further, what if we let the teams figure out how to make that improvement, allow for mistakes, and measure what matters? We'd end up with a trusted team.
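For illustration, change failure rate is the share of production deployments that led to a failure. Below is a minimal PowerShell sketch, assuming a hypothetical list of deployment records with a `CausedFailure` flag. The record shape and the `Get-ChangeFailureRate` name are made up for this example, and your release tooling will have its own data.

```powershell
# Hypothetical deployment records; in practice these would come from your release tooling.
$deployments = @(
    [pscustomobject]@{ Id = 1; CausedFailure = $false }
    [pscustomobject]@{ Id = 2; CausedFailure = $true }
    [pscustomobject]@{ Id = 3; CausedFailure = $false }
    [pscustomobject]@{ Id = 4; CausedFailure = $false }
)

function Get-ChangeFailureRate {
    param([object[]]$Deployments)

    # With no deployments there is nothing to measure.
    if ($null -eq $Deployments -or $Deployments.Count -eq 0) { return $null }

    # Count the deployments flagged as having caused a failure.
    $failed = @($Deployments | Where-Object { $_.CausedFailure }).Count

    # Fraction between 0 and 1
    return $failed / $Deployments.Count
}

$rate = Get-ChangeFailureRate -Deployments $deployments
"Change failure rate: {0}" -f $rate
```

Note that the input is deployments for the team as a whole. Nothing in it identifies an individual developer, which is the point.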

In a trusted team environment, you:

  • Give your developers more permissions.
  • Keep audit trails so that, if something goes wrong, you know who did what.
  • Create a robust system that can be recovered quickly.
  • Notify developers of problems automatically and immediately.
  • Help the team understand what caused issues and how to improve.

Again, let's dig into that last one and contrast it with an untrusted team.

In the untrusted team, if a severe bug is discovered, several assumptions are made.

  • One person is at fault. The team isn't treated as a unit.
  • The developer made a mistake. There's someone to blame.
  • The bug could have been prevented.
  • There needs to be a meeting to put procedures in place to keep it from happening again.
  • Non-team management is needed to enforce the new process.

In a trusted team,

  • The team takes responsibility, knowing multiple people work on the code.
  • The developer may have made a mistake. But there's no blaming because mistakes are part of development.
  • Not everything is in our control. Software's very complex. The developer may have made the right decision, even though there was a poor outcome. You can't prevent all bugs.
  • Adding process adds complexity to an already complex business. Sometimes process needs improvement. But often there's nothing to fix.
  • As soon as external management insists on compliance reports, the team ceases to be self-managing. Quality will go down, because the focus shifts from building software to pleasing management.

In the future, when something goes wrong:

  1. Determine and explain the impact on the overall business.
  2. Let the team sort things out.
  3. Ask them to inform management what, if anything, will change.
  4. Trust their answer.

Build trust, not scapegoats.

2021 DORA Explorer - My Highlights from the State of DevOps report

In my opinion, the DORA group's work is among the most important for those of us in the software business to understand and use. It's the best resource I know for software delivery guidance that's based on evidence, not hearsay and personal opinion.

I recommend downloading and reading the complete PDF from the web site.

2021 Accelerate State of DevOps report addresses burnout, team performance

Important
ALL excerpts below are directly quoted via copy/paste from the linked 2021 DevOps report. Extra emphases are mine. They're what stood out to me, and they may not be what stands out to you!

The 5 Metrics

With seven years of data collection and research, we have developed and validated four metrics that measure software delivery performance. Since 2018, we’ve included a fifth metric to capture operational capabilities.

Note that these metrics focus on system-level outcomes, which helps avoid the common pitfalls of software metrics, such as pitting functions against each other and making local optimizations at the cost of overall outcomes.

Cloud

Respondents who use hybrid or multi-cloud were 1.6 times more likely to exceed their organizational performance targets than those who did not.

Unsurprisingly, respondents who have adopted multiple cloud providers were 1.5 times more likely to meet or exceed their reliability targets.

For the third time, we find that what really matters is how teams implement their cloud services, not just that they are using cloud technologies. Elite performers were 3.5 times more likely to have met all essential NIST cloud characteristics.

  1. On-demand self-service: Consumers can provision computing resources as needed, automatically, without any human interaction required on the part of the provider.
  2. Broad network access: Capabilities are widely available and can be accessed through multiple clients such as mobile phones, tablets, laptops, and workstations.
  3. Resource pooling: Provider resources are pooled in a multi-tenant model, with physical and virtual resources dynamically assigned and reassigned on-demand. The customer generally has no direct control over the exact location of the provided resources, but can specify location at a higher level of abstraction, such as country, state, or data center.
  4. Rapid elasticity: Capabilities can be elastically provisioned and released to rapidly scale outward or inward with demand. Consumer capabilities available for provisioning appear to be unlimited and can be appropriated in any quantity at any time.
  5. Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at a level of abstraction appropriate to the type of service, such as storage, processing, bandwidth, and active user accounts. Resource usage can be monitored, controlled, and reported for transparency.

SRE and DevOps

While the DevOps community was emerging at public conferences and conversations, a like-minded movement was forming inside Google: site reliability engineering (SRE). . . . SRE is a learning discipline that prioritizes cross-functional communication and psychological safety, the same values that are at the core of the performance-oriented generative culture typical of elite DevOps teams.

In analyzing the results, we found evidence that teams who excel at these modern operational practices are 1.4 times more likely to report greater SDO performance, and 1.8 times more likely to report better business outcomes.

Typically, individuals with a heavy load of operations tasks are prone to burnout, but SRE has a positive effect. We found that the more a team employs SRE practices, the less likely its members are to experience burnout.

Documentation

This year, we looked at the quality of internal documentation, which is documentation–such as manuals, READMEs, and even code comments–for the services and applications that a team works on. We measured documentation quality by the degree to which the documentation:

  • helps readers accomplish their goals
  • is accurate, up-to-date, and comprehensive
  • is findable, well organized, and clear

We found that about 25% of respondents have good quality documentation, and the impact of this documentation work is clear: teams with higher quality documentation are 2.4 times more likely to see better software delivery and operational (SDO) performance.

Security

[Shift left] and integrate throughout

As technology teams continue to accelerate and evolve, so do the quantity and sophistication of security threats. In 2020, more than 22 billion records of confidential personal information or business data were exposed, according to Tenable’s 2020 Threat Landscape Retrospective Report. Security can’t be an afterthought or the final step before delivery, it must be integrated throughout the software development process.

Consistent with previous reports, we found that elite performers excel at implementing security practices. This year, elite performers who met or exceeded their reliability targets were twice as likely to have security integrated in their software development process.

Technical DevOps capabilities

Our research shows that organizations who undergo a DevOps transformation by adopting continuous delivery are more likely to have processes that are high quality, low-risk, and cost-effective.

Specifically, we measured the following technical practices:

  • Loosely coupled architecture
  • Trunk-based development
  • Continuous testing
  • Continuous integration
  • Use of open source technologies
  • Monitoring and observability practices
  • Management of database changes
  • Deployment automation

We found that while all of these practices improve continuous delivery, loosely coupled architecture and continuous testing have the greatest impact.

Elite performers who meet their reliability targets are 5.8 times more likely to leverage continuous integration. In continuous integration, each commit triggers a build of the software and runs a series of automated tests that provide feedback in a few minutes. With continuous integration, you decrease the manual and often complex coordination needed for a successful integration.

COVID-19

What reduced burnout?

Despite this, we did find a factor that had a large effect on whether or not a team struggled with burnout as a result of working remotely: culture. Teams with a generative team culture, composed of people who felt included and like they belonged on their team, were half as likely to experience burnout during the pandemic. This finding reinforces the importance of prioritizing team and culture. Teams that do better are equipped to weather more challenging periods that put pressure on both the team as well as on individuals.

Culture

Broadly speaking, culture is the inescapable interpersonal undercurrent of every organization. It is anything that influences how employees think, feel, and behave towards the organization and one another. All organizations have their own unique culture, and our findings consistently show that culture is one of the top drivers of organizational and IT performance. Specifically, our analyses indicate that a generative culture–measured using the Westrum organizational culture typology, and people’s sense of belonging and inclusion within the organization– predicts higher software delivery and operational (SDO) performance. For example, we find that elite performers that meet their reliability targets are 2.9 times more likely to have a generative team culture than their low-performing counterparts.

Our results indicate that performance-oriented organizations that value belonging and inclusion are more likely to have lower levels of employee burnout compared to organizations with less positive organizational cultures.

Given the evidence showing how psycho-social factors affect SDO performance and levels of burnout among employees, we recommend that if you’re seeking to go through a successful DevOps transformation, you invest in addressing culture-related issues as part of your transformation efforts.

Azure DevOps Locally-Hosted Build Agent With Global NPM/.NET Tools

BLUF

Terse examples of installing NPM global packages and .NET global tools for use by locally-hosted Azure DevOps build agents. This removes the need to install them as part of the pipeline YAML, and reduces the chance of contention when multiple agents run on the same machine.

Basically, install to a folder that's on the PATH.

.NET Tools

The instructions below assume an E: drive. Alter to fit your server.

Once, as Administrator

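# Prep global tools directory
$dotnetTools = "E:\dotnet-tools"
# -Force keeps this from failing if the directory already exists
New-Item $dotnetTools -ItemType Directory -Force
$path = [Environment]::GetEnvironmentVariable('PATH', 'Machine')
# Append only if the directory isn't already on the machine PATH
if (($path -split ';') -notcontains $dotnetTools) {
    $newpath = $path + ";$dotnetTools"
    [Environment]::SetEnvironmentVariable("PATH", $newpath, 'Machine')
}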

To view what's already installed.

$dotnetTools = "E:\dotnet-tools"
dotnet tool list --tool-path $dotnetTools

To install.

$dotnetTools = "E:\dotnet-tools"
dotnet tool install dotnet-ef --version 3.1.5 --tool-path $dotnetTools

NPM

Once, as Administrator
Before doing these steps, confirm the directory isn't already on the machine PATH, and find out where NPM global packages, if any, are already installed. The folder below is the default Node.js install location.
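One way to confirm whether the directory is already on the machine PATH is sketched below. `Test-PathContains` is a hypothetical helper written for this post, not a built-in cmdlet. It takes the PATH value as a string and ignores case and trailing backslashes when comparing.

```powershell
function Test-PathContains {
    param(
        [string]$PathValue,   # a PATH-style string, entries separated by ';'
        [string]$Directory    # the directory to look for
    )

    $wanted = $Directory.Trim().TrimEnd('\')

    # Split into entries, drop empties, and normalize trailing backslashes.
    $entries = $PathValue -split ';' |
        Where-Object { $_ } |
        ForEach-Object { $_.Trim().TrimEnd('\') }

    # -contains is case-insensitive for strings
    return $entries -contains $wanted
}

$machinePath = [Environment]::GetEnvironmentVariable('PATH', 'Machine')
Test-PathContains -PathValue $machinePath -Directory "C:\Program Files\nodejs"
```

If this prints True, the PATH step below can be skipped.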

# Prep global packages directory
# Verify Node.js is installed here:
$nodePath = "C:\Program Files\nodejs"
$path = [Environment]::GetEnvironmentVariable('PATH', 'Machine')
# Append only if the folder isn't already on the machine PATH
if (($path -split ';') -notcontains $nodePath) {
    $newpath = $path + ";$nodePath"
    [Environment]::SetEnvironmentVariable("PATH", $newpath, 'Machine')
}

To view what's already installed.

$nodePath = "C:\Program Files\nodejs"
# Show the current global prefix, then point it at the shared folder
npm prefix --global
npm config set prefix $nodePath
npm list --global --depth=0

To install.

$nodePath = "C:\Program Files\nodejs"
# Show the current global prefix, then point it at the shared folder
npm prefix --global
npm config set prefix $nodePath
npm install --global vsts-npm-auth
npm install --global azure-functions-core-tools@3 --unsafe-perm true
npm install --global aurelia-cli@1.3.1
npm install --global @angular/cli@12.0.1
npm install --global nswag