Overhead and Effectiveness

Getting the best from a team seems to be a common theme for me at the moment. I have written previously about the differences between efficiency and effectiveness. In that post I highlighted that striving for efficiency alone does not guarantee success.

Another aspect is overhead. Reducing overhead may make people feel more efficient but, as before, it doesn’t always make the team more effective. Likewise, moving from next to no overhead to some may make people feel less efficient, but it can sometimes make the group more effective.

The best way to describe this is through a couple of diagrams.

[Diagram: overhead 1 – one person performing the task, no overhead]

Here there is no overhead. There is one person performing a task. They are interpreting the input, doing some work in the best way they can and then compiling the output. From input to output the work takes a finite amount of time.

[Diagram: overhead 2 – three people on the same task, coordination shown in red]

When you ask three people to do the same task there must be coordination. This is overhead, represented in red in the diagram.

At first the group needs to understand the task and agree an approach. During the work they may have to check back with each other. This communication takes time. Sometimes an individual does not get an immediate response, so they have to wait: they are blocked. Over time the group works together, finally compiling the output. If you look closely, the same task took three people slightly longer than one.

Why?

There was more overhead and more waiting around, which makes three people less effective than one.
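The effect can be sketched with a toy model; the figures and the per-pair coordination cost below are illustrative assumptions, not measurements taken from the diagrams:

```python
def task_duration(work_hours, people, sync_hours_per_pair):
    """Toy model: work parallelises, but coordination cost grows with every pair."""
    pairs = people * (people - 1) / 2
    return work_hours / people + sync_hours_per_pair * pairs

# With these illustrative figures, three people take slightly longer than one.
solo = task_duration(6, 1, 1.5)  # 6.0 hours of work, no coordination
trio = task_duration(6, 3, 1.5)  # 6.5 hours: 2.0 of work plus 4.5 of overhead
```

The exact numbers are invented; the point is only that coordination cost can grow faster than the work shrinks.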

[Diagram: overhead 3 – the same overhead arranged in blocks, as in Scrum]

Agile frameworks such as Scrum accept that there needs to be some overhead to enable the team to work together. However, they arrange the overhead in blocks, which should enable the team to work more efficiently between them. The theory is that you accept some overhead to make the team more effective. You are trading off individual efficiency for the greater good.

The key is adapting the overhead to a particular situation. That means tuning the sprint length and deciding what you actually need to do as a group in your sprint planning. The other aspect is to measure, inspect and adapt your ways of working to see if it is possible to be even more effective.

What I have learned about building solutions with Visual Studio Online

I have used the Visual Studio Team Services build system a few times in the past, so I have a basic understanding of how to configure it. Recently I have been trying to get an AngularJS / .NET solution building on this system. Whilst doing this I discovered some things I didn’t know before.

The Hosted Build Agent can’t do everything

As an AngularJS solution there are a lot of JavaScript bits to build. I don’t need to do much but I do need to run

npm install

and

jspm install

on a web site and run a few gulp tasks, some of which leverage JSPM.
It didn’t take long to discover that whilst Node and NPM are available through the hosted agent, JSPM is not. I did spend some time seeing if I could do a global JSPM install at the start of each build, but it proved unreliable and the build time went through the roof.

Ideally I would have discovered this earlier. To save you hitting the same problem, take a look at this. About halfway down the page the software available to the hosted agent is listed.

Setting up your own Build Agent and integrating it with VSTS is easy

So what do you do when you want to use the hosted VSTS offering but can’t use the hosted build agent?

There are a couple of options, but they both start with your own Build Server. This can be on or off premises, but the natural progression from Software as a Service (SaaS) implementations such as Visual Studio Online is to use an Azure VM. I won’t go into the details of setting this up. Google will give you plenty of pointers.

Build times are key to a successful Continuous Integration feedback loop, so when you set up the VM you’ll face an interesting cost vs power decision. Go for the most powerful VM you can afford. You might be able to control costs by scheduling the server’s uptime to suit the needs of your project, such as powering it down overnight and at weekends. Remember you are charged for compute and for data transfer, so think about how often your builds will run, where your code repository is located and whether that will cause you to incur data transfer costs.

Once you have the VM up and running, setting up the build agent is quite straightforward. This article discusses the details but in essence you:

  • Download the agent zip file from the VSTS configuration portal
  • Unzip and run it on your Build Server
  • When you run the agent you are asked to configure it: point it at your Visual Studio Online account, give the agent a name and set it to run as a service

The final step is to configure the VM with all the software necessary to build the solution. Bear in mind that none of the software on the hosted build server will be available by default, so you will have to install and configure what you need.

[Screenshot: custom build agent]
This is what you’ll see when your agent is available to VSTS

Understanding how to get all your JavaScript bits to build takes context

Once you move out of the comfortable world of Visual Studio and MSBuild you find that most of your build tools require configuration, and any build task that needs them must be able to find them. And that means managing the PATH environment variable.

When I was looking at this I decided to try to make my build server as similar to the hosted ones as possible. This was based on the assumption that if I did this I would reduce any unexpected problems.

I used the script in this post to determine the environment variables as seen by the build server. By creating a simple one step build task that ran this script I could switch between hosted and my custom build agent to get the path as close as possible.
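The script in the linked post did the real work; purely as an illustration, a minimal equivalent (written here in Python) just prints every variable sorted by name so the two agents’ outputs can be diffed:

```python
import os

def dump_environment(env=None):
    """Return NAME=value lines sorted by name, ready for diffing."""
    env = dict(os.environ) if env is None else env
    return ["{0}={1}".format(name, env[name]) for name in sorted(env)]

if __name__ == "__main__":
    # Run this as a one-step build task on each agent, then diff the two outputs.
    print("\n".join(dump_environment()))
```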

There is one other gotcha worth mentioning. By default Node.js configures NPM to use the currently logged-in user’s roaming profile folder for storing downloaded packages. This is not appropriate for a build server, so you should customise the build server’s Node.js installation to use a different path. You also need to set up the security settings on the folder to enable the correct access. This Stack Overflow post discusses using

npm config --global set prefix "<new_npm_folder>"
npm config --global set cache "<new_npmcache_folder>"

That’s what worked for me.

When is a bug not a bug?

Testing is a very different skill set to software development. Testers look at the problems that software developers are solving through a different lens. They are analytical and are ruthless in finding the problems hidden within the most finely crafted software.

In traditional software delivery methodologies, testing follows development, just as night follows day. To development teams, testing is a hindrance, and to testers, software developers are people who cut corners and put the team under pressure to release sub-standard software. The two roles work against each other, which can make for a stressful working environment.

In Agile delivery methodologies, rather than fighting each other, the two roles work together. Testers help developers build code that is easier to test and developers show testers at first hand just how complex software development can be. The roles combine to produce high quality automated testing solutions.

As I said above, testers are analytical. They are driven to find problems, and when they find them they want to document what they have found. They describe how to reproduce the issue and its severity, based on a set of clear classifications, and then build up a log of all discovered issues so they can determine whether they have been fixed and whether they reoccur in the future. What they like to do is find and then document bugs.

In Agile delivery things are different. A user story is either done or it isn’t. Testers have a valuable role in determining whether a story is done. Done could mean that the story is released to Live, or more likely that it has been deployed to a pre-production environment, is working correctly and is waiting to be deployed into production by another team. When the story is started, the developer and tester define the acceptance criteria for the story. These are the criteria that determine whether the story is done. They can be turned into automated acceptance tests, but they could simply be a set of tasks that are checked manually. Development on the story continues until the acceptance criteria are met. If the tester finds that the criteria are not met, the developers undertake more tasks until they are.

Where is the documentation that ensures that issues found during development are not reintroduced?

These are the acceptance criteria for the story, and this is why it is useful to build them up and automate them over time. Once they are in the acceptance test suite, a successful run gives everyone confidence that no issues have been introduced. If it fails, you can usually pinpoint the problem by identifying which test has failed. So in my opinion an issue found by a tester in a story that is in progress is not a bug; it simply represents more work.
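As a sketch of what a captured acceptance criterion can look like once automated (the story, function and discount rule below are hypothetical, not from any project mentioned here):

```python
# Stub of the system under test for a hypothetical "discount code" story.
def apply_discount(total, code):
    return round(total * 0.9, 2) if code == "SAVE10" else total

# Acceptance criteria agreed by developer and tester when the story started:
def test_valid_code_gives_ten_percent_off():
    assert apply_discount(100.0, "SAVE10") == 90.0

def test_unknown_code_leaves_total_unchanged():
    assert apply_discount(100.0, "BOGUS") == 100.0
```

A green run of checks like these is the ongoing evidence that previously found issues have stayed fixed.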

Does this mean that there are never any bugs?

This approach greatly reduces problems getting into live and also helps you to avoid reintroducing problems. However that is not to say that your software is problem free. Customers may report problems or exploratory testing might find issues not covered by acceptance testing. Finally automated error logging may identify problems neither your customer nor your testers can see.

These are bugs, and they should enter your backlog in the same way as any other work. Once a bug is identified, it is up to the tester to provide a severity: from the product being completely broken and inaccessible, through the product not working as designed but with workarounds available, down to simple cosmetic issues. Severity should not be confused with priority. They can be related, but not always. For example, a typo is a low severity bug, but it is high priority in the backlog if the typo is in your best customer’s name!
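That severity and priority are independent is easy to model: they are separate fields on a bug record, and the backlog orders by priority alone. The record shape below is an illustration, not any real tracker’s schema:

```python
from dataclasses import dataclass

@dataclass
class Bug:
    title: str
    severity: int  # 1 = cosmetic ... 4 = completely broken; set by the tester
    priority: int  # backlog position; set by the Product Owner

bugs = [
    Bug("Checkout crashes under load", severity=4, priority=2),
    Bug("Best customer's name misspelled", severity=1, priority=1),
]

# The backlog is sorted by priority alone, so the cosmetic typo comes first.
backlog = sorted(bugs, key=lambda b: b.priority)
```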

In traditional software development, a long list of bugs was sometimes treated as a badge of honour by the testing team, demonstrating its effectiveness. Test exit reports with counts of bugs by severity become a bargaining chip between customer and supplier as to whether a product can be put live. In Agile software development, a low bug count is worn as a badge of honour by the whole team, indicating the quality of the software they are delivering.

Business Value is not just “Business” Value

Being a Product Owner on a large Agile project is hard. Not only are you trying to get a consensus from numerous business stakeholders, all with their own opinions and agendas, you have Scrum Masters asking for your undivided attention for release and sprint planning. Not only do you have to juggle the priority of all the business features that need delivering for particular business milestones, you are expected to understand the relative priority of technical improvements, tech debt and bugs. Most Product Owners understand the business better than the technology, so when they are asked to weigh “refactoring a data layer to deal with transient failures of a cloud database connection” against “building a support tool to replay failed messages” you can’t blame them if they struggle.

In many organisations, despite moves towards agile ways of working, there are still delivery pressures and still expectations that a fixed scope will be delivered in a fixed time. So when an agile project has such a focus on milestones, and there are feelings outside the project that those milestones are not being met, there is an understandable motivation for the PO to push the product forwards. The PO starts to think only in terms of features, and all the other quality criteria (does the feature perform, is it secure, can it be modified easily in the future, can it be supported) become hindrances. Inexperienced POs forget they need to represent the concerns of the whole business and not just the users of the product.

I have worked on projects where there was tension between the architects and the Product Owner. The Product Owner, supported by the Scrum Master, wanted to focus purely on business value in terms of features delivered to hit a business milestone. Emergent design was promoted over any type of non-functional planning. This resulted in a number of stories being created to fix security, operational and performance problems after the fact. These types of stories were viewed with suspicion by the Scrum Master and the PO, and so they dropped down the backlog.

Over time this friction and tension caused collateral damage. It became difficult to justify

  • Resolving technical debt
  • Implementing good integration patterns
  • Fixing technical bugs
  • Generally doing anything that reduced technical risk
  • Building tools to help detect data integrity problems

This work was always challenged because it was not seen as the highest business value. “Of course delivering feature X is more important than spending time building a tool to identify and fix data corruption.” The project’s preoccupation with milestones meant that it could not see beyond delivering new features. This work was seen as just something “the architects wanted” or “the development team were moaning about”.

Business value means more than new features.

Is Business Value really the rate at which you can deliver new features?

If technical debt mounts up, it generally becomes harder and harder to add new code to the solution and maintain existing code. Delivering new features slows down.

Those architectural waivers for non-compliance will expire at some point. Soon you will be forced to resolve them, because you’ll not be able to release anything else until you do. The project cannot release any new features until those problems are resolved.

A bad integration implementation means that integrating with additional systems becomes more difficult. The effort to maintain existing integration can grow exponentially. Again over time building new features gets harder.

The list of bugs goes nowhere while you focus on new features. Perhaps your extensive bugs are causing you to lose customers, and soon there will be no one left to use your new features.

Addressing all of these things in a timely manner means you can continue to deliver new features at a consistent rate. Not addressing them slows the rate at which the project can deliver new things. Therefore, when assessing the business value of these types of technical stories, perhaps you are looking at the wrong thing. Maybe you should look at their potential to reduce the rate at which new business value can be delivered if they are not done.
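One rough way to put a number on that potential is to estimate the value lost when a technical story is deferred and each subsequent sprint therefore delivers a little less. All figures here are illustrative assumptions, not data from any project:

```python
def value_lost(value_per_sprint, slowdown_per_sprint, sprints):
    """Total value lost over `sprints` if deferred tech work keeps slowing delivery."""
    lost = 0.0
    rate = value_per_sprint
    for _ in range(sprints):
        rate -= slowdown_per_sprint      # each sprint delivers a little less
        lost += value_per_sprint - rate  # shortfall versus the healthy rate
    return lost

# Losing 1 point of delivery per sprint costs 21 points over six sprints:
# more than two whole sprints of the healthy rate of 10.
loss = value_lost(10, 1, 6)
```

Crude as it is, a figure like this lets a technical story compete with features on the same terms.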

Dealing with incoming work

I recently had first-hand experience of my local hospital’s Accident and Emergency department. Luckily it was nothing serious. Just another case of a grown man who should know better.

My local hospital, like many in the UK, is very busy. People turn up at A&E with everything from relatively minor injuries like mine through to urgent and life-threatening conditions. Work comes into the department in various ways: someone simply turning up off the street, a referral from a GP or the 111 service, or an ambulance. The nurses, doctors and support staff in the department are finite resources, and the department is also under pressure to see all patients within 4 hours.

In order to deal with an unpredictable workload, the A&E department uses a triage process to quickly assess every new patient’s condition. Sometimes information is already known because the patient has been referred, or it comes from the paramedic crews. In other cases the patient has to answer a number of questions and may have to be examined. The result is a priority-ordered queue, where the most serious cases are at the top and the less urgent at the bottom. I was interested to note that the department dedicated resources to the triage process, ensuring work was prioritised as quickly as possible. I could also see that the team had a feed of inbound emergency cases with what I assumed was a brief description of each case. I guess this would give the department enough time to scale up its resources in the event of a serious emergency with multiple casualties.

A person walking up to the A&E reception is like new work coming into an Agile delivery team. Work needs to be captured, understood and prioritised quickly. It would be surprising if the new work coming into an Agile team was a constant flow, so a permanent triage process is usually unnecessary, but it is important to recognise that one is required and to establish a routine for assessing new work. The result of the triage process is an item in the project backlog, whose position represents its priority. In incremental Agile models that work could be a candidate for the next sprint. In a pure pull model it could be at the top of the backlog and the next item the team deals with once someone is free.
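The output of triage maps naturally onto a priority queue: newly assessed work is pushed with an urgency value and always taken in priority order, never browsed. A minimal sketch using Python’s heapq (the cases and urgency numbers are invented):

```python
import heapq

def triage(queue, priority, item):
    """Push a newly assessed item; a lower number means more urgent."""
    heapq.heappush(queue, (priority, item))

def next_case(queue):
    """Take the most urgent item, never a hand-picked one."""
    return heapq.heappop(queue)[1]

queue = []
triage(queue, 4, "minor sprain")
triage(queue, 1, "chest pain")
triage(queue, 2, "deep cut")
first = next_case(queue)  # the most urgent case, "chest pain"
```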

In an incremental model the underlying assumption is that you can always wait for the next sprint to deal with the new work. Adding new work to the current sprint is the exception rather than the rule, because the sprint boundary is intended to provide a calm and tranquil space for the team; changing priority mid-sprint can be disruptive and reduces team effectiveness. Pull models deal with this better because you only have to wait until someone in the team is free. The time to wait is a function of the average time to complete work items, so it follows that the smaller the work items, the lower the wait time.

But what about the serious emergency cases where time is of the essence? It may be the case that waiting for a member of the team to be free is too long. In IT this might be represented by a critical live issue reported by your best customer. Here you might not be able to wait for two, three or four weeks for a sprint to end. Going back to my observations at the hospital it was clear the department was split into two. Major and minor cases were physically separated. I can only speculate at this point but I am guessing that there are two resource pools dealing with major and minor cases separately. This mirrors the way that some organisations have dedicated teams that deal with support fixes and others that deal with new features.

When setting up a single team you have to choose a methodology based on the types of changes that are coming through. If you are building a new product and you have limited customers, there is a small window for major issues to occur, so an incremental model could be a good fit: major issues are rare and are treated as exceptions. However, if support issues are the bulk of the team’s work, then a pure pull model may be better. As your product develops you may change from an incremental model to a pure pull model, or you may set up different teams working in different ways.

The final thing I want to highlight is the way that work is allocated. It is obvious that in the A&E situation the team must be able to deal with any case. It would be unacceptable if a serious case was not looked at simply because the next available team member couldn’t work that case. In some situations the team might have to self-organise to ensure the right team member is working the right cases, but that re-organisation is driven by priority order, not personal preference. A team member coming free doesn’t browse the backlog looking for a case they’d like to do; they do the next one. And when in the middle of a procedure with a patient, a doctor doesn’t stop what they are doing to start working on something else. They keep work in progress minimised and finish the procedure they started. This is fundamental to ensuring work flows and the team is effective.