
Tuesday, September 03, 2019

Agile Is Not the Answer to All Your Failed Software Development Projects

Agile means many things to many people. I believe that the simple high-level Agile Manifesto is close to the way most engineers think about software development. Here's a quick rundown:

> Individuals and interactions over processes and tools

> Working software over comprehensive documentation

> Customer collaboration over contract negotiation

> Responding to change over following a plan

However, once you get past these high-level ideas and into the details, the agreement starts to fade. Agile has some good ideas, but it also has problematic elements that are too centered on short-term thinking to work for large and complex engineering projects, the kind done not only at huge companies like Google, Amazon and Microsoft, but also in non-tech industries. Examples include building a core banking system, a high-frequency trading platform, a drug discovery system, or ERP software.

Without getting buried in details, let’s look at some of the principles behind the Agile Manifesto.

> Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.

> The best architectures, requirements, and designs emerge from self-organizing teams.

> At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

> Continuous attention to technical excellence and good design enhances agility.

> Simplicity — the art of maximizing the amount of work not done — is essential.

These principles are almost common sense for smart engineers, and have been known and applied for years. NASA of the sixties, I am looking at you now.

However, other parts of the principles are not part of a healthy development culture for large and complex projects. These are the parts that have led to the short-term-focused Scrum process. They seem suited to particular types of development, most notably consulting or contract programming, where the customer is external to the organization, runs the show because they are paying for development, and can change their mind at any time:

> Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

> Business people and developers must work together daily throughout the project. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

> Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.

> Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

This style of short-term planning, direct customer contact, and continuous iteration is well suited to software with a simple core and lots of customer-visible features that are incrementally useful.

It is not so well suited to software which has a very simple interface and tons of hidden internal complexity, software which isn’t useful until it’s fairly complete, or leapfrog solutions the customer can’t imagine.

Companies like Google, Microsoft and Amazon write revolutionary software that has never been written before, and that doesn’t work until its complex subcomponents are built.

This type of innovation takes significant up-front design time, and requires working on components over longer periods than one-week iterations. Because the projects have such simple external interfaces and so much internal complexity, much of the work is not even visible to “customers,” so there is no way to write customer-visible stories about it. With this type of software it takes 9–24 months to deliver the first working version to the customer.

Projects like this are the anti-Scrum. They represent extremely long-term thinking on the part of the technical leaders. Instead of working on something that would meet a small need this week, those leaders laid a foundation for a fundamental shift in the way cluster software was developed. That investment has not only reaped incredible rewards at Google, Microsoft and Amazon, but has influenced the entire industry.

Other industries have similar analogs. From tax-accounting software to computer games, some software is not suited to give to end customers when partially finished.

If I were asked to rewrite the above Agile principles to be more in line with large and complex software development, they might look something like this:

> Our highest priority is to increase customer (and programmer) productivity and access to information. Work on the biggest, most frequently used problems you can find, and create the largest net impact. Don’t give the customer what they ask for; understand them, and revolutionize their world.

> Developers should create a project charter (a fairly minimal but structured design doc), detailing the project, outlining what goals it hopes to achieve, and explaining why it can’t be done in other ways. This document should be circulated among stakeholders to get early feedback before the project gets underway. The written record is essential, as it ensures there is a clear and agreed understanding of when the project is a success and how it aims to get there.

> At all phases of the project, critical design elements for larger components should be concisely explained and captured in a design document.

> Innovate in leapfrogs. It’s more important to finish and deploy a leapfrog than to attempt perfection. There is no perfection. Instead, be flexible and plan to constantly reinvent at every level of the stack.

> Deliver working software as soon as is reasonably possible, and no sooner. “Dogfood” projects internally before they are shipped externally. Make sure products meet high quality standards before shipping. The quality of the product is more important than the time it takes to achieve it.

While the high-level Agile Manifesto is flexible enough to work with these principles, they are very different from the short-iteration, low-documentation Scrum/SAFe/LeSS/Nexus/DAD processes that have become synonymous with the word "Agile."

The first priority should be understanding what you are actually doing, the people you are working with, the things you are working with to get the work done, and why you are doing it; not choosing a framework or process first.

The biggest problem I see is that people have made “Agile” the goal, worrying about “doing Agile right” before actually understanding the problem they are trying to solve and/or the opportunity they are trying to explore. They take it as a given that Agile is the right approach for the project, before the project is even defined.

In a nutshell: We should all start with an understanding of where we want to go before we figure out how we want to get there … not the reverse.


Tuesday, May 01, 2018

How to Review Your Team’s Software Development Practices

An important part of the project reviews I do is a software development practices review. Notice I do not say "best practices". The term “best practices” is without meaning in many contexts. At best, it’s vague, subjective, and highly dependent on context.

Wikipedia defines best practices as:

“A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things.”


In other words, a “best practice” is a practice that has somehow been empirically proven to be the best. Although there is quite a bit of research on software development practices, I think it is impossible to use that kind of research to define the software development practices that would be the “best” for all projects.

What I am looking for in my software development practices review is the application of “standard practices” or “accepted practices”. A good example of this is surgeons washing their hands prior to surgery. It is such a widely accepted and standardized practice that it would be scandalous for a surgeon not to follow it. Washing hands has been empirically demonstrated to produce better patient outcomes than not washing hands. Is the washing of hands the absolute “best” thing in the universe that a surgeon could do prior to surgery? Well, this philosophical question is not especially important. What matters is that hand-washing is beneficial and widely accepted.

The Review

For the review, sit in a room with the development team and go through the list of practices in the second part of this article. Ask the team about each practice and let them tell their stories. You might learn a lot of other things too! After talking about a practice, the team should agree on one of the following answers:

1) We do not do this
2) We do not need this
3) We do this, but not enough/consistently
4) We do this, but we do not see the expected benefits
5) We do this and see the expected benefits

After agreeing on an answer, everybody in the team should give input on why this is the case. You could use the 5 whys technique when you feel that it helps you get important information.

Of course there are exceptions based on specific context, and of course there are wildly varying degrees of maturity for each practice. But in general, the more times the team tells you “No, we do not do this”, the more room for improvement there is. That is the positive formulation; put negatively, the more “We do not do this” answers you count, the more issues with the software and your project you can expect.

The moment you should listen very carefully is when the team says “No, we do not NEED this”. Here you will learn about your project-specific challenges and environment AND you will learn about the mindset of your team.

Answers 3 and 4 indicate possible room for improvement in the implementation of a practice, and with that in the software development process as a whole.

You can combine this review with the delivery team review. The answers to the software development practices review will give you valuable information on team dynamics, mindset, individual skills, and individual knowledge, as well as the skills and knowledge of the team as a whole.

The Practices

Since the creation of the first programming languages in the 1950s, widespread agreement has emerged in the software development community on which software engineering practices help create better software. In the rest of this article I will explain the practices that are, in my opinion, the most effective.

1. Separate Development and Deployment Environments

In order to develop and deploy software effectively, you need a number of different environments. This practice seems so logical, yet I see time and again that essential environments are not available. Let’s start with the first environment.

1) Development: The development environment is the environment in which changes to software are developed, most simply an individual developer's workstation. This differs from the ultimate target environment in various ways – the target may not be a desktop computer (it may be a smartphone, embedded system, headless machine in a data center, etc.), and even if otherwise similar, the developer's environment will include development tools like a compiler, integrated development environment, different or additional versions of libraries and support software, etc., which are not present in a user's environment.

2) Integration: In the context of version control, particularly with multiple developers, finer distinctions are drawn: a developer has a working copy of source code on their machine, and changes are submitted to the repository, being committed either to the trunk or a branch, depending on development methodology. The environment on an individual workstation, in which changes are worked on and tried out, may be referred to as the local environment, sandbox or development environment. Building the repository's copy of the source code in a clean environment is a separate step, part of integration (integrating disparate changes), and this environment is usually called the integration environment; in continuous integration this is done frequently, as often as for every version. The source code level concept of "committing a change to the repository", followed by building the trunk or branch, corresponds to pushing to release from local (individual developer's environment) to integration (clean build); a bad release at this step means a change broke the build, and rolling back the release corresponds to either rolling back all changes from that point onward, or undoing just the breaking change, if possible.

3) Test: The purpose of the test environment is to allow human testers to exercise new and changed code via either automated checks or non-automated techniques. After the developer accepts the new code and configurations through unit testing in the development environment, the items are moved to one or more test environments. Upon test failure, the test environment can remove the faulty code from the test platforms, contact the responsible developer, and provide detailed test and result logs. If all tests pass, the test environment or a continuous integration framework controlling the tests can automatically promote the code to the next deployment environment.

Different types of testing suggest different types of test environments, some or all of which may be virtualized to allow rapid, parallel testing to take place. For example, automated user interface tests may occur across several virtual operating systems and displays (real or virtual). Performance tests may require a normalized physical baseline hardware configuration, so that performance test results can be compared over time. Availability or durability testing may depend on failure simulators in virtual hardware and virtual networks.

Tests may be serial (one after the other) or parallel (some or all at once) depending on the sophistication of the test environment. A significant goal for agile and other high-productivity software development practices is reducing the time from software design or specification to delivery in production. Highly automated and parallelized test environments are important contributors to rapid software development.

4) Staging: Staging is an environment for final testing immediately prior to deploying to production. It seeks to mirror the actual production environment as closely as possible and may connect to other production services and data, such as databases. For example, servers will be run on remote machines, rather than locally (as on a developer's workstation during development, or on a single test machine during testing), which tests the effect of networking on the system. This environment is also known as UAT, which stands for User Acceptance Testing.

The primary use of a staging environment is to test all installation/configuration/migration scripts and procedures before they are applied to the production environment. This ensures that all major and minor upgrades to the production environment are completed reliably, without errors, and in minimum time.

Another important use of staging is for performance testing, particularly load testing, as this often depends sensitively on the environment.

5) Production: The production environment is also known as live, particularly for servers, as it is the environment that users directly interact with. Deploying to production is the most sensitive step; it may be done by deploying new code directly (overwriting old code, so only one copy is present at a time), or by deploying a configuration change. This can take various forms: deploying a parallel installation of a new version of code, and switching between them with a configuration change; deploying a new version of code with the old behavior and a feature flag, and switching to the new behavior with a configuration change that performs a flag flip; or by deploying separate servers (one running the old code, one the new) and redirecting traffic from old to new with a configuration change at the traffic routing level. These in turn may be done all at once or gradually, in phases.

Deploying a new release generally requires a restart, unless hot swapping is possible, and thus requires either an interruption in service (usual for user software, where applications are restarted), or redundancy – either restarting instances slowly behind a load balancer, or starting up new servers ahead of time and then simply redirecting traffic to the new servers.

When deploying a new release to production, rather than immediately deploying to all instances or users, it may be deployed to a single instance or fraction of users first, and then either deployed to all or gradually deployed in phases, in order to catch any last-minute problems. This is similar to staging, except actually done in production, and is referred to as a canary release, by analogy with coal mining. This adds complexity due to multiple releases being run simultaneously, and is thus usually over quickly, to avoid compatibility problems.

In some exceptional cases you could do without a Test Environment and use the Staging Environment for this, but all other environments should be present.
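
As a hedged illustration of how this separation can show up in code, here is a minimal Python sketch of an application selecting its settings per environment. The environment variable name (APP_ENV) and the connection strings are invented for the example, not taken from any particular stack.

```python
# Hypothetical sketch: selecting settings per deployment environment.
# APP_ENV and the connection strings are illustrative, not from a real project.
import os

ENVIRONMENTS = {
    "development": {"db_url": "postgresql://localhost/myapp_dev", "debug": True},
    "test":        {"db_url": "postgresql://test-db/myapp_test",  "debug": True},
    "staging":     {"db_url": "postgresql://staging-db/myapp",    "debug": False},
    "production":  {"db_url": "postgresql://prod-db/myapp",       "debug": False},
}

def load_config() -> dict:
    """Return the settings for the environment named in APP_ENV."""
    env = os.environ.get("APP_ENV", "development")
    if env not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {env}")
    return ENVIRONMENTS[env]

if __name__ == "__main__":
    config = load_config()
    print(f"Running against {config['db_url']} (debug={config['debug']})")
```

Running the same code with APP_ENV=staging, for example, picks up the staging settings without any change to the code itself.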

2. Use of Version Control

Version control is any kind of practice that tracks and provides control over changes to source code. Teams can use version control software to maintain documentation and configuration files as well as source code.

As teams design, develop and deploy software, it is common for multiple versions of the same software to be deployed in different sites and for the software's developers to be working simultaneously on updates. Bugs or features of the software are often only present in certain versions (because of the fixing of some problems and the introduction of others as the program develops). Therefore, for the purposes of locating and fixing bugs, it is vitally important to be able to retrieve and run different versions of the software to determine in which version(s) the problem occurs. It may also be necessary to develop two versions of the software concurrently: for instance, where one version has bugs fixed, but no new features (branch), while the other version is where new features are worked on (trunk).

At the simplest level, developers could simply retain multiple copies of the different versions of the program, and label them appropriately. This simple approach has been used in many large software projects. While this method can work, it is inefficient as many near-identical copies of the program have to be maintained. This requires a lot of self-discipline on the part of developers and often leads to mistakes. Since the code base is the same, it also requires granting read-write-execute permission to a set of developers, and this adds the pressure of someone managing permissions so that the code base is not compromised, which adds more complexity. Consequently, systems to automate some or all of the version control process have been developed. This ensures that the majority of management of version control steps is hidden behind the scenes.

Moreover, in software development, legal and business practice and other environments, it has become increasingly common for a single document or snippet of code to be edited by a team, the members of which may be geographically dispersed and may pursue different and even contrary interests. Sophisticated version control that tracks and accounts for ownership of changes to documents and code may be extremely helpful or even indispensable in such situations.

Version control may also track changes to configuration files, such as those typically stored in /etc or /usr/local/etc on Unix systems. This gives system administrators another way to easily track changes made and a way to roll back to earlier versions should the need arise.

3. Clear Branching Strategy

Branching strategy has always been one of those sticky topics that causes many questions. Many senior programmers are baffled by the ins and outs of branching and merging, and for good reason; it is a difficult topic. Many strategies exist: main only, development isolation, release isolation, feature isolation, and so on.

I’ve been around in many different organizations. I’ve been the person who was told what the branching strategy was, and I have been the person who designed it.  I’ve seen it done just about every way possible, and after all that, I have come to the following conclusion.

Keep it simple. Working directly off the trunk is by far the best approach in my opinion.

In a future post, I will show you what I think is the most simple and effective branching strategy.  A strategy I have effectively used in the past and have developed over time.  It can be summarized as follows:

1) Everyone works off of trunk.
2) Branch when you release code.
3) Branch off a release when you need to create a bug fix for already released code.
4) Branch for prototypes.

4. Use of a Bug Tracking System

A bug tracking system or defect tracking system is a software application that keeps track of reported software bugs in software development projects. When your team is not using some kind of system for this, then you are in for a lot of trouble.

Many bug tracking systems, such as those used by most open source software projects, allow end-users to enter bug reports directly. Other systems are used only internally in a company or organization doing software development. Typically bug tracking systems are integrated with other software project management applications.

The main benefit of a bug-tracking system is to provide a clear centralized overview of development requests (including bugs, defects and improvements, the boundary is often fuzzy), and their state. The prioritized list of pending items (often called backlog) provides valuable input when defining the product roadmap, or maybe just "the next release".

A second benefit is that it gives you very useful information about the quantity, type and environment of the bugs/defects that are discovered. There is a big difference between finding them in the test environment versus in production. In general, the later you find them, the more they cost to fix.

5. Collective Code Ownership

Collective Ownership encourages everyone to contribute new ideas to all parts of the project. Any developer can change any line of code to add functionality, fix bugs, improve designs or refactor. No one person becomes a bottleneck for changes. This is easy to do when you have all your code covered with unit tests and automated acceptance tests.


6. Continuously Refactoring

Code should be written to solve the known problem at the time. Often, teams become wiser about the problem they are solving, and continuously refactoring and changing code ensures the code base is forever meeting the most current needs of the business in the most efficient way. In order to guarantee that changes do not break existing functionality, your regression tests should be automated. I.e. unit tests are essential.
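
To make this concrete, here is a small, invented refactoring sketch in Python: the behavior stays identical, the intent becomes explicit, and a regression check (normally a proper unit test) guards the change.

```python
# Illustrative only: a small refactoring that keeps behavior identical.
# Before: the discount rule is buried inside a loop.
def total_before(prices, customer_type):
    total = 0
    for p in prices:
        if customer_type == "vip":
            total += p * 0.9   # 10% discount
        else:
            total += p
    return total

# After: the discount rule is extracted and named, so the intent is explicit
# and the rule can change in one place.
def discount_rate(customer_type: str) -> float:
    return 0.9 if customer_type == "vip" else 1.0

def total_after(prices, customer_type):
    return sum(p * discount_rate(customer_type) for p in prices)

# A regression check (normally a unit test) confirms behavior did not change.
assert total_before([100, 50], "vip") == total_after([100, 50], "vip")
assert total_before([100, 50], "regular") == total_after([100, 50], "regular")
```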


7. Writing Unit Tests

The purpose of unit testing is not to find bugs. A unit test is a specification of the expected behavior of the code under test, and the code under test is the implementation of that expected behavior. Unit tests and the code under test therefore check the correctness of, and protect, each other. Later, when someone changes the code under test in a way that alters the behavior the original author expected, the test will fail. If your code is covered by a reasonable number of unit tests, you can maintain it without breaking existing features. That’s why Michael Feathers defines legacy code in his book as code without unit tests. Without unit tests, every refactoring effort is a major risk.
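
As a minimal sketch of a unit test acting as a specification, here is a hypothetical example using Python's built-in unittest module; the function and the amounts are invented for illustration.

```python
# Minimal sketch: a unit test as a specification of expected behavior.
# The function and the amounts are hypothetical.
import unittest

def apply_discount(price: float, percentage: float) -> float:
    """Return the price after deducting the given percentage."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return round(price * (1 - percentage / 100), 2)

class ApplyDiscountTest(unittest.TestCase):
    def test_deducts_the_given_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_percent_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

    def test_rejects_percentages_outside_the_valid_range(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()
```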


8. Code Reviews

Code review is a systematic examination (sometimes referred to as peer review) of source code. It is intended to find mistakes overlooked in software development, improving the overall quality of software. Reviews are done in various forms such as pair programming, informal walkthroughs, and formal inspections.

Code review practices fall into two main categories: formal code review and lightweight code review. Formal code review, such as a Fagan inspection, involves a careful and detailed process with multiple participants and multiple phases. Formal code reviews are the traditional method of review, in which software developers attend a series of meetings and review code line by line, usually using printed copies of the material. Formal inspections are extremely thorough and have been proven effective at finding defects in the code under review.

Lightweight code review typically requires less overhead than formal code inspections. Lightweight reviews are often conducted as part of the normal development process:

1) Over-the-shoulder – one developer looks over the author's shoulder as the latter walks through the code.

2) Email pass-around – source code management system emails code to reviewers automatically after check-in is made.

3) Pair programming – two developers work on one piece of code, using one keyboard and one monitor. Pairing results in higher-quality output because it greatly reduces wasted time and defects, and it leads to close collaboration. It is nothing less than continuous code review. Hence, when implemented, you do not need a separate code review before merging your branches, and continuous integration can be done faster. This is common in Extreme Programming.

4) Tool-assisted code review – authors and reviewers use software tools, informal ones such as pastebins and IRC, or specialized tools designed for peer code review.

A code review case study published in the book Best Kept Secrets of Peer Code Review found that lightweight reviews uncovered as many bugs as formal reviews, but were faster and more cost-effective. In my opinion, it does not matter what kind of code review you do, but NO code should go to production that has not been peer-reviewed.

9. Build Automation

Build automation is the process of automating the creation of a software build and the associated processes including: compiling computer source code into binary code, packaging binary code, and creating all necessary artifacts to deploy the application on a target environment.

Build automation is considered the first step in moving toward implementing a culture of Continuous Delivery and DevOps. Build automation, combined with Continuous Integration, deployment, application release automation, and many other processes, helps move an organization forward in establishing software delivery best practices.
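
The exact tooling differs per stack, but the shape of a build script is usually the same: a fixed sequence of steps that fails fast. The sketch below is a hypothetical Python example; the step names and commands are placeholders for whatever compiler, test runner and packaging tool your project actually uses.

```python
# Hypothetical build script: the steps and commands are placeholders, not a
# prescription for any particular toolchain.
import subprocess
import sys

BUILD_STEPS = [
    ("compile check", ["python", "-m", "compileall", "src"]),
    ("unit tests",    ["python", "-m", "pytest", "tests/unit", "-q"]),
    ("package",       ["python", "-m", "build"]),  # produces the deployable artifact
]

def run_build() -> int:
    for name, command in BUILD_STEPS:
        print(f"--- {name}: {' '.join(command)}")
        if subprocess.run(command).returncode != 0:
            print(f"Build failed at step: {name}")
            return 1
    print("Build succeeded; artifacts are ready for deployment")
    return 0

if __name__ == "__main__":
    sys.exit(run_build())
```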

10. Automated Tests and Test Automation

In the world of testing in general, and continuous integration and delivery in particular, there are two types of automation:

1) Automated Tests
2) Test Automation

While it might just seem like two different ways to say the same thing, these terms actually have very different meanings.

Automated tests are tests that can be executed without manual intervention, usually written in a programming language. In this case, we talk about the individual test cases: unit tests, integration/service tests, performance tests, end-to-end tests, or acceptance tests. The latter are also known as Specification by Example.

Test automation is a broader concept that includes automated tests. From my perspective, it should be about the full automation of test cycles, from check-in up to deployment; this is also called continuous testing. Both automated tests and test automation are important to continuous delivery, but it is really the latter that makes high-quality continuous delivery possible.

11. Continuous Integration

Martin Fowler defines Continuous Integration (CI) in his key article as follows: "Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly." You see, without unit tests and test automation, it is impossible to do CI right. And only when you do CI right do you have a chance of succeeding at Continuous Deployment.


12. Continuous Delivery

Continuous delivery is a series of practices designed to ensure that code can be rapidly and safely deployed to production by delivering every change to a production-like environment and ensuring business applications and services function as expected through rigorous automated testing. Since every change is delivered to a staging environment using complete automation, you can have confidence the application can be deployed to production with a push of a button when the business is ready. Continuous deployment is the next step of continuous delivery: Every change that passes the automated tests is deployed to production automatically. Continuous deployment should be the goal of most companies that are not constrained by regulatory or other requirements.



A simple continuous delivery pipeline could look like this:

1) Continuous integration server picks-up changes in the source code
2) Starts running the unit-tests
3) Deploys (automated) to an integration environment
4) Runs automated integration tests
5) Deploys (automated) to an acceptance environment
6) Runs automated acceptance tests
7) Deploys (automated or manual) to production
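
In practice these stages live in a CI/CD server (Jenkins, GitLab CI, and the like), but purely as an illustration of the sequence above, a sketch of the same pipeline could look like this in Python; the deploy script and test paths are hypothetical.

```python
# Sketch of the seven stages above as a gated sequence. In reality this lives in a
# CI/CD server (Jenkins, GitLab CI, ...); deploy.sh and the test paths are invented.
import subprocess

def stage(name: str, command: list[str]) -> None:
    print(f"==> {name}")
    subprocess.run(command, check=True)  # a failing stage raises and stops the pipeline

def run_pipeline(deploy_to_production: bool = False) -> None:
    stage("unit tests",            ["python", "-m", "pytest", "tests/unit", "-q"])
    stage("deploy to integration", ["./deploy.sh", "integration"])
    stage("integration tests",     ["python", "-m", "pytest", "tests/integration", "-q"])
    stage("deploy to acceptance",  ["./deploy.sh", "acceptance"])
    stage("acceptance tests",      ["python", "-m", "pytest", "tests/acceptance", "-q"])
    if deploy_to_production:       # the last step may deliberately stay a manual decision
        stage("deploy to production", ["./deploy.sh", "production"])

if __name__ == "__main__":
    run_pipeline()
```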

13. Configuration Management by Code

The operating system, host configuration, operational tasks etc. are automated with code by developers and system administrators. As code is used, configuration changes become standard and repeatable. This relieves developers and system administrators of the burden of configuring the operating system, system applications or server software manually.
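
Real teams use dedicated tools such as Ansible, Puppet, Chef or Terraform for this. Purely to illustrate the principle of declaring a desired state and applying it repeatably, here is a toy Python sketch with invented paths and settings.

```python
# Toy illustration of configuration as code: declare the desired state, apply it
# repeatably and idempotently. Paths and settings are invented; real teams would
# reach for Ansible, Puppet, Chef or Terraform instead of a hand-rolled script.
import json
from pathlib import Path

DESIRED_STATE = {
    "/etc/myapp/app.json": {"log_level": "INFO", "workers": 4},
}

def apply_configuration(desired: dict) -> None:
    for path, settings in desired.items():
        target = Path(path)
        current = json.loads(target.read_text()) if target.exists() else None
        if current == settings:
            print(f"{path}: already in desired state")  # idempotent: nothing to do
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(settings, indent=2))
        print(f"{path}: updated")

if __name__ == "__main__":
    apply_configuration(DESIRED_STATE)
```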


14. Code Documentation

Inevitably, documentation and code comments become lies over time. In practice, few people update comments and/or documentation when things change. Strive to make your code readable and self-documenting through good naming practices and known programming style.
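
A small, invented example of what “self-documenting through good naming” means in practice: the two functions below compute the same thing, but only the second one explains itself.

```python
# Illustrative only: the same calculation twice. The first version needs a comment
# or external documentation to be understood; the second documents itself.
def calc(d, r):
    return d * r / 365

def daily_interest(principal: float, annual_rate: float) -> float:
    """Interest accrued for one day at a simple annual rate."""
    return principal * annual_rate / 365
```

When the names carry the meaning, there is much less for an outdated comment to lie about later.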

15. Step by Step Development Process Guide 

This guide is essential for onboarding new people and for inspecting and adapting the way the team works. I work a lot with Kanban and ScrumBan, and an important concept in both is making your process explicit.

16. Step by Step Deployment Process Guide 

Somebody who does not usually do deployments should be able to deploy to production with this guide on the table. You never know when you will need it, but the day will come, and then you will be happy you have it. Of course, the further you move in the direction of Continuous Delivery, the smaller this guide becomes, because the process is documented in your automated pipelines themselves.

17. Monitoring and Logging

To gauge the impact that the performance of applications and infrastructure has on users, organizations monitor metrics and logs. The data and logs generated by applications and infrastructure are captured, categorized and analyzed to understand how users are affected by changes or updates. This makes it easier to detect the sources of unexpected problems or changes. Constant monitoring is necessary to ensure steady availability of services and to increase the speed at which the infrastructure can be updated. When these data are analyzed in real time, organizations can monitor their services proficiently.
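
As a minimal sketch of application-level logging and a simple timing metric, here is a hypothetical Python example using only the standard library; the logger name, the handler function and the metric are made up.

```python
# Minimal sketch of application logging plus a timing metric, standard library only.
# The logger name, handler and order id are invented for illustration.
import logging
import time

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s %(message)s")
log = logging.getLogger("myapp.orders")

def handle_order(order_id: str) -> None:
    start = time.perf_counter()
    try:
        log.info("processing order %s", order_id)
        # ... the real work would happen here ...
    except Exception:
        log.exception("order %s failed", order_id)  # the stack trace ends up in the logs
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        log.info("order %s handled in %.1f ms", order_id, duration_ms)

if __name__ == "__main__":
    handle_order("A-1001")
```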

18. Being Aware of Technical Debt

The metaphor of technical debt in code and design can be described as follows: you start from an optimal code base. In the next release you add a new feature, which takes an effort E (assuming, of course, that estimations are somewhere near reality). If the code base is less than optimal, the effort becomes E + T, where T is the technical debt.

Writing bad code is like going further into debt: you take the loan now, and you repay the debt later. The bigger the mess, the larger the delay in the next release.

The term “technical debt” was first introduced by Ward Cunningham. It was in the early 90s, when the disconnects between development and business were growing bigger and bigger. The business people would urge developers to release untested, ugly code in order to get their product or new features faster. The developers tried to explain why this was a bad mistake. Some things will never change...

Most products and projects are still released much earlier than the developers would have wanted. Assuming that developers are not just being stubborn (I know, maybe an even bigger assumption than decent estimations), you might think we did not manage to get the message across to the business. But we have done an awesome job explaining what technical debt is and what the results are going to be. The business people understand it. They are just willing to take the loan now. Can you blame them? The business wants something out there, in the field, that will sell now.

No problem, just make sure the consequences of these decisions are clear for all parties involved.

19. Good Design

Good design is hard to judge, but luckily bad design is easy to “smell”. Software developers are notorious for using different criteria for evaluating good design but, from experience, I tend to agree with Bob Martin and Martin Fowler, who have said that there is a set of criteria that engineers usually agree upon when it comes to bad design.

Because you can't recognise good design until you know what bad design is, and because once you know what good design should avoid you can easily judge whether a given engineering principle has any merit or is just fuzz waiting to distract you from your real goal of building software that is useful to people, we simply use bad design as the starting point for determining whether we have a good design.

A piece of software that fulfills its requirement and yet exhibits any or all of the following traits can be considered to have "bad design":

1) Rigidity: It's too hard to make changes because every change affects too many other parts of the system.
2) Fragility: When you make a change, unexpected parts of the system break.
3) Immobility: It's hard to reuse a chunk of code elsewhere because it cannot be disentangled from its current application/usage.
4) Viscosity: It's hard to do the "right thing" so developers take alternate actions.
5) Needless Complexity: Overdesign
6) Needless Repetition: Mouse abuse (copy-and-paste code)
7) Opacity: Disorganized expression

Closing Thoughts

As you will have noticed in the descriptions above, the practices are “layered”: to do x, you first need to do y. For example, Continuous Integration is not possible without Build Automation, nor is Test Automation without Automated Tests. And so on. Good software development practices start with the foundational layers and then build on top of them. When the foundation is weak, everything else will be weak as well.


Saturday, April 14, 2018

Start Your Project With a Walking Skeleton

In order to reduce risk on large software development projects, you need to figure out all the big unknowns as early as possible. The best way to do this is to have a real end-to-end test with no stubs against a system that’s deployed in production.

You could do this by building a so-called Walking Skeleton, a term coined by Alistair Cockburn. He defined it as a tiny implementation of the system that performs a small end-to-end function. It need not use the final architecture, but it should link together the main architectural components. The architecture and the functionality can then evolve in parallel. A similar concept called “Tracer Bullets” was introduced in The Pragmatic Programmer.

If the system needs to talk to one or more datastores then the walking skeleton should perform a simple query against each of them, as well as simple requests against any external or internal service. If it needs to output something to the screen, insert an item to a queue or create a file, you need to exercise these in the simplest possible way. As part of building it, you should write your deployment and build scripts, set up the project, including its tests, and make sure all the automations are in place — such as Continuous Integration, monitoring, and exception handling. The focus is the infrastructure, not the features. Only after you have your walking skeleton should you write your first automated acceptance tests.

This is only the skeleton of the application, but the parts are connected and the skeleton does walk in the sense that it exercises all the system’s parts as you currently understand them. Because of this partial understanding, you must make the walking skeleton minimal. But it’s not a prototype and not a proof of concept — it’s production code, so you should definitely write tests as you work on it.
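
What the skeleton's end-to-end exercise might look like in code is sketched below, as a hypothetical Python check against a deployed environment: one real request that goes through the API layer to the datastore behind it. The URL and the response fields are invented for the example.

```python
# Hedged sketch of a walking-skeleton end-to-end check: one real request against the
# deployed system, touching the API and the datastore behind it. URL and fields are
# hypothetical; a real skeleton would exercise whatever thin slice the system has.
import json
import urllib.request

BASE_URL = "https://staging.example.com"   # the environment the skeleton is deployed to

def test_skeleton_end_to_end() -> None:
    # The endpoint is assumed to read one row from the datastore and echo it back.
    with urllib.request.urlopen(f"{BASE_URL}/api/ping-db", timeout=10) as response:
        assert response.status == 200
        body = json.loads(response.read())
    assert body["database"] == "reachable"
    assert "version" in body

if __name__ == "__main__":
    test_skeleton_end_to_end()
    print("walking skeleton responds end-to-end")
```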

High Risk First 

According to Hofstadter’s Law, “It always takes longer than you expect, even when you take into account Hofstadter’s Law”. This law holds all too often. It therefore makes sense to work on the riskiest parts of the project first, which are usually the parts that have dependencies: on third-party services, on in-house services, on other groups in the organization you belong to. It makes sense to get the ball rolling with these groups simply because you don’t know how long it will take and what problems might arise.

Making changes to architecture is harder and more expensive the longer it has been around and the bigger it gets. We want to find mistakes early. This approach gives us a short feedback cycle, from which we can more quickly adapt and work iteratively as necessary to meet the business' prioritized list of runtime-discernable quality attributes. Assumptions about the architecture are validated earlier. The architecture is more easily evolved because problems are found at an earlier stage when less has been invested in its implementation.

The bigger the system, the more important it is to use this strategy. In a small application, one developer can implement a feature from top to bottom relatively quickly, but this becomes impractical with larger systems. It is quite common to have multiple developers on a single team or even on multiple, possibly distributed teams involved in implementing end-to-end. Consequently, more coordination is necessary.

No Shortcuts

It’s important to stress that until the walking skeleton is deployed to production (possibly behind a feature flag or just hidden from the outside world) you are not ready to write the first acceptance test. You want to exercise your deployment and build scripts and discover as many potential problems as you can as early as possible. The Walking Skeleton is a way to validate the architecture and get early feedback so that it can be improved. You will be missing this feedback if you cut corners or take shortcuts.

In a nutshell: Start with a Walking Skeleton, keep it running, and grow it incrementally.


Monday, April 09, 2018

It's Never Too Early to Think About Performance

Business users specify their needs primarily through functional requirements. The non-functional aspects of the systems, like performance, responsiveness, up-time, support needs, and so on, are left up to the development team.

Testing of these non-functional requirements is left until very late in the development cycle, and is sometimes delegated completely to the operations team. This is a big mistake that is made far too often. Having separate development and operations teams is already a mistake by itself, but I will leave that discussion for another article.

I was recently part of two large software development projects where performance was addressed too late, and the cost and time needed to fix it were an order of magnitude larger than they would have been had performance been addressed early in the project. Not to mention the bad reputation the teams and systems earned after going live with performance so poor that users could hardly do their daily work with the system.

Besides knowing before you go live that users are not going to be happy (and therefore that you should NOT go live), there is another big advantage of early performance testing. If you aren't looking at performance until late in the project cycle, you have lost an incredible amount of information as to when performance changed. If performance is going to be an important architectural and design criterion, then performance testing should begin as soon as possible. If you are using an Agile methodology based on two-week iterations, I'd say performance testing should be included in the process no later than the third iteration.

Why is this so important? The biggest reason is that at the very least you know the kinds of changes that made performance fall off a cliff. Instead of having to think about the entire architecture when you encounter performance problems, you can focus on the most recent changes. 

Doing performance testing early and often provides you with a narrow range of changes on which to focus. In early testing, you may not even try to diagnose performance, but you do have a baseline of performance figures to work from. This trend data provides vital information in diagnosing the source of performance issues and resolving them.

This approach also allows for the architectural and design choices to be validated against the actual performance requirements. Particularly for systems with hard performance requirements, early validation is crucial to delivering the system in a timely fashion.

“Fast” Is Not a Requirement 

"Fast" is not a requirement. Neither is "responsive". Nor "extensible". The main reason why not is that you have no objective way to tell if they're met. 

Some simple questions to ask: How many? In what period? How often? How soon? Increasing or decreasing? At what rate? If these questions cannot be answered then the need is not understood. The answers should be in the business case for the system and if they are not, then some hard thinking needs to be done. If you work as an architect and the business hasn't (or won't) tell you these numbers ask yourself why not. Then go get them. The next time someone tells you that a system needs to be "scalable" ask them where new users are going to come from and why. Ask how many and by when? Reject "lots" and "soon" as answers.

Uncertain quantitative criteria must be given as a range: the least, the nominal, and the most. If this range cannot be given, then the required behavior is not understood. As an architecture unfolds it can be checked against these criteria to see if it is (still) in tolerance. As the performance against some criteria drifts over time, valuable feedback is obtained. Finding these ranges and checking against them is a time-consuming and expensive business. 

If no one cares enough about the system being "performant" (neither a requirement nor a word) to pay for performance tests, then more than likely performance doesn't matter.

You are then free to focus your efforts on aspects of the system that are worth paying for.

Automated Performance Testing

In order to keep the cost of and time spent on performance testing in check, I advise you to automate it as much as possible. A tool like Taurus simplifies the automation of performance testing: it is built for developers and DevOps, and relies on JMeter, Selenium, Gatling and Grinder as underlying engines. It also enables parallel testing, its configuration format is readable and easy to keep in version control, it is tool friendly, and tests can be expressed using YAML or JSON.
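
A Taurus configuration is the more complete route, but even without it you can automate a crude performance check. The sketch below is a hypothetical Python example that fires a batch of concurrent requests and fails when a rough 95th-percentile response time exceeds a budget; the endpoint and the budget are invented and should come from your actual requirements.

```python
# Not a Taurus configuration: a deliberately crude smoke check that can run in a
# pipeline and fail the build when response times drift. Endpoint and budget are
# invented; the real numbers belong in the business case, as argued below.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "https://staging.example.com/api/search?q=test"  # hypothetical endpoint
REQUESTS = 50
MAX_P95_SECONDS = 0.8                                  # made-up budget

def timed_request(_: int) -> float:
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10):
        pass
    return time.perf_counter() - start

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=10) as pool:
        durations = sorted(pool.map(timed_request, range(REQUESTS)))
    p95 = durations[min(len(durations) - 1, int(0.95 * len(durations)))]  # rough 95th percentile
    print(f"median={statistics.median(durations):.3f}s p95={p95:.3f}s")
    assert p95 <= MAX_P95_SECONDS, f"p95 of {p95:.3f}s exceeds the {MAX_P95_SECONDS}s budget"
```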

Here are some types of tests you can run automated:

> Load Tests are conducted to understand the behavior of the system under a specific expected load.

> Stress Tests are used to understand the upper limits of capacity within the system.

> Soak Tests determine if the system can sustain the continuous expected load.

> Spike Tests determine if the system can sustain a suddenly increasing load generated by a large number of users.

> Isolation Tests determine if a previously detected system issue has been fixed by repeating a test execution that resulted in a system problem.

Closing Thoughts

Technical testing is notoriously difficult to get going. Setting up the appropriate environments, generating the proper data sets, and defining the necessary test cases all take a lot of time. By addressing performance testing early, you can establish your test environment incrementally, avoiding much more expensive efforts later, after you discover performance issues.

In a nutshell: It's never too early to think about performance.


Thursday, March 08, 2018

12 Benefits of Test Automation


1. Saves time and money

Software tests have to be repeated often during development cycles to ensure quality. Every time source code is modified, software tests should be repeated. For each release of the software, it may be tested on all supported operating systems and hardware configurations. Manually repeating these tests is costly and time-consuming. Once created, automated tests can be run over and over again at no additional cost, and they are much faster than manual tests. Test automation can reduce the time to run repetitive tests from days to hours, a time saving that translates directly into cost savings. It is no surprise that, while the initial investment may be on the higher side, automated testing saves companies a lot in the long run. It contributes to a higher quality of work, thereby decreasing the need to fix glitches after release and reducing project costs.

2. Improves accuracy

Even the most conscientious tester will make mistakes during monotonous manual testing. Automated tests perform the same steps precisely every time they are executed and never forget to record detailed results. Developers freed from repetitive manual tests have more time to create new automated tests and deal with complex features.

3. Automation does what manual testing cannot

Even the largest software development and QA departments cannot perform a controlled web application test with thousands of users. Automated testing can simulate tens, hundreds or thousands of virtual users interacting with a network, devices, software and web applications.

4. Increases your test coverage

Test automation and automated tests can increase the depth and scope of tests to help improve software quality. Lengthy tests that are often avoided during manual testing can be run unattended. They can even be run on multiple computers with different configurations. Automated software testing can look inside an application and see memory contents, data tables, file contents, and internal program states to determine if the product is behaving as expected. Test automation can easily execute thousands of different complex test cases during every test run providing coverage that is impossible with manual tests.

5. Run tests 24/7

No matter where you are in the world, you can start the tests when you leave the office, and when you get back in the morning you can see the results and keep on working. You can even run them remotely if you don't have a lot of devices or the possibility to buy them.

6. Faster feedback

Test automation comes as a relief for validation during various phases of a software project. It improves communication among coders, designers and product owners, and allows potential glitches to be rectified immediately. Test automation assures higher efficiency of the development team. Besides that, the earlier a defect is identified, the more cost-effective it is to fix.

7. Testing efficiency improvement

Testing takes up a significant portion of the overall application development lifecycle, which means that even the slightest improvement in overall efficiency can make an enormous difference to the timeframe of the project. Although the initial setup takes longer, automated tests eventually take significantly less time. They can be run virtually unattended, leaving the results to be monitored towards the end of the process.

8. Reusability of automated tests

Because automated test cases are repetitive by nature and relatively easy to set up, software developers can use them to assess how the program reacts under many different conditions. Automated test cases are reusable and can hence be utilized in different approaches and projects.

9. Thoroughness in testing

Testers tend to have different testing approaches, and their focus areas could vary due to their exposure and expertise. With the inclusion of automation, there is a guaranteed focus on all areas of testing, thereby assuring best possible quality.

10. Faster time-to-market

Test automation greatly helps reduce the time-to-market of an application by allowing continuous delivery. See Test Automation and Automated Tests.

11. Improved information security

The effectiveness of testing will be largely dependent on the quality of the test data you use. Manually creating quality test data takes time and as a result, testing is often performed on copies of live databases. Automation solutions can help with creating, manipulating and protecting your test database, allowing you to re-use your data time and again.

12. Ability to do full regression testing each deployment 

With short cycles, manual regression testing is nearly impossible. Iterative and incremental development implies that code is not frozen at the end of the iteration but instead has the potential to be changed each iteration. Therefore, manual regression testing would mean rerunning most of the manual tests every iteration. Automating the tests therefore pays back quickly.


Wednesday, March 07, 2018

Test Automation and Automated Tests

Because of my work as a project recovery consultant, I talk a lot about testing, continuous integration, continuous delivery, continuous deployment, and DevOps. During these conversations, the term automation gets thrown around in abundance.

In general, we all understand what automation means - the use of some technology to complete a task that was previously done manually. But when we talk about automation in terms of testing, there are some distinctions we have to make.

Two types of automation

In the world of testing in general, and continuous delivery in particular, there are two types of automation:

1) Automated Tests
2) Test Automation

While it might just seem like two different ways to say the same thing, these terms actually have very different meanings.

Automated tests are tests that can be executed without manual intervention, usually written in a programming language. In this case, we talk about the individual test cases: unit tests, integration/service tests, performance tests, end-to-end tests, or acceptance tests.

Test automation is a broader concept that includes automated tests. From my perspective, it should be about the full automation of test cycles, from check-in up to deployment; this is also called continuous testing.

Some linguistic analysis might also help clarify:

Automated Tests - in this case, "Tests" is a noun; the test is a thing. "Automated Tests" are particular types of tests; ones whose execution has been automated via some kind of code so that a person does not have to manually execute the test. You can have 1, 2, 3.... hundreds of automated tests.

Test Automation - this can be a noun as in the subject of "Automation" with "Test" indicating the type of automation. But, it is also referring to an activity; the activity of automating a given test or set of tests. As said previously; "Test Automation" is a broader concept than "Automated Tests".

Both automated tests and test automation are important to continuous delivery, but it is really the latter that makes high-quality continuous delivery possible.

Essential for continuous delivery

In a traditional environment, testing gets completed at the end of a development cycle. But as more and more companies move toward a DevOps and continuous delivery model in which software is constantly in development and must always be deployment-ready, leaving testing until the end no longer works. That's where continuous testing comes in - to ensure quality at every stage of development.

While ensuring quality at all times is very important, it's not all that counts. The speed at which all of the integration and testing occurs is very important as well. That's because if something in the pipeline stalls or breaks down, it holds up everything else and slows down the release of new developments, which defeats the purpose of continuous delivery.

These "how" and "why" considerations make organization, consistency and speed imperative to support a continuous delivery model, and that's where test automation can help.

How automated test and test automation help

Managing all of the testing needs in a continuous delivery environment is a massive undertaking - it requires a tremendous communication effort to keep track of which environments have deployed new code, when each piece needs testing and how those requirements integrate back into the moving process of continuously delivering software. A simple continuous delivery pipeline could look like this:

1) Continuous integration server picks-up changes in the source code
2) Starts running the unit-tests
3) Deploys (automated) to an integration environment
4) Runs automated integration tests
5) Deploys (automated) to an acceptance environment
6) Runs automated acceptance tests
7) Deploys (automated or manual) to production

This is a combination of developers or test engineers writing automated tests and DevOps people automating deployments. In doing so, test automation goes a long way toward helping ensure that teams maintain a high standard of quality at all points along the pipeline.


Wednesday, January 03, 2018

The Only Test Plan You Will Ever Need

Assuming an iteration between two and four weeks:

1) Programmers will write unit tests in the code to ensure the product behaves correctly at a technical level. The team will perform QA activities to ensure these are valid tests.

2) In the iteration planning meeting and in the iteration itself, test cases will be defined as acceptance criteria for each product backlog item/requirement.

3) Acceptance criteria will be implemented as automated tests (see the sketch after this list).

4) Every code check-in on the release branch will trigger all automated (unit and functional) tests to run, providing a full automated regression test.

5) Tests will be implemented in the iteration where the requirement is being built.

6) Tools for performance, load and security tests will run at the end of each iteration.

7) Manual exploratory testing will be done at the end of an iteration in an attempt to identify missed test cases.

8) At the iteration review meeting, stakeholders will verify that the solution is what they envisioned and will give feedback for the next iteration(s).

9) After deployment into production, you can implement A/B testing and review your product analytics to look at how users actually use your new functionality.
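
To make steps 2 and 3 concrete, here is a hypothetical sketch of one acceptance criterion expressed as an automated test in Python. The story, the function and the amounts are invented; a real acceptance test would drive the system through its public interface (UI or API) rather than call a function directly.

```python
# Hypothetical acceptance criterion from step 2, implemented as an automated test (step 3).
# Criterion: "A VIP customer gets free shipping on orders of 50 or more."
# Everything here is invented for illustration.

def shipping_cost(order_total: float, customer_type: str) -> float:
    if customer_type == "vip" and order_total >= 50:
        return 0.0
    return 4.95

def test_vip_customer_gets_free_shipping_on_orders_of_50_or_more():
    # Given a VIP customer / When they order for 50 / Then shipping is free
    assert shipping_cost(50.0, "vip") == 0.0

def test_regular_customer_still_pays_shipping():
    assert shipping_cost(50.0, "regular") == 4.95

if __name__ == "__main__":
    test_vip_customer_gets_free_shipping_on_orders_of_50_or_more()
    test_regular_customer_still_pays_shipping()
    print("acceptance criteria hold")
```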


Sunday, August 20, 2017

14 Essential Software Engineering Practices for Your Agile Project

Agile initiatives are found in almost any company that builds software. Many of these initiatives have not brought the results that were expected, nor are the people much happier in their work. This has many reasons, and a quick search on Google or LinkedIn will give you a plethora of them. But the one I am confronted with on almost every project I work on is a lack of experience with modern software engineering practices.

Flexibility (agility) can only happen when you are able to make changes to your product in an easy, fast, and flexible way. That is why:

Organizational Agility is constrained by Technical Agility





In other words, when you are slow in making changes to your product, then it doesn’t matter how you structure your teams, your organization or what framework you adopt, you will be slow to respond to changes. These changes include bug fixes, performance issues, feature requests, changed user behavior, new business opportunities, pivots, etc.

Luckily, there is a set of software engineering practices that will help your team deliver a high-quality product that remains flexible. These practices are not limited to writing code; they cover the whole process up to the actual delivery of working software to the end user.

Unfortunately, many software engineers I have met in the financial services industry have had very limited exposure to these practices and their supporting tools. And don't get me started on the newly minted army of agile coaches and Scrum Masters who have never produced a single line of working code in their lives. Besides hindering agility, this frustrates the teams, because without these practices it is almost impossible to deliver high-quality working software within one Sprint.

Your organization will need to invest in these software engineering practices through training, infrastructure, tools, coaching and continuous improvement if you want your agile initiatives to actually be able to deliver agility.

1. Unit Testing

The purpose of unit testing is not to find bugs. A unit test is a specification of the expected behavior of the code under test, and the code under test is the implementation of that expected behavior. Unit tests and the code under test therefore check and protect each other. Later, when someone changes the code under test in a way that alters the behavior the original author expected, the test will fail. If your code is covered by a reasonable number of unit tests, you can maintain it without breaking existing features. That's why Michael Feathers defines legacy code in his book as code without unit tests. Without unit tests, every refactoring effort is a major risk.
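As a small illustration (a hypothetical function and test, not taken from any particular codebase), a unit test in Python reads as a specification of expected behavior:

# Hypothetical example: the test documents the expected behavior of
# calculate_discount and protects it against accidental changes.
import unittest

def calculate_discount(order_total):
    """Orders of 100 or more get a 10% discount."""
    return order_total * 0.9 if order_total >= 100 else order_total

class CalculateDiscountTest(unittest.TestCase):
    def test_no_discount_below_threshold(self):
        self.assertEqual(calculate_discount(99), 99)

    def test_discount_at_threshold(self):
        self.assertEqual(calculate_discount(100), 90)

if __name__ == "__main__":
    unittest.main()

If someone later changes the discount rule by accident, the failing test tells them exactly which expected behavior they broke.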


2. Continuous Integration

Martin Fowler defines Continuous Integration (CI) in his seminal article as follows: "Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly." You see, without unit tests and test automation, it is impossible to do CI right. And only when you do CI right can you hope to succeed at Continuous Deployment.


3. Collective Code Ownership

Collective Ownership encourages everyone to contribute new ideas to all segments of the project. Any developer can change any line of code to add functionality, fix bugs, improve designs or refactor. No one person becomes a bottleneck for changes. This is easy to do when you have all your code covered with unit tests and automated acceptance tests.


4. Refactoring

Code should be written to solve the problem as it is known at the time. Teams often become wiser about the problem they are solving, and continuously refactoring and changing code keeps the code base meeting the current needs of the business in the most efficient way. To guarantee that changes do not break existing functionality, your regression tests should be automated; in other words, unit tests are essential.
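A tiny, hypothetical before/after sketch in Python shows the idea: the behavior, and therefore the automated tests, stay the same while the structure improves.

# Hypothetical before/after: behavior is unchanged, only structure improves.

# Before: the discount rule is buried inside a loop.
def invoice_total(items):
    total = 0
    for item in items:
        if item["type"] == "book":
            total += item["price"] * item["qty"] * 0.95
        else:
            total += item["price"] * item["qty"]
    return total

# After: the rule is extracted and named; any regression is caught by the
# existing automated tests rather than by manual re-testing.
BOOK_DISCOUNT = 0.95

def line_total(item):
    discount = BOOK_DISCOUNT if item["type"] == "book" else 1.0
    return item["price"] * item["qty"] * discount

def invoice_total(items):
    return sum(line_total(item) for item in items)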


5. Test Driven Development

Test-driven development is a development style that drives the design by tests developed in short cycles of:

1. Write one test,
2. Implement just enough code to make it pass,
3. Refactor the code so it is clean.

Ward Cunningham argues that test-first coding is not testing. Test-first coding is not new. It is nearly as old as programming. It is an analysis technique. We decide what we are programming and what we are not programming, and we decide what answers we expect. Test-first is also a design technique.
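As a minimal, hypothetical walk through one such cycle in Python (the rule and names are made up for illustration):

# Hypothetical TDD cycle, shown as three snapshots in one file.

# 1. Write one test (it fails - there is no implementation yet).
def test_returns_fizz_for_multiples_of_three():
    assert label(9) == "fizz"

# 2. Implement just enough code to make it pass.
def label(n):
    if n % 3 == 0:
        return "fizz"
    return str(n)

# 3. Refactor so the code is clean (behavior unchanged, tests stay green),
#    then write the next failing test and repeat the cycle.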


6. Automated Acceptance Testing

Also known as Specification by Example. Specification by Example or Acceptance test-driven development (A-TDD) is a collaborative requirements discovery approach where examples and automatable tests are used for specifying requirements—creating executable specifications. These are created with the team, Product Owner, and other stakeholders in requirements workshops. I have written about a successful implementation of this technique within Actuarial Modeling.
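A hypothetical executable specification, written here as a plain Python test (teams often use dedicated tools such as FitNesse, Cucumber or Robot Framework for this), could capture an example agreed on in a requirements workshop. The domain rule and numbers below are assumptions for illustration only.

# Example agreed with the Product Owner: "Gold customers get a 10% discount."
# The domain code is a stand-in so the specification runs on its own.

def order_total(customer_level, items, unit_price):
    total = items * unit_price
    return total * 0.9 if customer_level == "gold" else total

def test_gold_customer_gets_ten_percent_discount():
    # 3 items at 40 each = 120; a gold customer pays 108.
    assert order_total("gold", items=3, unit_price=40) == 108

def test_regular_customer_pays_full_price():
    assert order_total("regular", items=3, unit_price=40) == 120

Because the examples are automated, they double as regression tests: the specification stays in sync with the system because it is executed against it.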


7. Chaos Engineering

Even when all of the individual services in a distributed system are functioning properly, the interactions between those services can cause unpredictable outcomes. Unpredictable outcomes, compounded by rare but disruptive real-world events that affect production environments, make these distributed systems inherently chaotic.

You need to identify weaknesses before they manifest in system-wide, aberrant behaviors. Systemic weaknesses could take the form of: improper fallback settings when a service is unavailable; retry storms from improperly tuned timeouts; outages when a downstream dependency receives too much traffic; cascading failures when a single point of failure crashes; etc. You must address the most significant weaknesses proactively, before they affect your customers in production. You need a way to manage the chaos inherent in these systems, take advantage of increased flexibility and velocity, and have confidence in your production deployments despite the complexity that they represent.

An empirical, systems-based approach addresses the chaos in distributed systems at scale and builds confidence in the ability of those systems to withstand realistic conditions. We learn about the behavior of a distributed system by observing it during a controlled experiment. This is called Chaos Engineering. A good example of this would be the Chaos Monkey of Netflix. 

8. Continuous Deployment

Continuous delivery is a series of practices designed to ensure that code can be rapidly and safely deployed to production by delivering every change to a production-like environment and ensuring business applications and services function as expected through rigorous automated testing. Since every change is delivered to a staging environment using complete automation, you can have confidence the application can be deployed to production with a push of a button when the business is ready. Continuous deployment is the next step of continuous delivery: Every change that passes the automated tests is deployed to production automatically. Continuous deployment should be the goal of most companies that are not constrained by regulatory or other requirements.



9. Microservices

The microservices architecture structures a single application as a set of small services. These services are independent of each other, and communication between them happens through well-defined interfaces that use a lightweight mechanism, for example a REST API. Each service is responsible for a single function, which maps microservices to business needs. Different frameworks or programming languages can be used to write microservices, and they can be deployed either as a single service or as a group of services.
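A minimal sketch of a single-purpose service, assuming Python with Flask installed; the service, endpoint and data are hypothetical:

# A hypothetical single-purpose service: it only answers price lookups and
# talks to the rest of the system exclusively through its REST API.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real service this data would live in the service's own datastore.
PRICES = {"apple": 0.40, "banana": 0.25}

@app.route("/prices/<product>")
def get_price(product):
    if product not in PRICES:
        return jsonify(error="unknown product"), 404
    return jsonify(product=product, price=PRICES[product])

if __name__ == "__main__":
    app.run(port=5000)

The important property is the boundary: other services only know the HTTP interface, so this service can be changed, redeployed or rewritten without touching them.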


10. Infrastructure as Code

With this practice, infrastructure is provisioned and managed using code and the same software development techniques used for applications, such as version control and continuous integration. Instead of manually setting up and configuring resources, engineers interact with infrastructure programmatically and at scale through the cloud's API-driven model. Because infrastructure is defined in code, it is treated like application code: environments and servers can be deployed quickly and to a fixed standard, and patches and new versions can be rolled out or replicated repeatably.
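As a heavily simplified, hypothetical sketch of the idea using the AWS SDK for Python (boto3): the AMI id, region and tags are placeholders, and declarative tools such as Terraform or CloudFormation go further by describing the desired end state rather than individual API calls.

# Hypothetical provisioning script kept in version control, so the same
# environment can be recreated on demand instead of configured by hand.
import boto3

ec2 = boto3.resource("ec2", region_name="eu-west-1")

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI id
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Environment", "Value": "acceptance"}],
    }],
)
print("Launched:", [i.id for i in instances])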


11. Configuration Management

Developers and system administrators automate the operating system, host configuration, operational tasks and so on with code. Because code is used, configuration changes become standardized and repeatable. This relieves developers and system administrators of the burden of configuring the operating system, system applications or server software manually.


12. Policy as Code

With infrastructure and its configuration codified in the cloud, organizations can monitor and enforce compliance dynamically. This enables automatic tracking, validation and reconfiguration of infrastructure, so changes to resources are controlled and security measures are enforced consistently across the organization. Resources that do not comply can be flagged automatically for further investigation or brought back into compliance automatically, which lets teams move faster.
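As a small, hypothetical sketch of the idea (real implementations typically rely on services such as AWS Config or tools such as Open Policy Agent), a policy check can be plain code that scans resources and flags the ones that do not comply:

# Hypothetical compliance check: every resource must carry an "owner" tag
# and must not expose port 22 to the world. Resources are stubbed in-line.
RESOURCES = [
    {"id": "web-1", "tags": {"owner": "payments"}, "open_ports": [443]},
    {"id": "db-1",  "tags": {},                    "open_ports": [22]},
]

def violations(resource):
    problems = []
    if "owner" not in resource["tags"]:
        problems.append("missing owner tag")
    if 22 in resource["open_ports"]:
        problems.append("SSH open to the world")
    return problems

for resource in RESOURCES:
    for problem in violations(resource):
        # In a real setup this would raise an alert or trigger remediation.
        print(f"{resource['id']}: {problem}")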


13. Monitoring and Logging

To gauge how the performance of applications and infrastructure affects users, organizations monitor metrics and logs. The data and logs generated by applications and infrastructure are captured, categorized and analyzed to understand how users are impacted by changes or updates, which makes it easier to trace the source of unexpected problems. Constant monitoring is necessary to ensure services remain available and to keep up the speed at which infrastructure is updated. When this data is analyzed in real time, organizations can monitor their services proactively.
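At its simplest, this starts with structured, machine-readable logs that can be shipped to a central platform and queried; a small, hypothetical Python example using the standard library:

# Hypothetical structured logging: each event is one JSON line, easy to ship
# to a central log platform and to query when something unexpected happens.
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout-service")

def log_event(event, **fields):
    log.info(json.dumps({"ts": time.time(), "event": event, **fields}))

log_event("order_placed", order_id="A-1001", amount=42.50, duration_ms=87)
log_event("payment_failed", order_id="A-1002", provider="cardcorp", retry=1)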


14. Communication and Collaboration

This is the key feature of any Agile and/or DevOps model: as development, test, security, and operations come together and share responsibilities, team spirit grows and communication improves. Chat applications, project tracking systems, and wikis are some of the tools that can be used not just by developers and operations, but also by other teams such as marketing or sales. This brings all parts of the organization closer together as they cooperate toward shared goals and projects.


Monday, August 07, 2017

Would Your Team Work With the Chaos Monkey?

Would your team work with the chaos monkey?
Advances in large-scale, distributed software systems are changing the game for software engineering. As an industry, we are quick to adopt practices that increase flexibility of development and velocity of deployment. An urgent question follows on the heels of these benefits: how much confidence can we have in the complex systems that we put into production?

Chaos Engineering

Even when all of the individual services in a distributed system are functioning properly, the interactions between those services can cause unpredictable outcomes. Unpredictable outcomes, compounded by rare but disruptive real-world events that affect production environments, make these distributed systems inherently chaotic.

We need to identify weaknesses before they manifest in system-wide, aberrant behaviors. Systemic weaknesses could take the form of: improper fallback settings when a service is unavailable; retry storms from improperly tuned timeouts; outages when a downstream dependency receives too much traffic; cascading failures when a single point of failure crashes; etc.  We must address the most significant weaknesses proactively, before they affect our customers in production.  We need a way to manage the chaos inherent in these systems, take advantage of increasing flexibility and velocity, and have confidence in our production deployments despite the complexity that they represent.

An empirical, systems-based approach addresses the chaos in distributed systems at scale and builds confidence in the ability of those systems to withstand realistic conditions. We learn about the behavior of a distributed system by observing it during a controlled experiment. This is called Chaos Engineering.

Chaos Monkey

Chaos Engineering was the philosophy when Netflix built Chaos Monkey, a tool that randomly disables Amazon Web Services (AWS) production instances to make sure you can survive this common type of failure without any customer impact. The name comes from the idea of unleashing a wild monkey with a weapon in your data center (or cloud region) to randomly shoot down instances and chew through cables — all the while you continue serving your customers without interruption. 

By running Chaos Monkey in the middle of a business day, in a carefully monitored environment with engineers standing by to address any problems, you can still learn the lessons about the weaknesses of your system, and build automatic recovery mechanisms to deal with them. So next time an instance fails at 3 am on a Sunday, you won’t even notice.

Chaos Monkey has a configurable schedule that allows simulated failures to occur at times when they can be closely monitored.  In this way, it’s possible to prepare for major unexpected errors rather than just waiting for catastrophe to strike and seeing how well you can manage.
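As a heavily simplified, hypothetical sketch of such a scheduled experiment (the tags, region and opt-in filter are made up; Netflix's actual Chaos Monkey is a far more complete tool), a script could pick one instance from a clearly scoped group and terminate it during business hours while engineers watch the dashboards:

# Hypothetical chaos experiment using boto3: terminate one random instance
# from an explicitly opted-in group, during monitored business hours.
import random
import boto3

ec2 = boto3.resource("ec2", region_name="eu-west-1")

candidates = list(ec2.instances.filter(Filters=[
    {"Name": "tag:chaos-opt-in", "Values": ["true"]},
    {"Name": "instance-state-name", "Values": ["running"]},
]))

if candidates:
    victim = random.choice(candidates)
    print(f"Terminating {victim.id} - verify that customers see no impact.")
    victim.terminate()
else:
    print("No opted-in instances found; nothing to do.")

The value is not in the script itself but in what it forces: redundancy, automatic recovery, and alerting good enough that a disappearing instance is a non-event.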

Chaos Monkey was the original member of Netflix’s Simian Army, a collection of software tools designed to test the AWS infrastructure. The software is open source (GitHub) to allow other cloud services users to adapt it for their use. Other Simian Army members have been added to create failures and check for abnormal conditions, configurations and security issues. 

An Agile and DevOps engineering culture doesn’t have a mechanism to force engineers to architect their code in any specific way. Instead, you can build strong alignment around resiliency by taking the pain of disappearing servers and bringing that pain forward. 

Most people think this is a crazy idea, but you can’t depend on the infrequent occurrence of outages to impact behavior. Knowing that this will happen on a frequent basis creates strong alignment among engineers to build in the redundancy and automation to survive this type of incident without any impact on your customers.

Would your team be willing to implement their own Chaos Monkey?




Thursday, July 13, 2017

No More User Stories! There Are Jobs to Be Done

No more user stories! There are jobs to be done
Creating new and better products that thousands or millions of customers actually love is a top priority for almost every company. Agile frameworks and techniques have been hailed as a solution for exactly this problem. It started with eXtreme Programming (XP), where a conversation between the development team and the customer is the base for all requirements.

And in theory this is the way to go, right? But there are two issues with this line of thinking. The first is that it does not work for start-ups and new products, where it is not yet clear who the customer is - the Lean Startup methodology from Eric Ries is built around this fact. And the second issue is something Clayton Christensen has discovered.

Clayton Christensen, the famed Harvard Business School professor known for coining the term “disruptive innovation”, believes that one of his most enduring legacies will be an idea he first put forward in his 2003 book "The Innovator’s Solution": don’t sell products and services to customers, but rather try to help people address their jobs to be done. This seemingly simple idea has profound implications for re-framing industries and products.

Never have companies known more about their customers. Thanks to the big data revolution, companies can now collect an enormous variety and volume of customer information, at unprecedented speed, and perform sophisticated analyses of it. Many firms have established structured, disciplined innovation processes and brought in highly skilled talent to run them. Direct access to customers is not a problem, and customers are more than willing to voice their opinions on social media or in interviews.

So why is it still so difficult to get products right, and actually have a market for them?

The fundamental problem is that most of the masses of customer data companies collect are structured to show correlations: this customer looks like that one, or 68% of customers say they prefer version A to version B. While it’s exciting to find patterns in the numbers, they don’t mean that one thing actually caused another. And though it’s no surprise that correlation isn’t causality, Clayton Christensen and his team suspect that most managers have grown comfortable basing decisions on correlations.

Why is this misguided? Consider my own case. I am 36 years old. I am 1.74m tall. My shoe size is 43. My wife and I have one child. I drive a Volvo XC60 and go to work by train. I have a lot of characteristics, but none of them has caused me to go out and buy my new Salomon trail running shoes. My reasons for buying these shoes are much more specific. I bought them because I need shoes to run in the mountains, and my previous pair no longer gave enough support after too many kilometers. Marketers who collect demographic or psychographic information about me—and look for correlations with other buyer segments—are not going to capture those reasons.

After decades of watching great companies fail, Christensen and his team have come to the conclusion that the focus on correlation—and on knowing more and more about customers—is taking companies in the wrong direction. What they really need to home in on is the progress that the customer is trying to make in a given circumstance—what the customer hopes to accomplish. This is what he has come to call the job to be done.

We all have many jobs to be done in our lives. Some are little (pass the time while waiting in line); some are big (find a more fulfilling career). Some surface unpredictably (dress for an out-of-town business meeting after the airline lost my suitcase); some regularly (pack a healthful lunch for my son to take to school). When we buy a product, we essentially “hire” it to help us do a job. If it does the job well, the next time we’re confronted with the same job, we tend to hire that product again. And if it does a crummy job, we “fire” it and look for an alternative. (I am using the word “product” here as shorthand for any solution that companies can sell; of course, the full set of “candidates” we consider hiring can often go well beyond just offerings from companies.)

Jobs-to-be-done theory transforms our understanding of customer choice in a way that no amount of data ever could because it gets at the causal driver behind a purchase.

Related articles:
Stop Wasting Money on FOMO Technology Innovation Projects
Project ≠ Product ≠ Business ≠ Company

If you haven't had the opportunity to listen to Prof. Clayton Christensen present the jobs-to-be-done concept, or to read his latest book "Competing Against Luck", please invest a few minutes in listening to his milkshake story.


So what has this to do with user stories?

In the Agile world we make the same mistake! We focus on personas and/or roles instead of the job that needs to be done. Another problem with user stories is that they contain too many assumptions and don’t acknowledge causality. When a task is put in the format of a user story (As a [persona/role], I want to [action], so that [outcome/benefit]), there’s no room to ask ‘why’ — you’re essentially locked into a particular sequence with no context. Alan Klement from https://jtbd.info/ puts it very nicely in an image.


Job stories slightly revise the format to be less prescriptive of a user action, and thereby give more meaningful information for the designer and developer to build for the user’s expected outcome.


Here are a few examples of the same feature story in user story and job story format:

User Story: As a subject matter expert, I want a badge on my profile when I am a top poster, so people know I am a top poster.

Job Story: When I am one of the top posters for a topic, I want it to show on my profile so that people know I am an expert in that subject.

User Story: As a visitor, I want to see the number of posts in a topic for another user, so I can see where their experience lies.

Job Story: When I visit someone's profile page, I want to see how many posts they have in each topic so that I have an understanding of where they have the most knowledge.

User Story: As a tax specialist who has used the application multiple times, I should get an alert to contribute.

Job Story: When I have used the application multiple times, I want to be nudged to contribute so that I am encouraged to keep contributing.

There is a very slight but meaningful difference between the two. By removing “As a __ ” from the user story, we remove any sort of biases that the team might have for that persona. Personas create assumptions about the users that might not be validated.

In my experience, user stories are easily manipulated into proposing a solution rather than explaining an expected outcome for that particular user. In particular, I’ve found people leave off the “so that __” in a user story with the feeling that it is optional. This drops the benefit the user would get from the new functionality.

In a job story we replace “As a __ ” with “When __ ”. This gives the team more context for the user’s situation and allows us to share his or her viewpoint. Next, the “I want to __” is transformed into situational motivation in the job story, as opposed to a prescriptive solution for a persona in the user story.

Because the differences in wording are small, shifting from writing user stories to job stories is an easy transition. By placing the user’s situation upfront, you get a better understanding of how it feels to be in the user’s shoes, as opposed to thinking about a particular persona. This allows for more discussion of the expected outcome and how best to achieve that outcome for the user, which in the end results in a better product.

In a nutshell: Jobs-to-be-done theory transforms our understanding of customer choice in a way that no amount of data ever could because it gets at the causal driver behind a purchase.


Sunday, July 02, 2017

Kanban - Card, Board, System or Method?

Kanban - Card, Board, System or Method?
Kanban is a term that can mean many things. Two people talking about Kanban usually have different levels of understanding and thinking about it. Is it a card, a board, a manufacturing system or a software development method? The short answer is – Kanban is all of them. Kanban literally means both card and board. The word ‘kanban’ can be written in both hiragana (the Japanese syllabary) and kanji (Chinese characters used in Japanese). Written in hiragana it refers to a ‘signal card’, while in kanji it means a ‘sign’ or ‘large visual board’.

Kanban System

Beyond the etymology, ‘Kanban’ as a concept was popularized by Toyota in the 1940s, which took inspiration from how supermarkets stock their shelves and promoted the idea of Just-in-Time manufacturing – using ‘Kanban cards’ as a signal between two dependent processes to facilitate a smoother – and just-in-time – flow of parts between them. With time, the idea of Kanban evolved to be more than just a signal card. First in the manufacturing world, and now in the IT industry, a ‘Kanban system’ is characterized by two key features:

1. Visualization of work items – using signal cards, or some other means.
2. A pull-based system, where work is pulled by the next process, based on available capacity, rather than pushed by the previous process.
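To make the pull idea concrete, here is a tiny, hypothetical model of a Kanban system in Python: work only moves into a stage when that stage has capacity left, so downstream stages pull work rather than having it pushed onto them. The column names and limits are made up for illustration.

# Hypothetical model of a pull-based Kanban system with WIP limits.
WIP_LIMITS = {"todo": None, "develop": 3, "test": 2, "done": None}
board = {"todo": ["A", "B", "C", "D", "E"], "develop": [], "test": [], "done": []}

def pull(board, source, target):
    """Move one item into `target`, but only if its WIP limit allows it."""
    limit = WIP_LIMITS[target]
    if not board[source]:
        return False
    if limit is not None and len(board[target]) >= limit:
        return False  # the target stage is full: a signal to swarm, not to push
    board[target].append(board[source].pop(0))
    return True

pull(board, "todo", "develop")
pull(board, "todo", "develop")
pull(board, "develop", "test")
print(board)

When a stage hits its limit, the refusal to accept more work is the signal: the team finishes what is in progress instead of starting something new.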

A team that uses a Kanban system to track and manage the flow of work will often use a board to visualize the items that are in progress. Such a board is called a ‘Kanban board’. Those practicing Scrum may think of a Scrum board as a simplified version of a Kanban board. See an example of such a board below.

Kanban Board example

Kanban Method

‘Kanban Method’ is a term coined and popularized by David J Anderson who, over the past ten years, has evolved the Kanban concept into a management method to improve service delivery and evolve the business to be ‘fit for purpose’. It is not a project management method or a process framework that tells you how to develop software, but a set of principles and practices that help you pursue incremental, evolutionary change in your organization.

In other words, it will not replace your existing process, but evolve it to be a better ‘fit for purpose’ – be it Scrum or waterfall. The six key practices outlined in the Kanban Method include:

1. Visualize your work
2. Limit work-in-progress
3. Measure and manage flow
4. Make policies explicit
5. Implement feedback loops
6. Improve collaboratively, evolve experimentally

While the idea of Kanban has evolved from a signal card to a management method, its emphasis on visualization and pull-based work management has remained intact.
