Thursday, May 16, 2019

Project managers deliver projects; project sponsors deliver business value!

As a project sponsor, you are ultimately accountable to the organization for the delivery of the business outcomes and benefits. The project team and steering committee exist to help you deliver the outcomes and realize the benefits.

You can take a passive role — attending your steering committee meetings, reviewing progress reports, meeting with the project manager — and not bother with the details or even really know what exactly is going on with your project.

Alternatively, you can take the project sponsor role seriously, and actively work to ensure the success of your project. For active project sponsors, here are some guiding principles:

1) Remember you’re not accountable for bringing in the project on time and on budget. That’s the project manager’s job. You’re accountable for delivering the business outcomes, benefits, and (net) value.

2) Project outcomes are not business outcomes. A project may deliver a system, but a system on its own doesn’t equal business value. The business wants to use the system to improve how it does business, competes, and makes money. This is a different outcome, one that delivers real business value. Look at what your project investment is going to deliver — is it a project outcome that the business hopes it will make work, or a business outcome that will deliver real, measurable business value?

3) Your project is set up to implement change, whereas the business is set up not to change. For the project to be successful it must interrupt business-as-usual and change it. To achieve this, it needs the help, support, and authority of you and your steering committee. No business change = no business benefits = you fail as project sponsor.

4) Who you have on your project determines the outcomes and results. Two project teams given the same problem will produce two different solutions. Choose your project team wisely and well. It is worth putting significant time into the project resourcing step. This is especially true for choosing your project manager.

5) You can’t focus on everything, so you should focus on three things in particular:

i) The project RAID lists (Risks, Assumptions, Issues, and Decisions): Unresolved issues that must be resolved before implementation are an unknown, unplanned workload (and cost); if the issues log explodes, so can your project timeline and budget. Risks are the threats to the project and its successful delivery; a series of new risks late in the project can threaten the viability of the entire project.

ii) Your critical path (or chain): How you are tracking to this path will determine your likelihood of an on-time and on-budget delivery.

iii) Your project’s value: Value creep occurs when the benefits of a project progressively go down while the costs increase. The result is a net reduction in project value, often resulting in a move from positive to negative returns.

6) Be present. Visit the project team at least once a month. Visit the business areas being impacted by your project — what do they think about the project? Visit your key stakeholders regularly — are they still supportive? You need to be seen to be leading, committed and involved. If you are losing business support, you want to be the first to know so you can take action before it is too late.

7) Learn to do your job as a sponsor. Project sponsorship is not intuitive or a natural extension of line or operational management. It is a different set of skills and knowledge base, one that has to be learned. You wouldn’t want amateurs on your project, so why impose one at the top?

8) Watch for signs of trouble — the little changes that sneak up on you. These include unplanned employee turnover that loses cumulative project knowledge, cumulative scope changes that redirect the whole project, and poor quality outputs that indicate the project may be in trouble. With your broader perspective, you need to review these leading indicators of project failure.

Closing thoughts

A great executive sponsor is an active sponsor. It takes time and effort, but if your project is not worth that effort, why do the project at all?


Monday, May 13, 2019

How value creep is killing your project

As a project sponsor or steering committee member you are probably familiar with scope creep. Sometimes known as “requirement creep” or even “feature creep,” the term refers to how a project’s requirements tend to increase over the project lifecycle. For example, what once started out as a single deliverable becomes five, or a product that began with three essential features now must have ten.

Scope creep is typically caused by key project stakeholders (like yourselves) changing requirements, or sometimes from internal miscommunication and disagreements. But while scope creep is a problem for many projects, it is nothing compared to the far more devastating value creep.

Value creep is when the benefits of a project progressively go down while the costs increase. The result is a net reduction in project value, often resulting in a move from positive to negative returns.

And it’s pervasive. In fact, most projects are beset by scope changes, unforeseen events, and time and cost overruns that represent this value creep.

I have sat in numerous steering committee meetings and listened as decisions are made, usually on the recommendation of the project manager, that progressively reduce the value of the project.

For example, one project sponsor stated that one of his main goals was to ensure the new system was based on a platform that was an industry standard and widely used in other industries as well. The reason behind this was to avoid the trouble finding skilled employees that had plagued the system it needed to replace.

He then went into his steering committee meeting and immediately acceded to his project manager’s statement that the new system should be based on a platform that was already in use in the organization. It was even harder to find skilled employees for this platform than for the system it needed to replace, but somehow this was ignored.

Not surprisingly, the project costs exploded, and it failed to deliver the benefits expected. The project manager didn’t mind; he had brought the project in on time and to (his) specification. The organization then had to put up with an ill-fitting solution for years.

As a sponsor or steering committee member you need to be constantly conscious of value creep. These decisions, often made piecemeal over time, cumulatively increase the cost and decrease the value.

The graph below visualizes a fictional 10-month project similar to real-life projects I have witnessed. The project starts with a clear value proposition: $9M in benefits against $4M in costs.


For the first two months, everybody is convinced it will stay like this, and then a part is descoped to save costs and keep the project within budget. This reduces the benefits by $2M. Meanwhile, the costs start to go up (as with most technology projects). After five months it becomes clear that the system cannot automate a number of things that had been assumed or promised without putting in an additional two months of work. The sponsor and steering committee want to keep the timeline, so they decide against the extra work. Boom, another $2M reduction in benefits. And from this point on, the project has a negative net value.
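To make the arithmetic concrete, here is a minimal Python sketch of that timeline (not from the original post). The $1.5M of cost growth is an assumption chosen for illustration; the post only says that costs start at $4M and rise.

```python
# Net value of the fictional 10-month project at each decision point.
# The cost growth figure (+$1.5M) is assumed for illustration only.

benefits, costs = 9.0, 4.0  # $M at project start
events = [
    ("Month 2: part descoped to stay within budget",            -2.0, 0.0),
    ("Months 2-5: cost growth typical of technology projects",    0.0, 1.5),
    ("Month 5: promised automation dropped to hold the timeline", -2.0, 0.0),
]

print(f"Start: benefits {benefits:.1f}M, costs {costs:.1f}M, net {benefits - costs:+.1f}M")
for label, delta_benefits, delta_costs in events:
    benefits += delta_benefits
    costs += delta_costs
    print(f"{label}: net value {benefits - costs:+.1f}M")
```

With these numbers the net value slides from +$5M at the start to roughly -$0.5M after the second descoping decision, which is exactly the move from positive to negative returns described above.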

Loss of benefits is usually a far greater long-term loss than a (reasonable) cost overrun. One way of fighting value creep is to constantly focus on protecting the value, refusing to compromise or harm the project’s value proposition.

To do this you need to understand which parts of your project deliver the value; otherwise, you won’t know what value dimensions you are dealing with when you have to make decisions.

A good question to ask yourself is: Do you know how each major element of promised business value is going to be delivered on your current project? If not, you’ve got some work to do.

Once you know the answer to this question, you’re one step ahead in eliminating value creep. While delivery costs on a project may rise, if you keep your focus on maintaining your project’s value, you will deliver business value in the end.


Tuesday, April 30, 2019

Decision-making problems

When I consult with executives, I sometimes start with this very simple exercise. I ask group members to come to our first meeting with a brief description of their best and worst decisions of the previous year.

I have yet to come across someone who doesn’t identify their best and worst results rather than their best and worst decisions. This drawing of an overly tight relationship between results and decision quality affects our decisions every day, potentially with far-reaching, catastrophic consequences. Poker players even have a word for this: “resulting.”

While good decisions can have a bad outcome, bad decisions have a far higher probability of a bad outcome than good decisions. You can compare decision-making when you don’t have all the facts to making a bet.

According to professional poker player Annie Duke, “Thinking in bets starts with recognizing that there are exactly two things that determine how our lives turn out: the quality of our decisions and luck. Learning to recognize the difference between the two is what thinking in bets is all about.”

Poker is a game of incomplete information. It is a game of decision-making under conditions of uncertainty over time. In poker, valuable information remains hidden. There is also an element of luck in any outcome. You could make the best possible decision at every point based on the information you have and still lose the hand, because you don’t know what new cards will be dealt and revealed. Once the game is finished, you can try to learn from the results. But separating the quality of your decisions from the influence of luck is difficult.

In chess, outcomes correlate more tightly with decision quality. In poker, it is much easier to get lucky and win, or get unlucky and lose. If life were like chess, nearly every time you ran a red light you would get in an accident (or at least receive a ticket).

So now that we have established that we need luck and quality decision-making to get our project to succeed, let's focus on the latter.

Is it a decision or is it a problem?

One of the first decision-making problems you face—often without realizing it—is to decide whether you have a problem to solve or a decision to make.

Time can be wasted and people frustrated if you resort to setting up a problem-solving team when really a decision simply needed to be made.

Alternatively, living with a decision that was made when it wasn’t clear why something had gone wrong (that is, you had a problem to solve first before you could make a decision) can be just as costly.

Decision-making problems often arise because you aren’t clear whether you have a problem to solve or a decision to make.

Avoiding this problem can be easy. A simple approach is to determine whether there is something wrong, or something you are dissatisfied with that you know needs to change. If there is, and you know why something is wrong and there are clear approaches to take or alternatives to choose, then you have a decision to make. You can look forward and act. You make a bet.

It is only when you have a situation where it is not clear what should be done, that you then have a problem to solve. In this case, you must first work on understanding the problem and define potential alternatives or approaches to solve the problem. Then, you must make a decision based on these alternatives.
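For readers who like their heuristics compact, the distinction can be restated in a few lines of Python; the wording of the two outcomes is mine, not a formal method.

```python
def decision_or_problem(know_why_something_is_wrong: bool, clear_alternatives: bool) -> str:
    """Restate the heuristic above: decide now, or solve the problem first."""
    if know_why_something_is_wrong and clear_alternatives:
        return "Decision to make: choose between the alternatives, act, and place your bet."
    return "Problem to solve: first understand the problem and define alternatives, then decide."

print(decision_or_problem(know_why_something_is_wrong=True, clear_alternatives=True))
```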

So now that we know when to make a decision and when to solve a problem, let’s have a look at some other reasons why decision-making is done badly so often.

Common decision-making mistakes

Below are a number of observations I have made in the last few years regarding decision-making at steering committees and executive management meetings.

> Key decisions (e.g., strategic, structural or architectural) are made by people who lack the subject-matter expertise to be making the decision.

> Expert advice is either ignored or simply never solicited.

> Lack of “situational awareness” results in ineffective decisions being made.

> Failure to bring closure to a critical decision results in wheel-spinning and inaction over extended periods of time.

> Team avoids the difficult decisions because some stakeholders may be unhappy with the outcome.

> Group decisions are made at the lowest common denominator rather than facilitating group decision-making towards the best possible answer.

> Key decisions are made without identifying or considering alternatives. The first option wins.

> Decision fragments are left unanswered, resulting in confusion. In other words, parts of the who, why, when, where and how components of a decision are made, but others are never finalized. See “Many decisions are no decisions (and this makes projects difficult)”.

> Failure to establish clear ownership of decisions or the process by which key decisions will be made results in indecision and confusion.

Conclusion

Identifying that a decision needs to be made, rather than a problem solved, is the first important step in avoiding decision-making problems. But once it’s been determined that a decision needs to be made, things can still go wrong. All of the above observations can be attributed to three root causes of bad decision-making:

1) We don’t involve the key people who should be involved.

2) We don’t generate enough alternatives upon which to base our choice of decision.

3) We don’t follow recognized and proven decision-making processes.

So do the opposite. Involve the right people, evaluate alternatives, and follow a proven decision-making process.

What makes a decision great is not that it has a great outcome. A great decision is the result of a good process, and that process must include an attempt to accurately represent our own state of knowledge. That state of knowledge, in turn, is some variation of “I’m not sure.”

So just like in poker, base your bet on the information you have, and you just may end up with a winning hand.


Thursday, April 25, 2019

Why your projects should be short and fat (and how to get them that way)

Project portfolio management is not necessarily complex. The goals are clear and simple.

1) Maximizing the value of your portfolio

2) Seeking the right balance of projects (risk vs. reward, run vs. change, etc.)

3) Creating a strong link to your strategy

4) Doing the right number of projects

Achieving these goals, on the other hand, is not such an easy task.

If we cut to the core, project portfolio management is only about two things:

Overview and decisions.

It is not difficult to obtain an overview of your project portfolio. At least not the simple overview, which is often sufficient. The hard part is making the tough decisions.

The vast majority of projects are, in isolation, good business ideas, but it is just not possible to pursue them all at the same time. The capacity of your organization to do projects is limited. Thus, good decision-making requires turning down good projects.

Many organizations have realized that a good approach to this is aiming for a project portfolio of short and fat projects. Short and fat projects imply that the company runs a small number of short projects in parallel, armed with sufficient resources.

The alternative is running many long and thin projects concurrently, which means that the organization’s resources are spread insufficiently between many parallel projects that are having a hard time crossing the finishing line. Portfolios consisting of long and thin projects are what we find in most organizations.

The underlying concept is visualized in the diagram below.


Many organizations have nodded approvingly and bought into the logic of this diagram, favoring short and fat projects over long and thin. However, despite this general agreement, the principle of short and fat is only very rarely implemented. It seems as if we often acknowledge the logic of the principle, but do not perceive it to be sufficiently relevant to our own situation.

Why is that? In my experience, it comes down to five factors: no responsibility at the portfolio level, no attention to throughput, no bottleneck handling, an unclear strategy, and a false belief in equality. Below, I go into further detail about each of these factors and show how each can be shifted to support the short-and-fat project mindset in your organization.

Assign responsibility

It is necessary that someone assumes responsibility at the portfolio level. Each project owner has his primary interest in succeeding with his own project and, at most, a secondary interest in his colleagues succeeding with their projects elsewhere in the organization. Which is, by the way, quite natural.

Carrying out successful projects is a difficult task, and project sponsors, project owners, project managers and key project participants must be passionate about the project and fight for it with blood, sweat and tears. That is their mission – and they need to have a somewhat single-minded focus on the project to succeed.

Only by giving a person or a group of persons responsibility and targets at portfolio level will it be possible to make objective decisions about projects. The task is still difficult. The game of resources and prioritization between projects is perhaps one of the most heated and important games in many organizations, and those responsible for the portfolio are placed in the middle of this game.

Pay attention to portfolio throughput 

You have to consider project portfolio throughput from a cost-of-delay point of view. You might, for example, be able to complete a project with perfect resource management (all staff are perfectly busy) in 12 months for $1 million.

Alternatively, you could decide not to do another project and assign those people to this one as well, so you can complete it in only six months. Or you could hire some extra people, accept that they will occasionally sit idle, and realize both projects in six months at a total cost of $1.5 million.

What is that six months’ difference worth? Well, if the project is strategic in nature, it could be worth everything. It could mean being first to market with a new product or possessing a required capability for an upcoming bid that you don't even know about yet. It could mean impressing the heck out of some skeptical new client or being prepared for an external audit. There are many scenarios where the benefits of completing a project quickly outweigh the cost savings of drawing it out.

In addition to delivering the project faster, when you are done after six months instead of 12 months you can use the existing team for a different project, delivering even more benefits for your organization. So not only do you get your benefits for your original project sooner and/or longer, you will get those for your next project sooner as well because it starts earlier and is staffed with an experienced team.
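One rough way to compare the staffing options above is to value every month the project is live over a common planning horizon. The monthly benefit figure and the 24-month horizon in this sketch are assumptions for illustration only.

```python
# Compare "12 months, perfectly utilized staff" with "6 months, extra hires
# who sometimes sit idle" over the same planning horizon.

monthly_benefit = 0.2  # $M of benefit per month once the project is live (assumed)
horizon = 24           # months over which both options are compared (assumed)

options = {
    "12 months for $1.0M (perfect utilization)": (1.0, 12),
    "6 months for $1.5M (extra hires, some idle time)": (1.5, 6),
}

for name, (cost, duration) in options.items():
    months_live = horizon - duration
    net = months_live * monthly_benefit - cost
    print(f"{name}: {months_live} months of benefits, net value {net:+.1f}M")
```

Under these assumptions the faster, more expensive option comes out ahead (+$2.1M versus +$1.4M), before even counting the follow-on project the freed-up team can start six months earlier.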

Prioritizing based on the effect of projects will result in better decisions.

Improve bottleneck handling

Spreading one’s scarce resources between too many projects is damaging to the bottom line – and to the scarce resources. The resources in question can be cash or machines, but usually the limiting factor is your key people.

Usually, you can easily identify 5–10 people who are the most sought-after by your organization’s project managers, and unfortunately, they’re typically allocated to too many projects at the same time.

Following the theory of constraints, these are the people who also determine the progress in your portfolio. It could be their decision-making, or it could be the way they’re prioritized. Either way, they are your bottlenecks.

The handling of these people, and how brave you are in your decisions regarding them, is thus an important factor in project portfolio management. Not only are your key people less effective when they have to juggle many projects at the same time, their motivation also drops because of it, making it even harder to get things done.

Protecting and deliberately assigning projects to your key people is essential.

Have a clear strategy

Michael Porter states in his influential book "Competitive Strategy" that an organization creates a sustainable competitive advantage over its rivals by "deliberately choosing a different set of activities to deliver unique value." Therefore, strategy requires making explicit choices.

Lafley and Martin define strategy in their book "Playing to Win: How Strategy Really Works" as an integrated set of choices that uniquely positions the organization (which can be a company, a department, or a business unit) in its industry so as to create sustainable advantage and superior value relative to the competition.

It is natural to want to keep options open as long as possible, rather than closing off possibilities by making explicit choices. However, it is only through making and acting on choices that you can win. Yes, clear, tough choices force your hand and confine you to a path. But they also free you to focus on what matters.

When you have no clear strategy, it is impossible to select the right projects for your portfolio and execute that strategy.

Stop believing in equality

The notion of equality is poisonous to an efficient project portfolio. It does not make any sense that all projects that are good ideas – or equally good ideas – should be treated the same and be allocated the same amount of resources or be initiated at the same time.

Similarly, it does not make any sense to attach the same weight to all organizational areas or project types in a portfolio based on a principle of equality and justice. As a whole, the organization will lose on this. Trade-offs have to be made, even though it hurts.

Conclusion

Remember two numbers: two (the maximum number of concurrent projects per project participant) and five (the maximum number of must-win battles).

Surveys of the efficiency of project members show – both logically and mathematically – that it is best for your project portfolio and its progress if all project members are only allocated to a maximum of two projects at the same time. This outcome is supported by those who are allocated to too many concurrent projects and whose time is inefficiently spread between these – just ask them.

Another number to remember is IMD Business School’s rule of thumb for how many significant strategic initiatives – the so-called must-win battles – a management team should launch at the same time. This number is five. And those five should be held on to until they are fully implemented.

When it comes to organizational development and strategic initiatives, the scarce factor includes intangibles such as the management team’s total amount of attention and the organization’s overall ability to change. These are difficult to sum up in figures, and, thus, it is important to have the courage to go for the short and fat approach, even though it can’t always be proven mathematically.

Challenge yourself to not allocate project workers to more than two concurrent projects, and not initiate more than five strategic and important initiatives at the same time. It is an experiment worth trying.
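If you want to make the experiment tangible, a few lines of Python are enough to flag breaches of the two rules of thumb; the people, projects, and numbers below are made up for illustration.

```python
# Flag people on more than two concurrent projects and portfolios with more
# than five must-win battles. All names and assignments are illustrative.

assignments = {
    "Alice": ["CRM replacement", "Data migration", "Mobile app"],
    "Bob":   ["CRM replacement", "Process automation"],
}
must_win_battles = ["CRM replacement", "New market entry", "Data migration",
                    "Mobile app", "Process automation", "Outsourcing transition"]

for person, projects in assignments.items():
    if len(projects) > 2:
        print(f"{person} is allocated to {len(projects)} concurrent projects (max: 2)")

if len(must_win_battles) > 5:
    print(f"{len(must_win_battles)} must-win battles launched at the same time (max: 5)")
```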


Wednesday, April 17, 2019

The biggest mistake project managers make with project cost management

The biggest mistake project managers make with project cost management is not doing monthly forecasting and controlling.

Project cost management is nothing more than doing the following three things every month:

1) Cost Estimating
2) Cost Budgeting
3) Cost Controlling

You start your project with an approved budget. When you are lucky, you created the budget yourself based on a combination of bottom-up and top-down estimations and it got approved by the project sponsor. When you are not so lucky, you inherited a budget created by somebody else, or you just got less money than you budgeted.

The first step is taking this budget and dividing it into meaningful spending categories. The first split I always make is between internal costs and external costs. External costs are cash out, and they’re handled differently by your company than internal costs. The rest of my splits depend on project type and size, for example: project management, technology management, change management, training and travel costs. In theory this would give me a total of 10 (2 x 5) spending categories, but since travel costs are always external costs it would be effectively nine categories.

It’s important to verify that you can map your actual booked costs to these categories. This means, for internal costs, that people working on a project have different booking codes for the different categories, or you can map all the hours of one person to one category. For external costs it means that you have different booking codes for each category, or you map them manually.

Now that you have your categories, create a simple spreadsheet with a table containing a row for each category and a column for each month of the planned duration of your project. Before your project has started, fill each column with the forecasted costs per month per category. If your project has already been running for a while, enter the actual booked costs per category for the months that have passed and forecast values for the rest.

When you have done this, you can add the following six columns to the end of the table.

> Total Budget: Here you manually input the budget you have per category.

> Total Actuals: This is the sum of all booked costs per category.

> Total Forecast: This is the sum of all forecasted costs per category.

> Projected Costs: This is the sum of Total Actuals + Total Forecast per category.

> Budget – Actuals: This gives you the amount that you have left of your budget per category.

> Budget – Projected Costs: This gives you a good indication of whether your budget is enough to realize the project.

Now add one row to your table that just sums up all the rows above, so you have the information per category as well as on a project level.

This spreadsheet is all you need as a tool to start effective project cost management.
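For readers who prefer code to spreadsheets, here is a minimal sketch of the same table built with Python and pandas. The categories, budget, and monthly figures are examples only, not a prescription.

```python
# Minimal cost-management table: categories x months plus the six summary columns.
import pandas as pd

months = [f"2019-{m:02d}" for m in range(1, 11)]  # a 10-month project (example)
categories = ["Project mgmt (int)", "Technology (int)", "Technology (ext)",
              "Change mgmt (int)", "Training (ext)", "Travel (ext)"]

# Start with a flat forecast of $10k per category per month (example figures),
# then pretend the first three months are booked actuals, 10% over forecast.
table = pd.DataFrame(10.0, index=categories, columns=months)
table[months[:3]] *= 1.10

booked, forecast = months[:3], months[3:]
table["Total Budget"]       = 110.0
table["Total Actuals"]      = table[booked].sum(axis=1)
table["Total Forecast"]     = table[forecast].sum(axis=1)
table["Projected Costs"]    = table["Total Actuals"] + table["Total Forecast"]
table["Budget - Actuals"]   = table["Total Budget"] - table["Total Actuals"]
table["Budget - Projected"] = table["Total Budget"] - table["Projected Costs"]

table.loc["TOTAL"] = table.sum()  # project-level row
print(table.round(1))
```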

No matter your starting point and categories, from now on you have to do the following four things each and every month until the project is officially closed.

1) Get all the costs that are booked on your project from your finance team. Put these numbers in the month they are booked. Remember that costs are booked later than they are incurred.

2) Add so-called accruals to your forecast. If Supplier A worked 36 man-days on your project in April, add these as accruals to the forecast for May. If they still have not been booked by then, move the accruals to the forecast for June, and so on until the final invoice is booked and the amount shows up in your actual booked costs. This is an essential part of cost controlling: if you skip it, you will always be missing around two months of external costs from your overview and will be very surprised at the end of your project (see the sketch after this list).

3) Update your forecast based on what you have learned this month about the project and update these numbers in the spreadsheet.

4) Review the new numbers in the last six columns and take the necessary actions based on them.
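The sketch below extends the table built earlier and shows what one monthly cycle can look like in code, including rolling unbilled supplier work forward as an accrual (steps 1 and 2). The judgment calls in steps 3 and 4 remain with you; the figures, categories, and helper name are assumptions for illustration.

```python
# One monthly cost-control cycle applied to the table sketched above.

def monthly_update(table, months, month, booked, accruals):
    """Book actuals for `month`, roll accruals into next month's forecast,
    and refresh the summary columns that steps 3 and 4 are reviewed against."""
    i = months.index(month)
    # 1) Overwrite this month's forecast with the actuals booked by finance.
    for category, amount in booked.items():
        table.loc[category, month] = amount
    # 2) Add accruals (work performed but not yet invoiced) to next month's forecast.
    for category, amount in accruals.items():
        table.loc[category, months[i + 1]] += amount
    # Refresh the summary columns and the project-level row for review.
    table["Total Actuals"]      = table[months[: i + 1]].sum(axis=1)
    table["Total Forecast"]     = table[months[i + 1:]].sum(axis=1)
    table["Projected Costs"]    = table["Total Actuals"] + table["Total Forecast"]
    table["Budget - Actuals"]   = table["Total Budget"] - table["Total Actuals"]
    table["Budget - Projected"] = table["Total Budget"] - table["Projected Costs"]
    table.loc["TOTAL"] = table.drop(index="TOTAL").sum()
    return table

# Example: April's external costs are booked; Supplier A's unbilled 36 man-days
# (say $18k) go into May's forecast as an accrual.
# table = monthly_update(table, months, "2019-04",
#                        booked={"Technology (ext)": 14.0},
#                        accruals={"Technology (ext)": 18.0})
```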

When it comes down to it, project cost management is nothing more than two things: the discipline of the project manager to break down the spending categories, create an effective spreadsheet, and carry out the cost estimating, cost budgeting, and cost controlling steps described above every single month; and the ability of the organization to produce the input needed. The better the quality of the input, the better the quality of your cost controlling.


Tuesday, April 09, 2019

Case Study: How a screwed-up SAP implementation almost brought down National Grid

National Grid USA (NGUSA), which is part of the UK-based National Grid plc, supplies electricity and gas in Massachusetts, New York, and Rhode Island. It is one of the largest investor-owned power distribution companies in the U.S. In October 2012 the company faced a very difficult decision.

The SAP project, three years in the running and marred by delays and budget overruns, was scheduled to go live on November 5. Failure to go live meant a delay of another five months, likely another $50 million in additional spending, and a trip back to the Utilities Rate Commission to request approval to pay for the overruns.

At the same time, Hurricane Sandy was pounding up the East Coast. By mid-October, forecasts for damage in NGUSA’s service area were extensive. For a utility company, power restoration takes precedence over everything else after a hurricane.

NGUSA had to know they had a bumpy ride coming when they made the decision to go live. What they clearly didn’t understand was just how bumpy the ride would be.

In the weeks that followed the go-live, while the NGUSA crews were working tirelessly to restore power, the SAP project team was just beginning to understand the full extent of the damage being caused because of the screwed-up SAP implementation. The problems spanned many areas, including:

Payroll: The new SAP system miscalculated time, pay rates, and reimbursements, so that employees were paid too little, too much, or nothing at all. Overpayments of more than $6 million were made to employees and never recovered. Another $12 million was paid in settlements to employees related to short pays and deductions, and the generation of W-2s and other tax reports was delayed.

Vendor payments: Just two months after go-live, NGUSA had 15,000 vendor invoices that they were unable to process for payment, inventory record-keeping was in shambles, and vendors were being issued payments with the understanding that reconciliation would take place later.

Financial reporting: Prior to go-live it took NGUSA four days to close its financial books. Following go-live the close took 43 days. The financial reporting was so bad that NGUSA lost access to short-term borrowing vehicles because it could not provide satisfactory financial reports.

To deal with the many accounting, payroll, and supply-chain issues it was grappling with, NGUSA launched a stabilization program. Initially, 300 contractors were brought in to assist with payroll issues, a number that eventually grew to 450. Another 200 contractors were brought in to assist with supply-chain issues, and 200 more to support the financial close.

The first priority of the stabilization effort was to ensure that NGUSA could comply with its obligations, including:

> Paying employees accurately and on time
> Paying vendors accurately and on time
> Providing legal, regulatory, and other reports to external stakeholders that are accurate and timely

The team’s second stabilization priority was to enable NGUSA to be efficient and self-sufficient in operating the SAP system and realize the benefits the system can provide without significant reliance on external support.

In September 2013, the continuing effort to stabilize SAP was anticipated to cost about $30 million per month.

The problems were so profound that the cleanup took more than two years to complete with a calculated cost of $585 million, more than 150 percent of the cost of implementation.

The journey

The journey to the decision to go live on November 5 was not unlike that experienced by other companies that have followed a similar path. In 2007, NGUSA had finalized a major acquisition, making it one of the largest investor-owned power distribution companies in the U.S.

This acquisition left the company with two sets of financial and operating systems. Capturing the synergies of combining these systems and adopting new sets of business processes were key components of the justification for the project. The project was also viewed as a method to allow NGUSA to address significant audit deficiencies in its financial business processes.

The project to upgrade NGUSA's legacy systems – many of which were running on Oracle – began in 2008. In mid-2009, NGUSA hired Deloitte as its systems integrator and set a project budget of $290 million that was submitted and approved by the Utilities Rate Commission.

Deloitte was initially employed as the lead implementation partner, project manager and systems integrator, but in June 2010 it was replaced by Ernst and Young (EY) in the first two roles, and by Wipro as systems integrator. The main reason for this switch was to lower implementation costs.

The program operated with a target go-live date of December 2011. This date was later moved to July 2012, followed by October 2012, and then a November 2012 target date. The final sanctioned estimate of the project was set at $383 million, nearly 30 percent beyond the original target budget that was approved by the board.

NGUSA continued to engage Wipro after go-live to make the necessary fixes to the installed SAP system, tolling their agreement (that is, extending the statute of limitations for filing suit). In many instances, functional and technical specifications had to be completely rewritten and entire SAP modules had to be rebuilt or abandoned.

On November 30, 2017, NGUSA filed a lawsuit against Wipro in the U.S. District Court Eastern District of New York. The lawsuit notes that NGUSA was unable to file suit against EY due to the language of their contract. The suit alleges that Wipro fraudulently induced NGUSA into signing the original agreements. NGUSA claims that Wipro misrepresented its SAP implementation capabilities, talent, and knowledge of the U.S. utilities business operations and common practices.

As Wipro knew or should have known, it had neither the ability nor intent to assign appropriately experienced and skilled consultants to the Project because... it in fact had virtually no experience implementing an SAP platform for a U.S.-regulated utility. – National Grid USA

Besides this, the suit alleges that Wipro breached its contract with NGUSA by:

> Failing to prepare design documents and specifications to industry standards.
> Failing to prepare programming and configuration to industry standards.
> Failing to adequately test, detect, and inform of problems.
> Failing to advise that the system was not ready to go live.
> Breaching express and implied warranties by not providing consultants consistent with a top 25 percent SAP implementation firm.
> Negligently misrepresenting itself for the same reasons identified in the first cause of action.
> Violating New York’s General Business Law for deceptive practices.

NGUSA was seeking damages in the form of relief of all contractual obligations, restitution of all amounts paid to Wipro, damages associated with a poor go-live, punitive damages, and attorney's fees and costs associated with the lawsuit.

On June 1, 2018, Wipro filed a motion to dismiss three of the five causes of action, among them the claims that it fraudulently misrepresented its capabilities and of negligent misrepresentation. Wipro claims that in its response to the NGUSA RFP it identified that it had a well-established SAP practice, had installed SAP globally for utilities, and had a long-running relationship with U.S. utilities. There was no explicit statement indicating that Wipro had not completed SAP implementations for U.S.-based utilities, nor were specific references provided in this regard.

Wipro also defended much of the language in the RFP response as common puffery, implying that NGUSA had a basic responsibility to check references.

In August 2018 Wipro paid NGUSA $75 million to settle the lawsuit. Wipro states that the settlement has been effected for an amount of US$75 million and is without admission of liability or wrongdoing of any kind by the parties.

NGUSA has been a valued customer of Wipro for over a decade and both organizations have had a mutually beneficial relationship over the years. We believe that this settlement will be commercially beneficial for us and will help us remain focused on growth. – Wipro

So what went wrong?

In my experience with such large projects, it is never one party that is responsible for such a disaster by itself. Where were NGUSA’s project owners? Client project teams have responsibility for signing off on requirements, designs, project strategies, and test results. Did NGUSA provide Wipro with the appropriate access to expert personnel to properly identify requirements? Did Wipro believe that they had accurately captured all requirements based on NGUSA’s sign-off?

In July 2014, the NorthStar Consulting Group presented the findings of a comprehensive management and operations audit of the NGUSA companies, sponsored by the New York Public Service Commission. The 265-page report covered a broad range of the company’s operations and governance. Throughout the report, the impact of the failed go-live was noted, as well as the governance processes that led the company to determine that going live on November 5 was the best decision.

Below are some interesting observations noted in the report.

Design: The system only produced limited reports for management. Since the go-live date, most managers have received only highly summarized reports of the costs they are responsible for. November 2013, eight months into the fiscal year, was the first time managers received a detailed cost report that also contained their corresponding budget figures. Some of the lack of reporting was a result of the system design, and many reports that had been provided by the predecessor systems were not included in the SAP design.

Training: Another reason for the lack of information reaching managers is the SAP system’s information-access philosophy: managers are expected to request tailored information and reports from the system with the support of analysts from Decision Support. The lack of staff with the high level of skills necessary to query the data and produce reports for managers has greatly limited the success of this strategy.

Testing: Testing was conducted during each phase of development of the SAP system. One of the lessons learned is that the testing was designed to determine where the system did work rather than identifying the areas where it did not work. Another lesson is that errors were found in the final test stages. Fixes were installed but there was no time for retesting.

Complexity: Building an SAP system requires the development of a series of components commonly referred to as RICEFWs (Reports, Interfaces, Conversions, Enhancements, Forms, Workflows). NGUSA’s design had a total of 636 RICEFWs. As Exhibit IV-5 of the report illustrates, this was a large number even for a large power utility. The NGUSA system design was twice as complex as National Grid UK’s R1 implementation of SAP and three times as complex as its R2 implementation.

Preparation: Pre-implementation, NGUSA did not benefit from the rest of the industry’s SAP lessons learned. NGUSA did not use vendors with a strong track record of U.S. utility industry experience in SAP platform implementation and to date has had almost no interface with other U.S. utilities that have implemented SAP.

Transparency: While problems with system and company readiness were identified by particular groups within NGUSA prior to implementation, that information was subsumed by a push to go live. The overly optimistic risk scoring and executive expectations from the project’s early stages continued into the stabilization work.

Validation: During the initial SAP development process, there was minimal interaction with operations personnel regarding desired information or reports. NGUSA implemented a complex field time reporting system without investigating its feasibility given how work is actually performed.

As part of the findings, NorthStar documented NGUSA’s management observations as to the root cause of the failed implementation. These included:

> Overly ambitious design
> Significantly underestimated scale of transformation needed
> Limited availability of internal personnel due to ambitious business agenda
> Multi-partner model did not deliver business benefits
> Lack of ownership of certain business processes
> Testing less effective than expected due to limited range of scenarios tested and limited data availability
> Inadequate quality of data from legacy systems
> Too much focus on timeline and not enough focus on quality
> Training methods proved ineffective

Conclusion

There are many checkpoints a project of the magnitude of NGUSA’s SAP implementation must pass to move forward, each requiring NGUSA to sign off on the quality of the delivered product. There were many opportunities for NGUSA to identify poor-quality talent on the part of Wipro and demand replacements.

The final decision to go live always rests with the client, and unless Wipro was looking to deceive NGUSA regarding the results of its testing, NGUSA is partly to blame.

And where was Ernst and Young? EY was providing project management oversight. They clearly have an understanding of what it takes to put in a major SAP implementation. How did they not see or anticipate the major problems that occurred, and how did they fail to warn the NGUSA management team? Or did they?

Where was SAP? In their suit, NGUSA claims that Wipro developed an overly complex system that relied on the development of new capabilities as opposed to using the software as designed. NGUSA identified SAP as providing some level of oversight. Why didn’t SAP point out these significant deviations from standard? Or did they?

Where were the auditors? Projects of this size and impact are often reviewed by both internal and external auditing. Were project reviews performed? Were the appropriate mitigations put in place?

While there are many questions left to be answered about the botched SAP implementation that almost brought down National Grid, one thing is certain: when you start a large project like the implementation of an SAP system, you have to take responsibility and make sure that you, as an organization, are ready and committed to it.

For more Project Failure Case Studies just click here


Monday, March 25, 2019

Cloud computing threats, vulnerabilities and risks

As pointed out in a previous article on cloud computing project management, one thing that has changed a lot with the rise of cloud usage is security.

At a high level, your cloud computing environment faces the same threats as your traditional data center environment; the threat picture is more or less the same.

Both environments run software, software has vulnerabilities, and adversaries try to exploit those vulnerabilities.

But unlike your systems in a traditional data center, in cloud computing, responsibility for mitigating the risks that result from these software vulnerabilities is shared between the provider and you, the customer.

For that reason, you must understand the division of responsibilities and trust that the provider will hold up their end of the bargain.

This article discusses the 12 biggest threats and vulnerabilities for a cloud computing environment. It splits these into a set of cloud-unique and a set of shared cloud/on-premises vulnerabilities and threats. But before we start, we have to clarify some definitions, because some of the most commonly mixed-up security terms are actually threat, vulnerability, and risk.

Assets, Threats, Vulnerabilities, and Risk

While it might be unreasonable to expect those outside the security industry to understand the differences, more often than not, many in the business use terms such as “asset,” “threat,” “vulnerability,” and “risk” incorrectly or interchangeably. So maybe providing some definitions for those terms will help to make the rest of the article clearer.

Asset – People, property, and information. People may include employees and customers along with other invited persons such as contractors or guests. Property assets consist of both tangible and intangible items that can be assigned a value. Intangible assets include reputation and proprietary information. Information may include databases, software code, critical company records, and many other intangible items. An asset is what we’re trying to protect.

Threat – Anything that can exploit a vulnerability, intentionally or accidentally, and obtain, damage, or destroy an asset. A threat is what we’re trying to protect against.

Vulnerability – Weaknesses or gaps in a security program that can be exploited by threats to gain unauthorized access to an asset. A vulnerability is a weakness or gap in our protection efforts.

Risk – The potential for loss, damage or destruction of an asset as a result of a threat exploiting a vulnerability. Why is it important to understand the difference between these terms? If you don’t understand the difference, you’ll never understand the true risk to assets. You see, when conducting a risk assessment, the formula used to determine risk is:

Asset + Threat + Vulnerability = Risk
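As a minimal illustration of how these terms fit together in practice, here is a small sketch of a risk-register entry in Python. The likelihood/impact scoring scheme is an assumption added for illustration; it is not implied by the formula above.

```python
from dataclasses import dataclass

@dataclass
class RiskEntry:
    asset: str           # what we are trying to protect
    threat: str          # what we are trying to protect against
    vulnerability: str   # the gap the threat could exploit
    likelihood: int      # 1 (rare) .. 5 (almost certain) -- assumed scale
    impact: int          # 1 (negligible) .. 5 (severe)   -- assumed scale

    def score(self) -> int:
        # A risk only exists where an asset, a threat, and a vulnerability coincide;
        # the score is one simple way to rank such combinations.
        return self.likelihood * self.impact

entry = RiskEntry(
    asset="Customer database",
    threat="External attacker",
    vulnerability="Internet-exposed management API without multi-factor authentication",
    likelihood=3,
    impact=5,
)
print(entry.score())  # 15 -> a candidate for early mitigation
```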

Cloud characteristics

While we're defining terms, let’s define cloud computing as well. The most meaningful way to do so in a security context is, in my opinion, by the five cloud computing characteristics published by the National Institute of Standards and Technology (NIST). They are:

1) On-demand self-service: A customer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.

2) Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops and workstations).

3) Resource pooling: The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to customer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state or datacenter). Examples of resources include storage, processing, memory and network bandwidth.

4) Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the customer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

5) Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth and active user accounts). Resource usage can be monitored, controlled and reported, providing transparency for the provider and customer.

Cloud-specific threats and vulnerabilities

The following vulnerabilities are a result of a cloud service provider’s implementation of the five cloud computing characteristics described above. These vulnerabilities do not exist in classic IT data centers.

#1 Reduced visibility and control

When transitioning your assets/operations to the cloud, your organization loses some visibility and control over those assets/operations. When using external cloud services, the responsibility for some of the policies and infrastructure moves to the provider.

The actual shift of responsibility depends on the cloud service model(s) used, leading to a paradigm shift for customers in relation to security monitoring and logging. Your organization needs to perform monitoring and analysis of information about applications, services, data, and users, without using network-based monitoring and logging, which is available for your on-premises IT.

#2 On-demand self-service

Providers make it very easy to provision new services. The on-demand self-service provisioning features of the cloud enable your organization's employees to provision additional services from the provider without IT consent. This practice of using software in an organization that is not supported by the organization's IT department is commonly referred to as shadow IT.

Due to the lower costs and ease of implementing platform as a service (PaaS) and software as a service (SaaS) products, the probability of unauthorized use of cloud services increases. Services provisioned or used without IT's knowledge present risks to an organization. The use of unauthorized cloud services could result in an increase in malware infections or data exfiltration since your organization is unable to protect resources it does not know about. The use of unauthorized cloud services also decreases your organization's visibility and control of network and data.

#3 Internet-accessible management APIs 

Providers expose a set of application programming interfaces (APIs) that customers use to manage and interact with cloud services (also known as the management plane). Organizations use these APIs to provision, manage, orchestrate, and monitor their assets and users. These APIs can contain the same software vulnerabilities as an API for an operating system, library, etc. Unlike management APIs for on-premises computing, provider APIs are accessible via the Internet, exposing them more broadly to potential exploitation.

Threat actors look for vulnerabilities in management APIs. If discovered, these vulnerabilities can be targeted for successful attacks, and an organization’s cloud assets can be compromised. From there, attackers can use organization assets to perpetrate further attacks against other customers of the provider.

#4 Multi-tenancy 

Exploitation of system and software vulnerabilities within a provider's infrastructure, platforms, or applications that support multi-tenancy can lead to a failure to maintain separation among tenants. This failure can be used by an attacker to gain access from one organization's resource to another user's or organization's assets or data. Multi-tenancy increases the attack surface, leading to an increased chance of data leakage if the separation controls fail.

This attack can be accomplished by exploiting vulnerabilities in the provider's applications, hypervisor, or hardware, subverting logical isolation controls or attacks on the provider's management API.

No reports of an attack based on logical separation failure have been identified; however, proof-of-concept exploits have been demonstrated.

#5 Data deletion 

Threats associated with data deletion exist because the consumer has reduced visibility into where their data is physically stored in the cloud and a reduced ability to verify the secure deletion of their data. This risk is concerning because the data is spread over a number of different storage devices within the provider's infrastructure in a multi-tenancy environment. In addition, deletion procedures may differ from provider to provider. Organizations may not be able to verify that their data was securely deleted and that remnants of the data are not available to attackers. This threat increases as a customer uses more provider services.

Cloud and on-premises threats and vulnerabilities

The following are threats and vulnerabilities that apply to both cloud and on-premises IT data centers that organizations need to address.

#6 Stolen credentials

If an attacker gains access to one of your user's cloud credentials, the attacker can have access to the provider's services to provision additional resources (if credentials allowed access to provisioning), as well as target your organization's assets. The attacker could leverage cloud computing resources to target your organization's administrative users, other organizations using the same provider, or the provider's administrators. An attacker who gains access to a provider administrator's cloud credentials may be able to use those credentials to access the customers’ systems and data.

Administrator roles vary between a provider and an organization. The provider administrator has access to the provider network, systems, and applications (depending on the service) of the provider's infrastructure, whereas the customer's administrators have access only to the organization's cloud implementations. In essence, the provider administrator has administration rights over more than one customer and supports multiple services.

#7 Vendor lock-in 

Vendor lock-in becomes an issue when your organization considers moving its assets/operations from one provider to another. Your organization will probably discover that the cost/effort/schedule time necessary for the move is much higher than initially considered due to factors such as non-standard data formats, non-standard APIs, and reliance on one provider's proprietary tools and unique APIs.

This issue increases in service models where the provider takes more responsibility. As a customer uses more features, services, or APIs, the exposure to a provider's unique implementations increases. These unique implementations require changes when a capability is moved to a different provider. If a selected provider goes out of business, it becomes a major problem since data can be lost or may not be able to be transferred to another provider in a timely manner.

#8 Increased complexity 

Migrating to the cloud can introduce complexity into IT operations. Managing, integrating, and operating in the cloud may require that the organization's existing IT staff learn a new model. IT staff must have the capacity and skill level to manage, integrate, and maintain the migration of assets and data to the cloud in addition to their current responsibilities for on-premises IT.

Key management and encryption services become more complex in the cloud. The services, techniques, and tools available to log and monitor cloud services typically vary across providers, further increasing complexity. There may also be emergent threats/risks in hybrid cloud implementations due to technology, policies, and implementation methods, which add complexity.

This added complexity leads to an increased potential for security gaps in your organization's cloud and on-premises implementations.

#9 Insider abuse 

Insiders, such as staff and administrators for both organizations and providers, who abuse their authorized access to the organization's or provider's networks, systems, and data are uniquely positioned to cause damage or exfiltrate information.

The impact is most likely worse when using infrastructure as a service (IaaS) due to an insider's ability to provision resources or perform nefarious activities that require forensics for detection. These forensic capabilities may not be available with cloud resources.

#10 Lost data 

Data stored in the cloud can be lost for reasons other than malicious attacks. Accidental deletion of data by the cloud service provider or a physical catastrophe, such as a fire or earthquake, can lead to the permanent loss of customer data. The burden of avoiding data loss does not fall solely on the provider's shoulders. If a customer encrypts its data before uploading it to the cloud but loses the encryption key, the data will be lost. In addition, inadequate understanding of a provider's storage model may result in data loss. Organizations must consider data recovery and be prepared for the possibility of their provider being acquired, changing service offerings, or going bankrupt.

This threat increases as an organization uses more provider services. Recovering data from a provider may be easier than recovering it on-premises because a service level agreement (SLA) designates availability/uptime percentages. These percentages should be investigated when your organization selects a provider.

#11 Provider supply chain 

If your provider outsources parts of its infrastructure, operations, or maintenance, these third parties may not satisfy or support the requirements that the provider is contracted to provide to your organization. Your organization needs to evaluate how the provider enforces compliance and check whether the provider flows its own requirements down to third parties. If the requirements are not being levied on the supply chain, then the threat to your organization increases.

This threat increases as your organization uses more provider services and is dependent on individual providers and their supply chain policies.

#12 Insufficient due diligence 

Organizations migrating to the cloud often perform insufficient due diligence. They move data to the cloud without understanding the full scope of doing so, the security measures used by the provider, and their own responsibility to provide security measures. They make decisions to use cloud services without fully understanding how those services must be secured.

Conclusion

Although the level of threat in a cloud computing environment is similar to that of a traditional data center, there is a key difference in who is responsible for mitigating the risk. It is important to remember that cloud service providers use a shared responsibility model for security: your provider accepts responsibility for some aspects of security, other aspects are shared between your provider and you, and some aspects remain the sole responsibility of you, the customer. Successful cloud security depends on both parties knowing and meeting all of their responsibilities. The failure of organizations to understand or meet their responsibilities is a leading cause of security incidents in cloud computing environments.


Monday, March 18, 2019

Project Portfolio Management: Theory vs. Practice

Project Portfolio Management: Theory vs. Practice
If you are responsible for managing portfolios of technology programs and projects, your success in maximizing business outcomes with finite resources is vital to your company’s future in a fast-changing and digital world.

Project portfolio management is the art and science of making decisions about investment mix, operational constraints, resource allocation, project priority and schedule. It is about understanding the strengths and weaknesses of the portfolio, predicting opportunities and threats, matching investments to objectives, and optimizing trade-offs encountered in the attempt to maximize return (i.e., outcomes over investments) at a given appetite for risk (i.e., uncertainty about return).

Most large companies have a project portfolio management process in place, and they mostly follow the traditional project portfolio management process as documented by PMI. On paper, this process is comprehensible and stable.

Even better, it has the appearance of a marvelous mechanical system that can be followed in a plannable, stable, and reproducible manner, in which the project with the greatest strategic contribution always wins the battle for scarce resources.

Unfortunately, despite its apparent elegance, this process does not work well in the real world. The real world is characterized by uncertainty, difficulties, ever-changing market environments and, of course, people, none of which function like machines.

When we look at technology projects, the primary goal of portfolio executives is to maximize the delivery of technology outputs within budget and schedule. This IT-centric mandate emphasizes output over outcome, and risk over return.

On top of this, the traditional IT financial framework is essentially a cost-recovery model that isn’t suitable for portfolio executives to articulate how to maximize business outcomes on technology investments.

As a result, portfolio management is marginalized as bureaucratic overhead, a nice-to-have extension of the program and project management function.

So yes, in theory most large organizations have a project portfolio management function in place, but in practice it is far from effective.

Below are 11 key observations I have made in the last few years regarding effective project portfolio management:

1) No data and visibility.

The first theoretical benefit of effective project portfolio management concerns its ability to drive better business decisions. To make good decisions you need good data, and that’s why visibility is so crucial, both from a strategic, top-down perspective and from a tactical, bottom-up perspective.

Anything that can be measured can be improved. However, organizations don’t always do sufficient monitoring. Few organizations actually track project and portfolio performance against their own benchmarks, nor do they track dependencies.

Worse, strategic multiyear initiatives are the least likely to be tracked in a quantitative, objective manner. For smaller organizations, the absence of such a process might be understandable, but for a large organization, tracking is a must.

Not monitoring project results creates a vicious circle: If results are not tracked, then how can the portfolio management and strategic planning process have credibility? It is likely that it doesn’t, and over time, the risk is that estimates are used more as a means of making a project appear worthy of funding than as a mechanism for robust estimation of future results. Without tracking, there is no mechanism to make sure initial estimates of costs and benefits are realistic.

When you have a good handle on past project metrics, it makes it much easier to predict future factors like complexity, duration, risks, expected value, etc. And when you have a good handle on what is happening in your current project portfolio, you can find out which projects are not contributing to your strategy, are hindering other more important projects, or are not contributing enough value.

And once you have this data, don't keep it in a silo visible only to a select group. Everyone involved in projects should be able to use this data for their own projects.

2) Many technology projects should not have been started at all.

Big data, blockchain, artificial intelligence, virtual reality, augmented reality, robotics, 5G, machine learning... Billions and billions are poured into projects around these technologies, and for most organizations, not much is coming out of it.

And this is not because these projects are badly managed. Quite simply, it is because they should not have been started in the first place.

I believe that one of the main reasons that many innovative technology projects are started comes down to a fear of missing out, or FOMO.

You may find the deceptively simple but powerful questions in “Stop wasting money on FOMO technology innovation projects” quite useful in testing and refining technology project proposals, clarifying the business case, building support, and ultimately persuading others why they should invest scarce resources in an idea or not.

3) Many projects should have been killed much earlier.

Knowing when to kill a project and how to kill it is important for the success of organizations, project managers and sponsors.

Not every project makes its way to the finish line, and not every project should. As a project manager or sponsor, you’re almost certain to find yourself, at some point in your career, running a project that has no chance of success, or that should never have been initiated in the first place.

The reasons why you should kill a project may vary. It could be the complexity involved, staff resource limitations, unrealistic project expectations, a naive and underdeveloped project plan, the loss of key stakeholders, higher priorities elsewhere, market changes, or some other element. Likely, it will be a combination of some or many of these possibilities.

What’s important is that you do it on time: 17 percent of IT projects fail so badly they can threaten the existence of a company (Calleam).

Keep an eye out for warning signs, ask yourself tough questions, and set aside your ego. By doing so, you can easily identify projects that need to be abandoned right away. You might find “Why killing projects is so hard (and how to do it anyway)” helpful in this process.

4) Project selection is rarely complete and neutral.

This is often because the organization’s strategy is not known, not developed, or cannot be applied to the project (see Observation 10).

But besides this, there is the "principal-agent problem": your managers already know the criteria on which projects will be selected, and so they "optimize" their project details accordingly. Even when the details are not "optimized," the data is often collected in an incomplete and inconsistent manner.

And have you ever encountered the situation where projects were already decided on in rooms other than the one where the decision should have been made? I sure have.

5) Organizations do far too many projects in parallel.

Traditional project portfolio management is all about optimizing value and optimizing resource allocation. Both are pursued in a way that, in my opinion, produces the opposite result. As I (and probably you too) have seen time and again, running an organization's projects at 100 percent resource utilization is an economic disaster.

Any small amount of unplanned work will cause delays, and those delays get worse because of the time spent re-planning. Value is only created when it is delivered, not when it is planned, so we should focus on delivering value as quickly as possible within our given constraints. See "Doing the right number of projects" for more details.
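
The effect of high utilization can be illustrated with a standard single-server queueing approximation (an M/M/1 model, used here purely as an illustration with invented numbers, not something the original argument depends on): average waiting time grows with utilization divided by the remaining slack, so queues explode as utilization approaches 100 percent.

```python
# M/M/1 approximation: average wait in queue = (rho / (1 - rho)) * service_time.
# Illustrative numbers only; the point is the shape of the curve, not the units.
service_time_days = 10  # assumed average time to complete one piece of project work

for utilization in (0.5, 0.7, 0.8, 0.9, 0.95, 0.99):
    wait = utilization / (1 - utilization) * service_time_days
    print(f"utilization {utilization:.0%}: average queue wait ~{wait:.0f} days")

# utilization 50%: average queue wait ~10 days
# utilization 90%: average queue wait ~90 days
# utilization 99%: average queue wait ~990 days
```

The exact numbers do not matter; what matters is that the wait time is not linear in utilization, which is why "everyone fully booked" portfolios deliver so slowly.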

6) Projects are done too slowly.

Too many organizations try to save money on projects (cost efficiency) when the benefits of completing the project earlier far outweigh the potential cost savings. You might, for example, be able to complete a project with perfect resource management (all staff fully utilized) in 12 months for $1 million. Alternatively, you could hire some extra people and have them sitting around occasionally at a total cost of $1.5 million, but the project would be completed in only six months.

What's that six-month difference worth? Well, if the project is strategic in nature, it could be worth everything. It could mean being first to market with a new product or possessing a required capability for an upcoming bid that you don't even know about yet. It could mean impressing the heck out of some skeptical new client or being prepared for an external audit. There are many scenarios where the benefits outweigh the cost savings (see "Cost of delay" for more details).
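
Using the hypothetical numbers above, the trade-off becomes explicit once you estimate a cost of delay, i.e. the value lost for every month the project remains undelivered. The $200,000 per month figure below is an assumption for illustration only:

```python
# Hypothetical numbers from the example above, plus an assumed cost of delay.
cheap_cost, cheap_months = 1_000_000, 12   # perfect utilization, slower
fast_cost, fast_months = 1_500_000, 6      # extra people, some idle time, faster

cost_of_delay_per_month = 200_000          # assumed value lost per month of waiting

cheap_total = cheap_cost + cheap_months * cost_of_delay_per_month
fast_total = fast_cost + fast_months * cost_of_delay_per_month

print(f"slow option: {cheap_total:,}")   # 3,400,000
print(f"fast option: {fast_total:,}")    # 2,700,000

# The extra 500,000 buys six months; with a cost of delay above roughly
# 83,000 per month, the "expensive" fast option is actually the cheaper one.
```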

On top of delivering the project faster, when you are done after six months instead of 12 you can use the existing team for a different project, delivering even more benefits for your organization. So not only do you realize the benefits of your original project sooner (and for longer), you also realize the benefits of your next project sooner, because it starts earlier and is staffed with an experienced team.

An important goal of your project portfolio management strategy should be to have a high throughput. It’s vital to get projects delivered fast so you start reaping your benefits, and your organization is freed up for new projects to deliver additional benefits.

7) The right projects should have gotten more money, talent and senior management attention.

Partly as a result of observations 5 and 6, but also because organizations do not focus on and agree about which projects really matter, money, talent, and management attention are spread too thin across too many projects.

The method of always selecting "the next project on the list, from top to bottom, until the budget runs out" does not work as a selection method for the project portfolio. The problem is that resource constraints receive far too little consideration. Even a rough check along the lines of "it looks good overall" can lead to severe bottlenecks in the current year.

Unlike money, people and management attention cannot be moved and scaled at will. This means that bottlenecks quickly become determining factors and conflict with strategic priority and feasibility. In addition, external capacities are not available in the desired quantity. Also, the process of phasing in new employees creates friction, costs time, and temporarily reduces the capacity of the existing team instead of increasing it.

8) Project success is neither defined nor measured.

Defining project success is actually one of the largest contributors to project success, and I have written about it many times (see here and here). When starting any project, it's essential to work actively with the organization that owns the project to define success across three levels:

i) Project delivery
ii) Product or service
iii) Business

The process of "success definition" should also cover how the different criteria will be measured (targets, measurements, time, responsible, etc.). Project success may be identified as all points within a certain range of these defined measurements. Success is not just a single point.

The hard part is identifying the criteria, their relative importance, and the boundaries of the different success areas. But only once you have done this can you manage your projects toward success and recognize it when you get there.

9) Critical assumptions are not validated.

For large or high-risk projects (what is large depends on your organization) it should be mandatory to do an assumption validation before you dive headfirst into executing the project. In this phase you should do a business case validation and/or a technical validation in the form of a proof of concept.

Even if you do this, your project isn’t guaranteed to succeed. The process of validation is just the start. But if you’ve worked through the relevant validations, you’ll be in a far better position to judge if you should stop, continue or change your project.

The goal of the validation phase is to postpone the expensive and time-consuming work of project execution until as late as possible in the process. It's the best way to keep yourself focused, to minimize costs, and to maximize your chance of a successful project. See "No validation? No project!" for more details on this.

10) Your organization has no clear strategy.

Without having a strategy defined and communicated in your organization it is impossible to do effective project portfolio management. I like the definitions of Mintzberg and De Flander regarding this.

“Strategy is a pattern in a stream of decisions.” – Henry Mintzberg            

First, there’s the overall decision—the big choice—that guides all other decisions. To make a big choice, we need to decide who we focus on—our target client segment—and we need to decide how we offer unique value to the customers in our chosen segment. That’s basic business strategy stuff.

But by formulating it this way, it helps us to better understand the second part: the day-to-day decisions—the small choices—that get us closer to the finish line. When these small choices are in line with the big choice, you get a Mintzberg pattern. So if strategy is a decision pattern, strategy execution is enabling people to create a decision pattern. In other words:

“Strategy execution is helping people make small choices in line with a big choice.” – Jeroen De Flander

This notion requires a big shift in the way we typically think about execution. Looking at strategy execution, we should imagine a decision tree rather than an action plan. Decision patterns are at the core of successful strategy journeys, not to-do lists.

To improve the quality of strategy implementation, we should shift our energy from asking people to make action plans to helping them make better decisions.

11) Ideas are not captured.

Although there is clearly no shortage of ideas within organizations, most organizations unfortunately seldom capture these ideas, except in the few cases where a handful of employees are sufficiently entrepreneurial to drive their own ideas through to implementation. This can happen in spite of the organization, rather than because of it.

Organizations are effective at focusing employees on their daily tasks, roles, and responsibilities. However, they are far less effective at capturing the other output of that process: the ideas and observations that result from it. It is important to remember that these ideas can be more valuable than an employee’s routine work. Putting in an effective process for capturing ideas provides an opportunity for organizations to leverage a resource they already have, already pay for, but fail to capture the full benefit of—namely, employee creativity.

To assume that the best ideas will somehow rise to the top, without formal means to capture them in the first place, is too optimistic. Providing a simplified, streamlined process for idea submission can increase project proposals and result in a better portfolio of projects. Simplification is not about reducing the quality of ideas, but about reducing the bureaucracy associated with producing them. Simplification is not easy, as it involves defining what is really needed before further due diligence is conducted on the project. It also means making the submission process easy to follow and locate, and driving awareness of it.

Conclusion

In the digital age, an effective project portfolio management function is a strategic necessity.

The dilemma of traditional project portfolio management is that it grants too little relevance to actual feasibility in favor of strategic weighting. In reality, it is more important to produce a portfolio that, in its entirety, has a real chance of succeeding. The portfolio should also be regarded not in terms of a fiscal year, but ideally in much smaller time segments, with constant review and the possibility of reprioritization.

Therefore, the question should no longer be “what can we get for this fixed amount of money in the upcoming year,” but rather, “what is the order of priority for us today?”

Here, the perspective moves away from an annually recurring budget process and toward a regular exchange of results, knowledge, and changing conditions. In the best-case scenario, this penetrates the entire organization, from portfolio to project to daily work.

What do you think?


Wednesday, March 13, 2019

Case Study: The epic meltdown of TSB Bank

Case Study: The epic meltdown of TSB Bank
With clients locked out of their bank accounts, mortgage accounts vanishing, small businesses reporting that they could not pay their staff, and debit cards ceasing to work, the TSB Bank computer crisis of April 2018 was one of the worst in recent memory. The bank’s CEO, Paul Pester, admitted in public that the bank was “on its knees” and that it faced a compensation bill likely to run to tens of millions of pounds.

But let’s start from the beginning. First, we’ll examine the background of what led to TSB’s ill-fated system migration. Then, we’ll look at what went wrong and how it could have been prevented.

September 2013

When TSB split from Lloyds Banking Group (LBG) in September 2013, a move forced by the EU as a condition of its taxpayer bailout in 2008, a clone of the original group’s computer system was created and rented to TSB for £100m a year.

That banking system was a combination of many old systems for TSB, BOS, Halifax, Cheltenham & Gloucester, and others, inherited from the integration of HBOS with Lloyds during the banking crisis.

Under this arrangement, LBG held all the cards. It controlled the system and offered it as a costly service to TSB when it was spun off from LBG.

March 2015

When the Spanish Banco Sabadell bought TSB for £1.7bn in March 2015, it put into motion a plan it had successfully executed in the past for several other smaller banks it had acquired: merge the bank’s IT systems with its own Proteo banking software and, in doing so, save millions in IT costs.

Sabadell was warned in 2015 that its ambitious plan was high risk and that it was likely to cost far more than the £450m Lloyds was contributing to the effort.

“It is not overly generous as a budget for that scale of migration,” John Harvie, a director of the global consultancy firm Protiviti, told the Financial Times in July 2015. But the Proteo system was designed in 2000 specifically to handle mergers such as that of TSB into the Spanish group, and Sabadell pressed ahead.

Summer 2016

By the summer of 2016, work on developing the new system was meant to be well underway and December 2017 was set as a hard-and-fast deadline for delivery.

The time period to develop the new system and migrate TSB over to it was just 18 months. TSB people were saying that Sabadell had done this many times in Spain. But tiny Spanish local banks are not sprawling LBG legacy systems.

To make matters worse, the Sabadell development team did not have full control—and therefore a full understanding—of the system they were trying to migrate client data and systems from because LBG was still the supplier.

Autumn 2017

By the autumn the system was not ready. TSB announced a delay, blaming the possibility of a UK interest rate rise—which did materialize—and the risk that the bank might leave itself unable to offer mortgage quotes over a crucial weekend.

Sabadell pushed back the switchover to April 2018 to try to get the system working. It was an expensive delay because the fees TSB had to pay to LBG to keep using the old IT system were still clocking up: CEO Pester put the bill at £70m.

April 2018

On April 23, Sabadell announced that Proteo4UK—the name given to the TSB version of the Spanish bank’s IT system—was complete, and that 5.4m clients had been “successfully” migrated over to the new system.

Josep Oliu, the chairman of Sabadell, said: “With this migration, Sabadell has proven its technological management capacity, not only in national migrations but also on an international scale.”

The team behind the development were celebrating. In a LinkedIn post that has since been removed, those involved in the migration were describing themselves as “champions,” a “hell of a team,” and were pictured raising glasses of bubbly to cheers of “TSB transfer done and dusted.”

However, only hours after the switch was flicked, systems crumbled, and up to 1.9m TSB clients who use internet and mobile banking were locked out.

Twitter had a field day as clients frustrated by the inability to access their accounts or get through to the bank’s call centers started to vent their anger.

Clients reported receiving texts saying their cards had been used abroad; that they had discovered thousands of pounds in their accounts they did not have; or that mortgage accounts had vanished, multiplied or changed currency.

One bemused account holder showed his TSB banking app recording a direct debit paid to Sky Digital 81 years from now. Some saw details of other people’s accounts, and holidaymakers complained that they had been left unable to pay restaurant and hotel bills.

TSB, to clients’ fury, at first insisted the problems were only intermittent. At 3:40 a.m. on Wednesday, April 25, Pester tweeted that the system was “up and running,” only to be forced to apologize the next day and admit it was actually only running at 50 percent capacity.

On Thursday he admitted the bank was on its knees, announced that he was personally seizing control of the attempts to fix the problem from his Spanish masters, and had hired a team from IBM to do the job. Sabadell said it would probably be next week before normal service returned.

The financial ombudsman and the Financial Conduct Authority have launched investigations. The bank has been forced to cancel all overdraft fees for April and raise the interest rate it pays on its classic current account in a bid to stop disillusioned clients from taking their business elsewhere.

The software that Pester had boasted in September was 2,500 man-years in the making, with more than 1,000 people involved, turned out to be a client service disaster that will cost the bank millions and tarnish its reputation for years.

The basic principles of a system migration

The two main things to avoid in a system migration are an unplanned outage of the service for users and loss of data, either in the sense that unauthorized users have access to data, or in the sense that data is destroyed.

In most cases, outages cannot be justified during business hours, so migrations must typically take place within the limited timeframe of a weekend. To be sure that a migration over a weekend will run smoothly, it is normally necessary to perform one or more trial migrations in non-production environments, that is, migrations to a copy of the live system which is not used by or accessible to real users. The trial migration will expose any problems with the migration process, and these problems can be fixed without any risk of affecting the service to users.

Once the trial migration is complete, has been tested, and any problems with it have been fixed, the live migration can be attempted. For a system of any complexity, the go-live weekend must be carefully pre-planned hour by hour, ensuring that all the correct people are available and know their roles.

As part of the plan, a rollback plan should be put in place. The rollback plan is a planned, rapid way to return to the old system in case anything should go wrong during the live migration. One hopes not to have to use it because the live migration should not normally be attempted unless there has been a successful trial migration and the team is confident that all the problems have been ironed out.

On the go-live weekend, the live system is taken offline, and a period of intense, often round-the-clock, activity begins, following the previously made plan. At a certain point, while there is still time to trigger the rollback plan, a meeting will be held to decide whether to go live with the migration or not (a “go/no go” meeting).

If the migration work has gone well, and the migrated system is passing basic tests (there is no time at that point for full testing; full testing should have been done on the trial migration), the decision will be to go live. If not, the rollback plan will be triggered and the system returned to its previous state, as it was before the go-live weekend.
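
The discipline here is easier to keep if the go/no-go criteria are agreed in advance, written down, and checked mechanically at the meeting. A minimal sketch of such a gate; the criteria listed are invented placeholders, not TSB's actual checklist:

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One pre-agreed go-live criterion and whether it has been met."""
    name: str
    met: bool


def go_no_go(criteria: list[Criterion]) -> bool:
    """Go live only if every pre-agreed criterion is met; otherwise roll back."""
    failed = [c.name for c in criteria if not c.met]
    if failed:
        print("NO GO - trigger rollback plan. Unmet criteria:", ", ".join(failed))
        return False
    print("GO - proceed with live migration.")
    return True


# Placeholder criteria for illustration; a real gate would be agreed well in advance.
decision = go_no_go([
    Criterion("trial migration completed without critical defects", True),
    Criterion("performance tests passed at full production volume", False),
    Criterion("rollback plan rehearsed and signed off", True),
])
```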

If the task of migration is so great that it is difficult to fit it into a weekend, even with very good planning and preparation, it may be necessary to break it into phases. The data or applications are broken down into groups which are migrated separately.

This approach reduces the complexity of each group migration compared to one big one, but it also has disadvantages. If the data or applications are interdependent, it may cause performance issues or other technical problems if some are migrated while others remain, especially if the source and destination are physically far apart.

A phased migration will also normally take longer than a single large migration, which will add cost, and it will be necessary to run two data centers in parallel for an extended period, which may add further cost. In TSB’s case, it may have been possible to migrate the clients across in groups, but it is hard to be sure without knowing its systems in detail.

Testing a system migration

Migrations can be expensive because it can take a great deal of time to plan and perform the trial migration(s). With complex migrations, several trial migrations may be necessary before all the problems are ironed out. If the timing of the go-live weekend is tight, which is very likely in a complex migration, it will be necessary to stage some timed trial migrations—“dress rehearsals.” Dress rehearsals are to ensure that all the activities required for the go-live can be performed within the timeframe of a weekend.

Trial migrations should be tested. In other words, once a trial migration has been performed, the migrated system, which will be hosted in a non-production environment, should be tested. The larger and more complex the migration, the greater the requirement for testing. Testing should include functional testing, user acceptance testing and performance testing.

Functional testing of a migration is somewhat different from functional testing of a newly developed piece of software. In a migration, the code itself may be unchanged, and if so there is little value in testing code which is known to work. Instead, it is important to focus the testing on the points of change between the source environment and the target. The points of change typically include the interfaces between each application and whatever other systems it connects to.

In a migration, there is often change in interface parameters used by one system to connect to another, such as IP addresses, database connection strings, and security credentials. The normal way to test the interfaces is to exercise whatever functionality of the application uses the interfaces. Of course, if code changes are necessary as part of a migration, the affected systems should be tested as new software.
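
Because the points of change are mostly interface parameters, a good part of the post-migration checking can be scripted and rerun on every trial migration and again on the go-live weekend. A hedged sketch of a connectivity smoke test against the target environment; the endpoint names and ports are invented:

```python
import socket

# Hypothetical points of change in the target environment: every host/port pair
# an application must reach after migration (databases, payment gateways, etc.).
TARGET_ENDPOINTS = {
    "core-banking-db": ("db.target.example.com", 5432),
    "payments-gateway": ("payments.target.example.com", 443),
    "card-auth-service": ("cards.target.example.com", 8443),
}


def smoke_test(endpoints: dict[str, tuple[str, int]], timeout: float = 3.0) -> bool:
    """Check that each migrated interface endpoint accepts TCP connections."""
    all_ok = True
    for name, (host, port) in endpoints.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                print(f"OK      {name} ({host}:{port})")
        except OSError as exc:
            print(f"FAILED  {name} ({host}:{port}): {exc}")
            all_ok = False
    return all_ok


if __name__ == "__main__":
    smoke_test(TARGET_ENDPOINTS)
```

A check like this only proves reachability; functional and performance testing of the changed interfaces still has to happen on top of it.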

In the case of TSB, the migration involved moving client bank accounts from one banking system to another. Although both the source and target systems were mature and well-tested, they had different code bases, and it is likely that the amount of functional testing required would have approached that required for new software.

User acceptance testing is functional testing performed by users. Users know their application well and therefore have an ability to spot errors quickly, or see problems that IT professionals might miss. If users test a trial migration and express themselves satisfied, it is a good sign, but not adequate on its own because, amongst other things, a handful of user acceptance testers will not test performance.

Performance testing checks that the system will work fast enough to satisfy its requirements. In a migration the normal requirement is for there to be little or no performance degradation as a result of the migration. Performance testing is expensive because it requires a full-size simulation of the systems under test, including a full data set.

If the data is sensitive, and in TSB’s case it was, it will be necessary, at significant time and cost, to protect the data by security measures as stringent as those protecting the live data, and sometimes by anonymizing the data. In the case of TSB, the IBM inquiry into what went wrong identified insufficient performance testing as one of the problems.
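
One common way to get a full-size, realistic data set without exposing real client data is to pseudonymize the sensitive columns before loading an extract into the performance-test environment. A small sketch along those lines; the field names and CSV format are assumptions, and a real programme would treat this as one control among several:

```python
import csv
import hashlib

# Fields that must never appear in the non-production environment (illustrative names).
SENSITIVE_FIELDS = ("customer_name", "account_number", "email")


def pseudonymize(value: str, salt: str) -> str:
    """Replace a sensitive value with a stable, irreversible token."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]


def anonymize_extract(src_path: str, dst_path: str, salt: str) -> None:
    """Copy a CSV extract, pseudonymizing sensitive columns and keeping the rest."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for field in SENSITIVE_FIELDS:
                if field in row and row[field]:
                    row[field] = pseudonymize(row[field], salt)
            writer.writerow(row)
```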

What went wrong?

Where did it go wrong for TSB? The bank was attempting a very complex operation. There would have been a team of thousands drawn from internal staff, staff from IT service companies, and independent contractors. Their activities would have had to be carefully coordinated, so that they performed the complex set of tasks in the right order to the right standard. Many of them would have been rare specialists. If one such specialist is off sick, it can block the work of hundreds of others. One can imagine that, as the project approached go-live, having been delayed several times before, the trial migrations were largely successful but not perfect.

The senior TSB management would have been faced with a dilemma of whether to accept the risks of doing the live migration without complete testing in the trials, or to postpone go-live by several weeks and report to the board another slippage, and several tens of millions of pounds of further cost overrun. They gambled and lost.

How could TSB have done things differently?

Firstly, a migration should have senior management backing. TSB clearly had it, but with smaller migrations it is not uncommon for the migration to be some way down senior managers' priorities. This can lead to system administrators or other actors, whose reporting lines sit outside the migration team, frustrating key parts of the migration because their managers are not directing or paying them to cooperate.

Secondly, careful planning and control are essential. It hardly needs saying that it is not possible to manage a complex migration without careful planning, and those managing the migration must have an appropriate level of experience and skill. In addition, the planning must follow a sound basic approach that includes trial migrations, testing, and rollback plans as described above. While the work is going on, close control is important: senior management must stay close to what is happening on the ground and be able to react quickly, for example by fast-tracking authorizations, if delays or blockages occur.

Thirdly, there must be a clear policy on risk, and the policy should be stuck to. What criteria must be met for go-live? Once this has been determined, the amount of testing required can be determined. If the tests are not passed, there must be the discipline not to attempt the migration, even if it will cost much more.

Finally, in complex migrations, a phased approach should be considered.

Conclusion

In the case of TSB Bank, the problems that occurred after the live migration were either not spotted in testing, or they were spotted but the management decided to accept the risk and go live anyway. If they were not spotted, it would indicate that testing was not comprehensive enough—IBM specifically pointed to insufficient performance testing. That could be due to a lack of experience among the key managers. If the problems were spotted in testing, it implies weak go-live criteria and/or an inappropriate risk policy. IBM also implied that TSB should have performed a phased migration.

It may be that the public will never fully know what caused TSB’s migration to go wrong, but it sounds like insufficient planning and testing were major factors. Sensitive client data was put at risk, and clients suffered long unplanned outages, resulting in CEO Paul Pester being summoned to the Treasury select committee and the Financial Conduct Authority launching an investigation into the bank. Ultimately Pester lost his job.

When migrating IT systems in the financial sector, cutting corners is dangerous, and TSB's case shows that the consequences can be dire. For success, you need to follow some basic principles, use the right people, and be prepared to allocate sufficient time and money to planning and testing. Only then can a successful system migration be ensured.

For more Project Failure Case Studies just click here
