“All the hallmarks of business management strong-arming IT into an unrealistic timeline…”
A report by law firm Slaughter and May into TSB’s botched 2018 migration of a core banking system, published Tuesday, reveals major tensions between the bank and the law firm over its findings.
The series of outages resulting from the IT project left customers unable to access their accounts, drove a surge in fraud, and cost the bank £366 million and tens of thousands of customers.
The damning 262-page autopsy does not, TSB’s board chairman Richard Meddings said, “paint the full picture of migration”.
TSB’s Slaughter and May Report: 5 Key Takeaways
1: “Obstacles” Thrown Up to Analysis of Defects in the Proteo4UK Platform
The report suggests TSB was reluctant to give Slaughter and May full access to everything it needed to conduct a forensic investigation.
The law firm said: “We have faced a number of obstacles and difficulties in attempting to carry out an analysis of the defects [Ed: deviations from required functionality, or change requests for missing functionality] in the [replacement] Proteo4UK Platform during our review.”
TSB and Banco Sabadell tech subsidiary Sabis were using JIRA to track tickets raised to fix issues with the core banking platform. But it took Slaughter and May two months from its September 13, 2018 request to gain direct access to JIRA, having initially been given “largely incomplete and unworkable” extracts, the law firm said.
Even then, it struggled to get the data it needed.
“We received direct access to JIRA on 9 November 2018, but we were still unable to extract a full data set due to further issues with the completeness of the data provided and certain other [unspecified] issues”, the company notes.
“We finally received what we were told was the complete data set on 22 November 2018”, the report says, with a soupçon of scepticism…
2: The Bank and Law Firm Disagree Over “Defects”
Slaughter and May identified 34,671 “functional defects” being opened in JIRA between October 2017 and October 2018.
On April 10, 2018, 12 days before the planned migration was due to go live, the programme still had a backlog of 5,359 defects, 840 of which were classed severity one or two, the two highest levels.
A whopping 4,424 were still open when the new system went live.
TSB disagrees with this finding and, after being given a draft, set out to produce its own analysis. This returned a figure of 4,396, with TSB saying that only 98 of these were applicable to the migration programme.
Slaughter and May bluntly disagrees: “TSB only arrived at this number by inappropriately excluding a number of major categories of defect… the new platform was not ready to support TSB’s full customer base and [Sabadell tech subsidiary] Sabis was not ready to operate the new platform.”
3: TSB’s Board was Napping
Although the project involved 70 third-party suppliers and 1,400 people, TSB’s owners planned to conduct the entire migration over a single weekend.
Slaughter and May says discussions over whether this “big bang” approach was the right one were not substantial. (And testing in the run-up was “flawed” and “not sufficient” to mitigate the risk of putting five million customers on a platform “consisting largely of new software”.)
“TSB did not give sufficient consideration to whether a largely single event migration was the right choice, what the risks of this approach would be, or how those risks would be mitigated”, Slaughter and May said.
“This choice was not substantively discussed by TSB’s board. In addition, it appears that neither TSB’s board nor the executive requested or received any advice from their external advisers.”
4: Data Centre Issues and Inadequate Testing
There were, as a result, issues “with the configuration of two data centres, which contributed significantly to the problems experienced by TSB’s customers. As the decision was taken to conduct performance testing on a single data centre, it was impossible to identify these issues before Go Live.”
As one of the data centres was “reserved for live purposes”, the complete infrastructure was never tested at load, including tests of both data centres working in Active/Active mode, Slaughter and May notes.
(Meanwhile, after performance tests for internet banking did not pass the original target load, the bank simply moved the goalposts, lowering test targets.)
5: No Performance Testing Environment
Astonishingly, there was no dedicated environment for performance testing!
As the report notes: “TSB and Sabis decided to conduct this type of NFT [non-functional testing] in the production environment, even though that environment was being used to support services which had gone live.”
“As these live services had to be protected, this constrained the kinds of testing that could be done.”
TSB’s Richard Meddings said: “When we commissioned Slaughter and May to carry out this review, we specifically asked for a fully independent and thorough inquiry. Although the report doesn’t paint the full picture of migration, the Board were absolutely clear that we wanted to be transparent and learn fully from those aspects which went wrong. That is why we have taken the decision to publish this report in full.”
He added: “Importantly, TSB has evolved to be a better business than the newly created bank which began the migration project. We have already made major changes as a result of what we have learned, including moving to take direct control of our IT operations. With the leadership of Debbie Crosbie as our CEO, we are now well on track to get TSB back to what it does best: serving customers and bringing better choices to UK banking.”
Lev Lesokhin, SVP of Strategy and Analytics at software analytics firm CAST, commented: “While TSB left one data centre unchecked, it’s really about the applications that run in that data centre. In the course of a large IT integration the inter-component and inter-layer dependencies can trigger an avalanche of unforeseen problems. Understanding how all systems will function when brought together is very difficult without having machine-assisted software intelligence.
“This situation has all the hallmarks of business management strong-arming the IT organization into an unrealistic timeline. When business leaders push for overly-aggressive timelines, or regulators ask for multiple competing risk frameworks and excessive after-the-fact incident reporting, this all puts a strain on the delivery organization’s ability to untangle the complexity before ‘go live’. Business leaders must pay closer attention to IT estimates, especially those based on solid software intelligence, to drive more realistic timelines for mass systems transformations.”