By William Fellows
Unix server vendors are looking for ways to sell into data centers alongside, or as replacements for, mainframes, but they have a long way to go before they can duplicate the reliability of big old iron. The Uptime Institute, a 45-member Santa Fe, New Mexico-based consortium dedicated to helping companies improve the availability of IT services to their users, believes the current spat between the Unix vendors over how many hours of uptime a system can provide between stoppages completely misses the point and is irrelevant.

Of course, manufacturers for the most part have direct control only over the product itself, which accounts for their emphasis on fairly specific percentage metrics. However, the Institute's executive director Ken Brill argues that 80% of downtime occurs because of people and process failure, not system problems. In its uptime numbers a vendor would not likely include the five hours it might take an engineer to reach a customer, a delay for which its system is not directly responsible, but only the five minutes of work required to fix the problem.

Look at the current rash of outages among the online brokerages, he urges. E*trade's widely publicized downtime was due to software upgrades. People did not take care to properly evaluate the consequences of changing the software. The goal, says Brill, is to know and control the processes which could lead to system downtime.

The focus should be on improving the quality of the service delivered to the user and not just on eliminating downtime. Brill points to a satisfaction study Boeing Co carried out on some of its internal systems, which it had engineered to the point where they never went down. Users still said the service sucks, Brill observes. The irony is that these processes, such as change management, have been known and managed for 40 years in the mainframe world but have not yet been applied to Unix servers. But that's about to change, Brill claims.
In large part, he says, that is because Sun Microsystems Inc has taken a gigantic, visionary step which will tie the quality of service its customers get directly to its rewards and compensation programs, through a program it calls SunUp. Moreover, Sun is not only shifting its focus away from the product itself but is also extending the SunUp program to measure the availability of all services, not just its own equipment. Sun has asked the Institute to take some of the concepts it has developed and apply them directly to infrastructure such as power supplies and cooling equipment. Take the existing availability and improve it, says Brill.
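
To make Brill's five-hours-versus-five-minutes point concrete, the gap between a vendor-counted figure and what a user experiences can be sketched in a few lines of Python. This is only an illustration: the 30-day measurement window and the single incident are assumptions chosen for the arithmetic, and the function and variable names are hypothetical rather than any vendor's or the Institute's actual method.

```python
# Illustrative sketch only. The 30-day window and the single incident are
# assumptions for the sake of the arithmetic, not figures from the article.

def availability(downtime_hours: float, period_hours: float) -> float:
    """Return the fraction of the period during which the service was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie between 0 and period_hours")
    return 1.0 - downtime_hours / period_hours


PERIOD_HOURS = 30 * 24   # assumed measurement window: 30 days
REPAIR_HOURS = 5 / 60    # the five minutes of hands-on repair work
TRAVEL_HOURS = 5.0       # the five hours before the engineer arrives

# The vendor counts only the repair time attributable to its system.
vendor_view = availability(REPAIR_HOURS, PERIOD_HOURS)

# The user is without service for the travel time plus the repair time.
user_view = availability(TRAVEL_HOURS + REPAIR_HOURS, PERIOD_HOURS)

print(f"vendor-counted availability:   {vendor_view:.4%}")
print(f"user-experienced availability: {user_view:.4%}")
```

Under these assumptions the outage the user sits through (five hours and five minutes) is 61 times longer than the five minutes the vendor would count, which is the distance between the product metric and the service the user actually receives.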