Source code analysis software vendor Coverity Inc has released the first results of its Homeland Security-backed open source code-scanning project, providing a baseline of code quality with which to judge open source code.
The San Francisco, California-based company published the results as part of the three-year Vulnerability Discovery and Remediation Open Source Hardening Project, launched in January.
The study provides a baseline for security and quality in open source software, with an average defect rate of 0.434 defects per thousand lines of code across the 32 open source projects tested. It also indicates that the LAMP stack (Linux, Apache, MySQL, PHP/Perl/Python) is of higher-than-average quality, with just 0.290 defects per thousand lines.
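For readers unfamiliar with the metric, defects per thousand lines of code can be sketched as below. This is a minimal illustration, not Coverity's own tooling: the function name `defect_density` and the sample project size are hypothetical, chosen only to show the arithmetic.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Hypothetical project (not from the study): 29 defects in 100,000 lines.
print(round(defect_density(29, 100_000), 3))  # prints 0.29
```

On this measure, a hypothetical project with 29 defects in 100,000 lines would sit at the 0.290 figure reported for the LAMP stack.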
The study was carried out at Stanford University using Coverity’s Prevent source code analysis tool and is part of the Department of Homeland Security’s Science and Technology Directorate initiative to develop technologies to protect the nation’s telecommunication infrastructure.
The Coverity tools represent the commercialization of a source code defect discovery approach created at Stanford. While the company admits that no automated source code tool can find every bug, it does now believe that code comparison is possible and practical.
“One of the goals of our research on software quality and security is to define a baseline so that people can measure software reliability in both open source and proprietary software projects,” said Ben Chelf, CTO of Coverity. “No technology can find all bugs in software, but we have collected a critical mass of data through an automated and repeatable analysis framework to show how software quality can be concretely assessed, compared, and ultimately improved.”
The analysis of 32 open source software projects involved the scanning of 17.5 million lines of code across the likes of Linux, Apache, MySQL, PostgreSQL, OpenSSL, OpenLDAP, Firefox, Samba, Python, Perl, PHP, Gnome, and FreeBSD.
Linux had a defect density of 0.335 defects per thousand lines of code, compared to Apache with 0.250, MySQL with 0.224, PHP with 0.474, Perl with 0.186, and Python with 0.372. The lowest defect density was 0.051 for the XMMS (X Multimedia System) project, while the highest was 1.237 for the Amanda backup and recovery project.
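As a rough illustration of how the reported figures compare with the 0.434 baseline, the densities quoted in this article can be tabulated as follows. The variable names are hypothetical, and only the eight projects whose figures appear here are included, so this is a sketch rather than a reproduction of the full 32-project data set.

```python
# Defects per thousand lines of code, as quoted in this article.
BASELINE = 0.434

reported = {
    "Linux": 0.335,
    "Apache": 0.250,
    "MySQL": 0.224,
    "PHP": 0.474,
    "Perl": 0.186,
    "Python": 0.372,
    "XMMS": 0.051,
    "Amanda": 1.237,
}

# Projects whose density exceeds the study-wide average.
above_baseline = sorted(name for name, d in reported.items() if d > BASELINE)

# Print projects from lowest to highest density, with their ratio to the baseline.
for name, density in sorted(reported.items(), key=lambda item: item[1]):
    print(f"{name:8s} {density:.3f}  {density / BASELINE:.2f}x baseline")

print("Above baseline:", above_baseline)
```

Of the projects quoted, only PHP and Amanda come in above the study-wide average.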
As well as identifying the problems, Coverity has passed the details on to project maintainers, and so expects the quality of the baseline to improve over time. As the project continues, it also hopes to use the data to answer further questions, such as which defects are considered critical and how the quality of development code compares to stable code.
The results will be continually updated and can be found at http://scan.coverity.com/.