Bugs, errors and causes from a 1975 paper

Half of the errors may result from factors other than the code itself.

Discover the timeless relevance of a 1975 paper that sheds light on the causes of bugs and errors in system programs.

In this article, we delve into Albert Endres' findings and explore how understanding the problem, effective communication, and domain knowledge play crucial roles in reducing errors.

These principles remain highly relevant in current software development practices, where collaboration, clear communication, and comprehensive knowledge of the project's domain could be crucial for the success of the product.

About the paper

In 1975, Albert Endres published a paper called "An Analysis of Errors and their Causes in System Programs".

In this paper, he analyzes the errors detected in DOS/VS, an operating system developed by IBM Laboratories in Germany.

The system was released in 1973. The paper analyses:

- About 500 modules, averaging 360 instructions per module (480 lines once comments are included), for a total of roughly 190K instructions and 60K lines of comments.

- A total of 740 problems were found, of which 432 were classified as program errors.
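As a quick sanity check on these figures, here is a minimal Python sketch. It assumes that the 480-line average counts instructions plus comments (an interpretation, not something stated explicitly above), and the names `summarize`, `MODULES`, and so on are made up for this illustration.

```python
# Figures as quoted in this article; the source values are rounded.
MODULES = 500
INSTRUCTIONS_PER_MODULE = 360  # average instructions per module
LINES_PER_MODULE = 480         # ASSUMPTION: average lines including comments
PROBLEMS_FOUND = 740
PROGRAM_ERRORS = 432


def summarize():
    """Derive totals and ratios from the quoted averages."""
    comment_lines_per_module = LINES_PER_MODULE - INSTRUCTIONS_PER_MODULE
    return {
        "instructions_total": MODULES * INSTRUCTIONS_PER_MODULE,
        "comments_total": MODULES * comment_lines_per_module,
        "program_error_share": PROGRAM_ERRORS / PROBLEMS_FOUND,
        "program_errors_per_module": PROGRAM_ERRORS / MODULES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under that assumption the comment total comes out at 60,000 lines, which matches the 60K quoted, and the instruction total at 180,000, close to the quoted 190K given rounded inputs. About 58% of the reported problems (432 of 740) were classified as program errors.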

Areas where errors were found

And here are some of the conclusions from this paper:

"Almost half of all errors are found in the area of understanding the problem, problem communication and of the knowledge of possibilities and procedures for problem-solving"

If we want to reduce the number of errors, we should focus not only on better programming techniques but also on defining the problem well and understanding the domain:

"This fact is alarming or encouraging, depending on the expectations we had for a hundred percent automation of software production.  More specifically, only half of the mistakes can be avoided with better programming techniques (better programming languages, more comprehensive test tools). The other half must be attacked with better methods of problem definition (specification languages), a better understanding of basic system concepts (training, education), and by making applicable algorithms available"

Based on the paper, here are the categories of causes for errors or bugs:

1. Technological = "definability of the problem, feasibility of solving it, available procedures and tools"

2. Organisational = "division of work load, available information, communication, resources"

3. Historic = "history of the project, of the program, special situations, and external influences"

4. Group dynamic = "willingness to cooperate, distribution of roles inside the project group"

5. Individual = "experience, talent, and constitution of the individual programmer"

6. Other

Some conclusions:

  1. When considering quality in a software product, we should consider all aspects, not only the tools and programming techniques we use.

  2. When joining a product or team, it is important to understand the history of the product & team.

  3. To become a better developer, one has to learn more than coding/engineering techniques.

  4. When we think about a possible future of using AI to augment programming, we should consider that, if this study's findings still hold, AI-assisted coding alone will not be enough to reduce the number of errors, since about half of them are not caused by code.


Enjoyed this article?

Join my Short Ruby News newsletter for weekly Ruby updates. For more Ruby learning resources, visit rubyandrails.info
