Major Software Bugs


Some recent major computer system failures caused by software bugs:

 

* Media reports in January of 2005 detailed severe problems with a $170 million high-profile U.S. government IT systems project. Software testing was one of the five major problem areas, according to a report of the commission reviewing the project. Studies were under way to determine which, if any, portions of the project could be salvaged.

* In July 2004 newspapers reported that a new government welfare management system in Canada costing several hundred million dollars was unable to handle a simple benefits rate increase after being put into live operation. Reportedly the original contract allowed for only 6 weeks of acceptance testing and the system was never tested for its ability to handle a rate increase. 

* Millions of bank accounts were impacted by errors due to installation of inadequately tested software code in the transaction processing system of a major North American bank, according to mid-2004 news reports. Articles about the incident stated that it took two weeks to fix all the resulting errors, that additional problems resulted when the incident drew a large number of e-mail phishing attacks against the bank's customers, and that the total cost of the incident could exceed $100 million. 

* A bug in site management software utilized by companies with a significant percentage of worldwide web traffic was reported in May of 2004. The bug resulted in performance problems for many of the sites simultaneously and required disabling of the software until the bug was fixed. 

* According to news reports in April of 2004, a software bug was determined to be a major contributor to the 2003 Northeast blackout, the worst power system failure in North American history. The failure involved loss of electrical power to 50 million customers, forced shutdown of 100 power plants, and economic losses estimated at $6 billion. The bug was reportedly in one utility company's vendor-supplied power monitoring and management system, which was unable to correctly handle and report on an unusual confluence of initially localized events. The error was found and corrected after examining millions of lines of code. 

* In early 2004, news reports revealed the intentional use of a software bug as a counter-espionage tool. According to the report, in the early 1980s one nation surreptitiously allowed a hostile nation's espionage service to steal a version of sophisticated industrial software that had intentionally added flaws. This eventually resulted in major industrial disruption in the country that used the stolen flawed software. 

* A major U.S. retailer was reportedly hit with a large government fine in October of 2003 due to web site errors that enabled customers to view one another's online orders. 

* News stories in the fall of 2003 stated that a manufacturing company recalled all their transportation products in order to fix a software problem causing instability in certain circumstances. The company found and reported the bug itself and initiated the recall procedure in which a software upgrade fixed the problems. 

* In January of 2001 newspapers reported that a major European railroad was hit by the aftereffects of the Y2K bug. The company found that many of their newer trains would not run due to their inability to recognize the date '31/12/2000'; the trains were started by altering the control system's date settings. 

* News reports in September of 2000 told of a software vendor settling a lawsuit with a large mortgage lender; the vendor had reportedly delivered an online mortgage processing system that did not meet specifications, was delivered late, and didn't work. 

* In early 2000, major problems were reported with a new computer system in a large suburban U.S. public school district with 100,000+ students; problems included 10,000 erroneous report cards and students left stranded by failed class registration systems; the district's CIO was fired. The school district decided to reinstate its original 25-year-old system for at least a year until the bugs were worked out of the new system by the software vendors. 

* In October of 1999 the $125 million NASA Mars Climate Orbiter spacecraft was believed to be lost in space due to a simple data conversion error. It was determined that spacecraft software used certain data in English units that should have been in metric units. Among other tasks, the orbiter was to serve as a communications relay for the Mars Polar Lander mission, which failed for unknown reasons in December 1999. Several investigating panels were convened to determine the process failures that allowed the error to go undetected. 

* Bugs in software supporting a large commercial high-speed data network affected 70,000 business customers over a period of 8 days in August of 1999. Among those affected was the electronic trading system of the largest U.S. futures exchange, which was shut down for most of a week as a result of the outages. 

* January 1998 news reports told of software problems at a major U.S. telecommunications company that resulted in no charges for long distance calls for a month for 400,000 customers. The problem went undetected until customers called up with questions about their bills.

Testing Procedure


What steps are needed to develop and run software tests?
 

The following are some of the steps to consider: 

* Obtain requirements, functional design, and internal design specifications and other necessary documents. 

* Obtain budget and schedule requirements. 

* Determine project-related personnel and their responsibilities, reporting requirements, and required standards and processes (such as release processes, change processes, etc.) 

* Identify application's higher-risk aspects, set priorities, and determine scope and limitations of tests. 

* Determine test approaches and methods - unit, integration, functional, system, load, usability tests, etc. 

* Determine test environment requirements (hardware, software, communications, etc.) 

* Determine testware requirements (record/playback tools, coverage analyzers, test tracking, problem/bug tracking, etc.) 

* Determine test input data requirements 

* Identify tasks, those responsible for tasks, and labor requirements 

* Set schedule estimates, timelines, milestones 

* Determine input equivalence classes, boundary value analyses, error classes (a small example follows this list) 

* Prepare test plan document and have needed reviews/approvals 

* Write test cases 

* Have needed reviews/inspections/approvals of test cases 

* Prepare test environment and testware, obtain needed user manuals/reference documents/configuration guides/installation guides, set up test tracking processes, set up logging and archiving processes, set up or obtain test input data 

* Obtain and install software releases 

* Perform tests 

* Evaluate and report results 

* Track problems/bugs and fixes 

* Retest as needed 

* Maintain and update test plans, test cases, test environment, and testware through life cycle 
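
A minimal sketch of how the equivalence class and boundary value steps above can turn into concrete test cases, using Python's standard unittest module. The 'age must be 18 to 65' requirement and the validate_age function are assumptions invented purely for illustration.

    import unittest

    def validate_age(age):
        # Hypothetical function under test: ages 18-65 inclusive are valid.
        return 18 <= age <= 65

    class AgeBoundaryTests(unittest.TestCase):
        def test_equivalence_classes_and_boundaries(self):
            # One pair per test case: invalid-low class, lower boundary,
            # valid class, upper boundary, invalid-high class.
            cases = [(17, False), (18, True), (40, True), (65, True), (66, False)]
            for age, expected in cases:
                with self.subTest(age=age):
                    self.assertEqual(validate_age(age), expected)

    if __name__ == '__main__':
        unittest.main()

Each (input, expected result) pair corresponds to one planned test case and can be reviewed and approved before any code is executed.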

Bug Tracking 

What's a 'test case'? 

* A test case is a document that describes an input, action, or event and an expected response, to determine if a feature of an application is working correctly. A test case should contain particulars such as test case identifier, test case name, objective, test conditions/setup, input data requirements, steps, and expected results (a minimal sketch of such a record follows below). 

* Note that the process of developing test cases can help find problems in the requirements or design of an application, since it requires completely thinking through the operation of the application. For this reason, it's useful to prepare test cases early in the development cycle if possible. 
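
A minimal sketch of such a test case record as a Python dataclass; the field names simply mirror the particulars listed above, and the login example is hypothetical rather than a prescribed format.

    from dataclasses import dataclass, field

    @dataclass
    class TestCase:
        identifier: str          # test case identifier
        name: str                # test case name
        objective: str           # what the test is meant to verify
        setup: str               # test conditions / setup
        input_data: str          # input data requirements
        steps: list = field(default_factory=list)   # ordered steps to execute
        expected_result: str = ""                    # what should happen if the feature works
        actual_result: str = ""                      # filled in when the test is run

    tc = TestCase(
        identifier="TC-001",
        name="Valid login",
        objective="Verify a registered user can log in",
        setup="User 'alice' exists with password 'secret'",
        input_data="username=alice, password=secret",
        steps=["Open login page", "Enter credentials", "Click 'Log in'"],
        expected_result="User is taken to the home page",
    )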

What should be done after a bug is found? 

* The bug needs to be communicated and assigned to developers who can fix it. After the problem is resolved, fixes should be re-tested, and determinations made regarding requirements for regression testing to check that fixes didn't create problems elsewhere. If a problem-tracking system is in place, it should encapsulate these processes. A variety of commercial problem-tracking/management software tools are available (see the 'Tools' section for web resources with listings of such tools). The following are items to consider in the tracking process (a brief sketch of such a record follows the list): 

* Complete information such that developers can understand the bug, get an idea of its severity, and reproduce it if necessary. 

* Bug identifier (number, ID, etc.) 

* Current bug status (e.g., 'Released for Retest', 'New', etc.) 

* The application name or identifier and version 

* The function, module, feature, object, screen, etc. where the bug occurred 

* Environment specifics, system, platform, relevant hardware specifics 

* Test case name/number/identifier 

* One-line bug description 

* Full bug description 

* Description of steps needed to reproduce the bug if not covered by a test case or if the developer doesn't have easy access to the test case/test script/test tool 

* Names and/or descriptions of file/data/messages/etc. used in test 

* File excerpts/error messages/log file excerpts/screen shots/test tool logs that would be helpful in finding the cause of the problem 

* Severity estimate (a 5-level range such as 1-5 or 'critical' to 'low' is common) 

* Was the bug reproducible? 

* Tester name 

* Test date 

* Bug reporting date 

* Name of developer/group/organization the problem is assigned to 

* Description of problem cause 

* Description of fix 

* Code section/file/module/class/method that was fixed 

* Date of fix 

* Application version that contains the fix 

* Tester responsible for retest 

* Retest date 

* Retest results 

* Regression testing requirements 

* Tester responsible for regression tests 

* Regression testing results 

* A reporting or tracking process should enable notification of appropriate personnel at various stages. For instance, testers need to know when retesting is needed, developers need to know when bugs are found and how to get the needed information, and reporting/summary capabilities are needed for managers. 
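
As a rough sketch of how the items above might be captured, the record below uses a Python dataclass; the field names, status values, and notification rules are illustrative assumptions rather than a standard.

    from dataclasses import dataclass
    from datetime import date
    from typing import Optional

    @dataclass
    class BugReport:
        bug_id: str                   # bug identifier
        status: str                   # e.g. 'New', 'Released for Retest', 'Closed'
        application: str              # application name/identifier and version
        location: str                 # function, module, feature, screen, etc.
        environment: str              # platform / hardware specifics
        summary: str                  # one-line bug description
        description: str              # full description and steps to reproduce
        severity: int                 # e.g. 1 (critical) to 5 (low)
        reproducible: bool
        tester: str
        reported_on: date
        assigned_to: Optional[str] = None
        fix_description: Optional[str] = None
        fixed_in_version: Optional[str] = None
        retest_result: Optional[str] = None

    def notify(bug: BugReport) -> str:
        # Stand-in for the notification step: route the bug to the right people
        # depending on its current status.
        if bug.status == "New":
            return f"Notify developers: {bug.bug_id} assigned to {bug.assigned_to or 'triage'}"
        if bug.status == "Released for Retest":
            return f"Notify tester {bug.tester}: retest {bug.bug_id}"
        return f"No action for {bug.bug_id} in status '{bug.status}'"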

Why does software have bugs? 


* Miscommunication or no communication - as to specifics of what an application should or shouldn't do (the application's requirements). 

* Software complexity - the complexity of current software applications can be difficult to comprehend for anyone without experience in modern-day software development. Windows-type interfaces, client-server and distributed applications, data communications, enormous relational databases, and sheer size of applications have all contributed to the exponential growth in software/system complexity. And the use of object-oriented techniques can complicate instead of simplify a project unless it is well engineered. 

* Programming errors - programmers, like anyone else, can make mistakes. 

* Changing requirements - the customer may not understand the effects of changes, or may understand and request them anyway - redesign, rescheduling of engineers, effects on other projects, work already completed that may have to be redone or thrown out, hardware requirements that may be affected, etc. If there are many minor changes or any major changes, known and unknown dependencies among parts of the project are likely to interact and cause problems, and the complexity of keeping track of changes may result in errors. Enthusiasm of engineering staff may be affected. In some fast-changing business environments, continuously modified requirements may be a fact of life. In this case, management must understand the resulting risks, and QA and test engineers must adapt and plan for continuous extensive testing to keep the inevitable bugs from running out of control. 

* Time pressures - scheduling of software projects is difficult at best, often requiring a lot of guesswork. When deadlines loom and the crunch comes, mistakes will be made. 

* Egos - people prefer to say things like: 

* 'no problem' 

* 'piece of cake' 

* 'I can whip that out in a few hours' 

* 'it should be easy to update that old code' 

* instead of: 

* 'that adds a lot of complexity and we could end up making a lot of mistakes' 

* 'we have no idea if we can do that; we'll wing it' 

* 'I can't estimate how long it will take, until I take a close look at it' 

* 'we can't figure out what that old spaghetti code did in the first place' 

* If there are too many unrealistic 'no problem's', the result is bugs. 

* Poorly documented code - it's tough to maintain and modify code that is badly written or poorly documented; the result is bugs. In many organizations management provides no incentive for programmers to document their code or write clear, understandable code. In fact, it's usually the opposite: they get points mostly for quickly turning out code, and there's job security if nobody else can understand it ('if it was hard to write, it should be hard to read'). 

* Software development tools - visual tools, class libraries, compilers, scripting tools, etc. often introduce their own bugs or are poorly documented, resulting in added bugs.

 

Test Cases, Suites, Scripts and Scenarios

Black box testers usually write test cases for the majority of their testing activities. A test case is usually a single step with its expected result, along with various additional pieces of information; it can occasionally be a series of steps with one expected result or outcome. Optional fields include a test case ID, test step or order-of-execution number, related requirement(s), depth, test category, author, and check boxes for whether the test is automatable and has been automated. Larger test cases may also contain prerequisite states or steps, and descriptions. A test case should also contain a place for the actual result. These steps can be stored in a word processor document, spreadsheet, database, or other common repository. In a database system, you may also be able to see past test results, who generated the results, and the system configuration used to generate them; these past results would usually be stored in a separate table. 

The most common term for a collection of test cases is a test suite. The test suite often also contains more detailed instructions or goals for each collection of test cases. It definitely contains a section where the tester identifies the system configuration used during testing. A group of test cases may also contain prerequisite states or steps, and descriptions of the following tests. 

Collections of test cases are sometimes incorrectly termed a test plan. They may also be called a test script, or even a test scenario. 

Most white box testers write and use test scripts in unit, system, and regression testing. Test scripts should be written for modules with the highest risk of failure and the highest impact if the risk becomes an issue. Most companies that use automated testing refer to the code that drives their tests as test scripts. 

A scenario test is a test based on a hypothetical story used to help a person think through a complex problem or system. They can be as simple as a diagram for a testing environment or they could be a description written in prose. The ideal scenario test has five key characteristics. It is (a) a story that is (b) motivating, (c) credible, (d) complex, and (e) easy to evaluate. They are usually different from test cases in that test cases are single steps and scenarios cover a number of steps. Test suites and scenarios can be used in concert for complete system tests. 

Scenario testing is similar to, but not the same as session-based testing, which is more closely related to exploratory testing, but the two concepts can be used in conjunction.

 

    

Manual Testing Basics

* In India itself, software industry growth has been phenomenal. 
* The IT field has grown enormously in the past 50 years. 
* The IT industry in India is expected to touch 10,000 crores, of which the software share is increasing dramatically. 

Software Crisis 
* Software cost/schedule estimates are grossly inaccurate. Cost overruns of several times and schedule slippages of months, or even years, are common. 
* Productivity of people has not kept pace with demand; added to this is the shortage of skilled people. 

Software Myths 
Management Myths 
* Software Management is different. 
* Why change our approach to development? 
* We have provided the state-of-the-art hardware. 
* Problems are technical. 
* If the project is late, add more engineers. 
* We need better people. 

Developers' Myths 
* We must start with firm requirements. 
* Why bother about software engineering techniques? I will go to the terminal and code it. 
* Once coding is complete, my job is done. 
* How can you measure quality? It is so intangible. 

Customers' Myths 
* A general statement of objectives is good enough to produce software. 
* Anyway, software is "Flexware"; it can accommodate my changing needs. 

What do we do? 
* Use Software Engineering techniques/processes. 
* Institutionalize them and make them part of your development culture. 
* Adopt Quality Assurance Frameworks: ISO, CMM. 
* Choose the one that meets your requirements and adapt it where necessary. 

Software Quality Assurance: 
* The purpose of Software Quality Assurance is to provide management with appropriate visibility into the process being used by the software project and of the products being built. 
* Software Quality Assurance involves reviewing and auditing the software products and activities to verify that they comply with the applicable procedures and standards and providing the software project and other appropriate managers with the results of these reviews and audits. 

Verification: 
* Verification typically involves reviews and meetings to evaluate documents, plans, code, requirements, and specifications. 
* The determination of consistency, correctness & completeness of a program at each stage. 

Validation: 
* Validation typically involves actual testing and takes place after verifications are completed. 
* The determination of correctness of a final program with respect to its requirements. 
Software Life Cycle Models: 
* Prototyping Model 
* Waterfall Model – Sequential 
* Spiral Model 
* V Model - Sequential 

What makes a good Software QA engineer? 
* The same qualities a good tester has are useful for a QA engineer. Additionally, they must be able to understand the entire software development process and how it can fit into the business approach and goals of the organization. Communication skills and the ability to understand various sides of issues are important. In organizations in the early stages of implementing QA processes, patience and diplomacy are especially needed. An ability to find problems as well as to see 'what's missing' is important for inspections and reviews. 

Testing: 
* An examination of the behavior of a program by executing on sample data sets. 
* Testing comprises a set of activities to detect defects in a produced material. 

Why Testing? 
* To unearth and correct defects. 
* To detect defects early and to reduce cost of defect fixing. 
* To ensure that product works as user expected it to. 
* To avoid user detecting problems. 

Test Life Cycle 
* Identify Test Candidates 
* Test Plan 
* Design Test Cases 
* Execute Tests 
* Evaluate Results 
* Document Test Results 
* Causal Analysis / Preparation of Validation Reports 
* Regression Testing / Follow up on reported bugs. 

Testing Techniques 
* Black Box Testing 
* White Box Testing 
* Regression Testing 
* These principles & techniques can be applied to any type of testing.

Black Box Testing 
* Testing of a function without knowing the internal structure of the program. 

White Box Testing 
* Testing of a function with knowledge of the internal structure of the program. 

Regression Testing 
* To ensure that code changes have not had an adverse effect on other modules or on existing functions. 

Functional Testing 
* Study SRS 
* Identify Unit Functions 
* For each unit function 
* - Take each input function 
* - Identify Equivalence class 
* - Form Test cases 
* - Form Test cases for boundary values 
* - Form Test cases for Error Guessing 
* Form a Unit Function vs. Test Cases cross-reference matrix 
* Find the coverage (a small sketch follows this list) 
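
A small sketch of the cross-reference matrix and coverage steps above; the unit function names and test case IDs are made up for illustration.

    # Map each unit function to the test cases that exercise it.
    matrix = {
        "calculate_interest": ["TC-01", "TC-02"],
        "validate_account":   ["TC-03"],
        "close_account":      [],          # no test cases yet
    }

    covered = sum(1 for tests in matrix.values() if tests)
    coverage = covered / len(matrix) * 100
    print(f"Coverage: {coverage:.0f}% ({covered} of {len(matrix)} unit functions)")

    for function, tests in matrix.items():
        if not tests:
            print(f"Gap: no test cases cover {function}")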

Unit Testing: 
* The most 'micro' scale of testing, used to test particular functions or code modules. Typically done by the programmer and not by testers. 
* Unit - smallest testable piece of software. 
* A unit can be compiled/ assembled/ linked/ loaded; and put under a test harness. 
* Unit testing is done to show that the unit does not satisfy the functional specification and/or that its implemented structure does not match the intended design structure. 

Integration Testing: 
* Integration is a systematic approach to build the complete software structure specified in the design from unit-tested modules. There are two ways integration is performed, called Pre-test and Pro-test. 
* Pre-test: the testing performed in the module development area is called Pre-test. Pre-test is required only if development is done in a module development area. 

Alpha testing: 
* Testing of an application when development is nearing completion; minor design changes may still be made as a result of such testing. Typically done by end-users or others, not by programmers or testers. 

Beta testing: 
* Testing when development and testing are essentially completed and final bugs and problems need to be found before final release. Typically done by end-users or others, not by programmers. 

System Testing: 
* A system is the biggest component: the complete, integrated application. 
* System testing is aimed at revealing bugs that cannot be attributed to a single component as such, but arise from inconsistencies between components or from the interactions between components. 
* Concern: issues, behaviors that can only be exposed by testing the entire integrated system (e.g., performance, security, recovery). 

Volume Testing: 
* The purpose of Volume Testing is to find weaknesses in the system with respect to its handling of large amounts of data during short time periods. For example, this kind of testing ensures that the system will process data across physical and logical boundaries such as across servers and across disk partitions on one server. 

Stress testing: 
* This refers to testing system functionality while the system is under unusually heavy or peak load; it's similar to the validation testing mentioned previously but is carried out in a "high-stress" environment. This requires that you make some predictions about expected load levels of your Web site. 
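
A rough sketch of the idea: issue many concurrent requests against an endpoint and count failures and slow responses. The URL, request counts, and thresholds are assumptions; a real stress test would be sized from the load predictions mentioned above.

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://localhost:8000/"   # hypothetical endpoint under test
    REQUESTS = 200                   # total requests to issue
    WORKERS = 50                     # concurrent simulated users

    def hit(_):
        start = time.time()
        try:
            with urllib.request.urlopen(URL, timeout=5) as resp:
                ok = resp.status == 200
        except Exception:
            ok = False
        return ok, time.time() - start

    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        results = list(pool.map(hit, range(REQUESTS)))

    failures = sum(1 for ok, _ in results if not ok)
    slow = sum(1 for _, t in results if t > 1.0)
    print(f"{failures} failures, {slow} responses slower than 1s out of {REQUESTS}")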

Usability testing: 
* Usability means that systems are easy and fast to learn, efficient to use, easy to remember, cause no operating errors and offer a high degree of satisfaction for the user. Usability means bringing the usage perspective into focus, the side towards the user. 

Security testing: 
* If your site requires firewalls, encryption, user authentication, financial transactions, or access to databases with sensitive data, you may need to test these and also test your site's overall protection against unauthorized internal or external access. 

Test Plan: 
* A Test Plan is a detailed project plan for testing, covering the scope of testing, the methodology to be used, the tasks to be performed, resources, schedules, risks, and dependencies. A Test Plan is developed prior to the implementation of a project to provide a well defined and understood project roadmap. 

Test Specification: 
* A Test Specification defines exactly what tests will be performed and what their scope and objectives will be. A Test Specification is produced as the first step in implementing a Test Plan, prior to the onset of manual testing and/or automated test suite development. It provides a repeatable, comprehensive definition of a testing campaign.

 

             

Fuzz Testing

Fuzz testing is a software testing technique. The basic idea is to attach the inputs of a program to a source of random data. If the program fails (for example, by crashing, or by failing in-built code assertions), then there are defects to correct. 
The great advantage of fuzz testing is that the test design is extremely simple, and free of preconceptions about system behavior. 
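
A minimal sketch of this idea in Python, assuming a hypothetical parse_record() function as the program under test: feed it blocks of random bytes and treat anything other than a clean rejection as a defect to investigate.

    import random

    def parse_record(data: bytes):
        # Stand-in for the program under test; rejects bad input with ValueError.
        text = data.decode("ascii")        # raises UnicodeDecodeError on odd bytes
        name, value = text.split("=", 1)   # raises ValueError if '=' is missing
        return name.strip(), int(value)    # raises ValueError on a non-numeric value

    random.seed(1)                          # fixed seed so any failure can be reproduced
    for i in range(10_000):
        blob = bytes(random.randrange(256) for _ in range(random.randrange(1, 64)))
        try:
            parse_record(blob)
        except (ValueError, UnicodeDecodeError):
            pass                            # expected, well-handled rejection
        except Exception as exc:            # anything else is a defect worth recording
            print(f"iteration {i}: {exc!r} on input {blob!r}")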

Uses 

Fuzz testing is often used in large software development projects that perform black box testing. These usually have a budget to develop test tools, and fuzz testing is one of the techniques which offers a high benefit:cost ratio. 

Fuzz testing is also used as a gross measurement of a large software system's quality. The advantage here is that the cost of generating the tests is relatively low. For example, third party testers have used fuzz testing to evaluate the relative merits of different operating systems and application programs. 

Fuzz testing is thought to enhance software security and software safety because it often finds odd oversights and defects which human testers would fail to find, and even careful human test designers would fail to create tests for. 

However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software handles exceptions without crashing, rather than behaving correctly. Thus, fuzz testing can only be regarded as a proxy for program correctness, rather than a direct measure, with fuzz test failures actually being more useful as a bug-finding tool than fuzz test passes as an assurance of quality. 

Fuzz testing methods 

As a practical matter, developers need to reproduce errors in order to fix them. For this reason, almost all fuzz testing makes a record of the data it manufactures, usually before applying it to the software, so that if the computer fails dramatically, the test data is preserved. 
Modern software has several different types of inputs: 

* Event driven inputs are usually from a graphical user interface, or possibly from a mechanism in an embedded system. 

* Character driven inputs are from files, or data streams. 

* Database inputs are from tabular data, such as relational databases. 

There are at least two different forms of fuzz testing: 

* Valid fuzz attempts to assure that the random input is reasonable, or conforms to actual production data. 

* Simple fuzz usually uses a pseudo random number generator to provide input. 

* A combined approach uses valid test data with some proportion of totally random input injected. 

By using all of these techniques in combination, fuzz-generated randomness can test the un-designed behavior surrounding a wider range of designed system states. 
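
A small sketch of the combined approach: start from a valid sample record and inject a proportion of totally random bytes at random positions. The sample record and mutation rate are arbitrary assumptions.

    import random

    valid_sample = b"name=alice;age=33;balance=100.50"   # assumed production-like record
    MUTATION_RATE = 0.05                                 # proportion of bytes to randomize

    def mutate(data: bytes) -> bytes:
        out = bytearray(data)
        for i in range(len(out)):
            if random.random() < MUTATION_RATE:
                out[i] = random.randrange(256)           # inject a totally random byte
        return bytes(out)

    random.seed(7)                                       # keep runs reproducible
    for _ in range(5):
        print(mutate(valid_sample))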

Fuzz testing may use tools to simulate all of these domains. 

Event-driven fuzz 

Normally this is provided as a queue of data structures. The queue is filled with data structures that have random values. 

The most common problem with an event-driven program is that it will often simply use the data in the queue, without even crude validation. To succeed in a fuzz-tested environment, software must validate all fields of every queue entry, decode every possible binary value, and then ignore impossible requests. 

One of the more interesting issues with real-time event handling is that if error reporting is too verbose, simply providing error status can cause resource problems or a crash. Robust error detection systems will report only the most significant, or most recent error over a period of time. 
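
A rough sketch of event-driven fuzz, assuming a simple dictionary-based event format: fill a queue with events whose fields hold random (often impossible) values, and make the handler validate every field and quietly ignore impossible requests.

    import queue
    import random

    EVENT_TYPES = ("click", "key", "resize")       # assumed event vocabulary

    def random_event():
        # Deliberately allow unknown types and out-of-range coordinates.
        return {
            "type": random.choice(EVENT_TYPES + ("???",)),
            "x": random.randint(-10_000, 10_000),
            "y": random.randint(-10_000, 10_000),
        }

    def handle(event) -> bool:
        # Validate every field before use; ignore impossible requests.
        if event.get("type") not in EVENT_TYPES:
            return False
        if not (0 <= event.get("x", -1) <= 1920 and 0 <= event.get("y", -1) <= 1080):
            return False
        return True                                 # the event would be processed here

    q = queue.Queue()
    random.seed(3)
    for _ in range(1_000):
        q.put(random_event())

    handled = sum(handle(q.get()) for _ in range(1_000))
    print(f"{handled} of 1000 random events passed validation")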

Character-driven fuzz 

Normally this is provided as a stream of random data. The classic source in UNIX is the random data generator. 

One common problem with a character driven program is a buffer overrun, when the character data exceeds the available buffer space. This problem tends to recur in every instance in which a string or number is parsed from the data stream and placed in a limited-size area. 

Another is that decode tables or logic may be incomplete, not handling every possible binary value. 

Database fuzz 

The standard database schema is usually filled with fuzz that is random data of random sizes. Some IT shops use software tools to migrate and manipulate such databases. Often the same schema descriptions can be used to automatically generate fuzz databases. 
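
A minimal sketch of generating a fuzz database from a schema description, using Python's built-in sqlite3 module; the table, column types, and value ranges are assumptions for illustration.

    import random
    import sqlite3
    import string

    schema = {"name": "TEXT", "age": "INTEGER", "balance": "REAL"}   # assumed schema description

    def random_value(sql_type):
        if sql_type == "INTEGER":
            return random.randint(-2**31, 2**31 - 1)
        if sql_type == "REAL":
            return random.uniform(-1e9, 1e9)
        length = random.randrange(0, 200)            # random-size strings, including empty
        return "".join(random.choice(string.printable) for _ in range(length))

    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in schema.items())
    conn.execute(f"CREATE TABLE customers ({columns})")

    random.seed(11)
    placeholders = ", ".join("?" for _ in schema)
    for _ in range(1_000):
        row = tuple(random_value(t) for t in schema.values())
        conn.execute(f"INSERT INTO customers VALUES ({placeholders})", row)

    # The client software under test would now be pointed at this database.
    print(conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0], "fuzz rows inserted")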

Database fuzz is controversial, because input and comparison constraints reduce the invalid data in a database. However, often the database is more tolerant of odd data than its client software, and a general-purpose interface is available to users. Since major customer and enterprise management software is starting to be open-source, database-based security attacks are becoming more credible. 

A common problem with fuzz databases is buffer overrun. A common data dictionary, with some form of automated enforcement, is quite helpful and entirely possible. To enforce this, normally all the database clients need to be recompiled and retested at the same time. Another common problem is that database clients may not understand the binary possibilities of the database field type, or legacy software might have been ported to a new database system with different possible binary values. A normal, inexpensive solution is to have each program validate database inputs in the same fashion as user inputs. The normal way to achieve this is to periodically "clean" production databases with automated verifiers.