Mission Critical Systems: Mars Climate Orbiter
Navigating the Mistakes of Mars: A Deep Dive into the MCO's Errors and Lessons for Future Exploration
Background:
The Mars Climate Orbiter (MCO) was one of three probes sent to Mars for research by NASA’s Mars Surveyor program (NASA, 2019). Its task was to study the weather, temperature, and atmospheric conditions of the planet (Wikipedia Contributors, 2018). It was meant to work in tandem with the other probes of the Mars Surveyor program to map the planet's surface and to use atmospheric and hydrological data to determine whether Mars could sustain, or had previously sustained, life. The MCO was launched on December 11, 1998, and took around nine and a half months to reach its destination (ThinkReliability, n.d.).
Actual Incident / Disaster:
On September 23, 1999, the Mars Climate Orbiter reached Mars and began the maneuver intended to place it in orbit around the planet. The spacecraft passed behind Mars 49 seconds earlier than NASA expected, and NASA lost all communication with the probe from that point on (Wikipedia Contributors, 2018). NASA kept trying to contact the probe until September 25, 1999, without success (Mars Climate Orbiter Mishap Investigation Board, 1999). The probe is believed to have passed through the Martian atmosphere at an altitude too low for it to survive and to have been destroyed as a result (NASA, 2019).
Root Cause:
Following the loss, NASA’s Mishap Investigation Board (MIB) held multiple meetings and briefings with the Lockheed Martin Astronautics team, which supplied software for the MCO, to determine the root cause of the failure. The board determined that a ground software file named “Small Forces” was coded to use English units instead of metric units. As a result, the thruster impulse data it produced was expressed in pound-force seconds rather than the newton-seconds that the trajectory models expected, which introduced errors into the computed trajectory of the Mars Climate Orbiter (Mars Climate Orbiter Mishap Investigation Board, 1999).
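To make the size of the unit error concrete, the following minimal Python sketch converts an impulse from pound-force seconds to newton-seconds. The function name and the sample value are hypothetical and chosen only for illustration; the conversion factor itself is the standard definition of the pound-force (1 lbf = 4.4482216152605 N).

```python
# Standard conversion: 1 lbf = 4.4482216152605 N,
# so 1 lbf*s of impulse equals 4.4482216152605 N*s.
LBF_S_TO_N_S = 4.4482216152605


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_S_TO_N_S


# Hypothetical sample value: a number written by the producer in lbf*s
# but read by the consumer as if it were already in N*s.
reported_value = 1.0
actual_impulse_n_s = lbf_s_to_n_s(reported_value)

# Ratio between the real impulse and the impulse the consumer assumed.
underestimate_factor = actual_impulse_n_s / reported_value
print(f"Each thruster firing is underestimated by a factor of {underestimate_factor:.2f}")
```

A consumer that reads the raw number as newton-seconds therefore underestimates the effect of every thruster firing by a factor of roughly 4.45, and the error accumulates over the months of cruise.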
Possible Prevention:
Despite the complexity of the MCO and the technology behind it, this mishap could have been discovered and prevented more easily if NASA had taken more time over the integration of the distinct components that made up the MCO and the Mars Surveyor program (Jet Propulsion Laboratory, 1999).
Possible Cause - Requirements:
Software requirements are an immensely important part of the development cycle of any piece of software because they state and define the purpose of the system, its constraints, and its conditions. During programming, the requirements must be referred to continually, and the software produced must conform to every requirement stated in the Software Requirements Specification (SRS). Unfortunately, it was determined that a failure to follow the requirements outlined in the Software Interface Specification (SIS) led to the “Small Forces” file being coded to output English units instead of the required metric units (Mars Climate Orbiter Mishap Investigation Board, 1999). This was not acceptable, especially for an agency such as NASA, which was flying millions of dollars of equipment to another planet for the sake of humanity.
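One way an interface requirement such as "impulse is delivered in metric units" can be enforced in code, rather than only on paper, is to make every value carry its unit across the interface. The Python sketch below illustrates that idea under stated assumptions: the Impulse type, the unit strings, and the trajectory_model_input function are all hypothetical and are not taken from the actual MCO software or the SIS.

```python
from dataclasses import dataclass

# Conversion factors to the unit the interface requires (newton-seconds).
# 1 lbf = 4.4482216152605 N by definition.
_TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": 4.4482216152605}


@dataclass(frozen=True)
class Impulse:
    """Hypothetical record type: a thruster impulse that always carries its unit."""

    value: float
    unit: str

    def __post_init__(self) -> None:
        # Reject any unit the interface does not know how to convert.
        if self.unit not in _TO_NEWTON_SECONDS:
            raise ValueError(f"unsupported impulse unit: {self.unit!r}")

    def to_newton_seconds(self) -> float:
        return self.value * _TO_NEWTON_SECONDS[self.unit]


def trajectory_model_input(impulse: Impulse) -> float:
    """Hypothetical consumer: always converts to N*s before using the value."""
    return impulse.to_newton_seconds()
```

With this design a producer cannot hand over a bare number. It must either declare a unit the consumer can convert or fail immediately with an error, so a mismatch surfaces at the interface instead of in the trajectory.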
Possible Cause - Technology:
No documented proof that the project was rushed could be found. Even so, it is easy to conclude that either the engineers at Lockheed Martin (who made the faulty software) or NASA did not take the time needed to check and test the software against the requirements, because NASA was able to determine the cause of the mishap quickly after the loss simply by holding briefings with the Lockheed Martin team. If the two organizations had taken the time to review and double-check the integration of all the components, this accident could have been prevented.
Possible Cause - Testing:
The software for the Mars Climate Orbiter can be considered a mission-critical system. This means that, in addition to the regular functional testing, fault-based testing, boundary testing, and interface testing it should have received, the “Small Forces” file should also have undergone additional verification such as proof by contradiction and run-time safety checking. Functional testing alone should have been enough to prevent this disaster: it is unreasonable to believe that it would not have caught the discrepancy between the units of measurement that the software output and the units of measurement required by NASA’s SIS.
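As an illustration of the kind of interface test described above, the Python sketch below compares a produced impulse value against an independently computed reference in newton-seconds and flags the specific case where the two differ by the pound-force conversion factor. The function name, the tolerance, and the return strings are assumptions made for this example; it is not a reconstruction of any test NASA or Lockheed Martin ran.

```python
import math

# 1 lbf = 4.4482216152605 N by definition.
LBF_TO_N = 4.4482216152605


def diagnose_impulse(produced: float, expected_n_s: float, rel_tol: float = 0.01) -> str:
    """Compare a produced impulse with a reference value given in N*s.

    Returns "ok" when they agree, a unit-mismatch message when the produced
    value matches the reference only after an lbf*s -> N*s conversion, and
    "mismatch" for any other disagreement.
    """
    if math.isclose(produced, expected_n_s, rel_tol=rel_tol):
        return "ok"
    if math.isclose(produced * LBF_TO_N, expected_n_s, rel_tol=rel_tol):
        return "unit mismatch: value looks like lbf*s"
    return "mismatch"


# Hypothetical reference case: the consumer expects 10.0 N*s.
print(diagnose_impulse(10.0, 10.0))
print(diagnose_impulse(10.0 / LBF_TO_N, 10.0))
```

The same comparison could run as a pre-launch test against known inputs or as a run-time check on live data, provided an independent reference value is available.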
Pre-Warning:
From the launch of the MCO until April 1999, the “Small Forces” file could not be used because of multiple errors in the file’s format, and Lockheed Martin spent that time fixing the file so that it could be used later. After the file had been “fixed” and put back into use, it began to output unreliable data (Mars Climate Orbiter Mishap Investigation Board, 1999). Between September 15, 1999 and September 23, 1999, multiple errors left the MCO at an altitude that was on average 60 km lower than planned (Wikipedia Contributors, 2018). According to the board, NASA was unable to monitor the file’s direct effect on the MCO’s thruster activity because the thrust was perpendicular to the Earth-to-spacecraft line of sight, which made the error hard to observe from the ground, so it went uncorrected (Mars Climate Orbiter Mishap Investigation Board, 1999).
Software and Electronics Role:
As discussed in the Root Cause section, an error by the developers at Lockheed Martin led to the loss of the Mars Climate Orbiter. The developers failed to follow the requirements for the software and to test it adequately, so the file output English units instead of the metric units required to model the MCO’s trajectory.
References
Jet Propulsion Laboratory. (1999, September 30). Mars Climate Orbiter Team Finds Likely Cause of Loss. NASA Jet Propulsion Laboratory (JPL). https://www.jpl.nasa.gov/news/mars-climate-orbiter-team-finds-likely-cause-of-loss
Mars Climate Orbiter Mishap Investigation Board. (1999). Mars Climate Orbiter Mishap Investigation Board Phase I Report. https://llis.nasa.gov/llis_lib/pdf/1009464main1_0641-mr.pdf
NASA. (2019, July 25). In depth | Mars climate orbiter. NASA Solar System Exploration. https://solarsystem.nasa.gov/missions/mars-climate-orbiter/in-depth/
ThinkReliability. (n.d.). Mars Orbiter | ThinkReliability, Root Cause Analysis Case Studies. ThinkReliability. Retrieved November 23, 2022, from https://www.thinkreliability.com/case_studies/root-cause-analysis-the-loss-of-the-mars-climate-orbiter/
Wikipedia Contributors. (2018, December 17). Mars Climate Orbiter. Wikipedia; Wikimedia Foundation. https://en.wikipedia.org/wiki/Mars_Climate_Orbiter