Systems Health Management
Systems Health Management plays a vital role in ensuring the cost-effective and safe operation of aerospace systems. It includes techniques for the detection, diagnosis, and prognosis of degradation or faults in a vehicle system or its components. Additionally, it provides a recommended response to the degradation or fault, such as a maintenance action (off-line) or reconfiguration for fault mitigation (on-line).
The broad benefits of systems health management include:
- Affordability (Reduced Life Cycle Cost)
- Preservation of mission goals
At NASA Glenn, the Systems Health Management sub-discipline has two primary technology focus areas.
- Aircraft Engine Gas Path Health Management
- Space Propulsion Systems Health Management
Aircraft Engine Gas Path Health Management
Gas path health management is a cornerstone capability for monitoring the health of aircraft gas turbine engines. Its founding principles are based upon the parameter interrelationships inherent within a gas turbine engine cycle. Through the analysis of engine sensor measurements collected over time, gas path health management enables the estimation and trending of performance deterioration occurring within the major modules of the engine as well as the diagnosis of system faults.
Key technologies developed to support gas path health management include the following.
- Propulsion Diagnostic Method Evaluation Strategy (ProDiMES)
- Optimal Tuner Selection for Self-Tuning Engine Models
- Integrated Architecture for Aircraft Engine Performance Monitoring and Fault Diagnostics
- Information Fusion
- Impact of Environmental Particulate Ingestion on Aircraft Engine Performance
Propulsion Diagnostic Method Evaluation Strategy (ProDiMES)
A standard benchmarking problem and evaluation metrics to enable the comparison of candidate aircraft engine gas path diagnostic methods
Many of the propulsion gas path diagnostic method solutions published in the open literature are applied to different platforms, with different levels of complexity, addressing different problems, and using different metrics for evaluating performance. As such, it is difficult to perform a one-to-one comparison of candidate approaches. Furthermore, these inconsistencies create barriers to effective development of new algorithms and the exchange of results.
To help address these issues, the Propulsion Diagnostic Method Evaluation Strategy (ProDiMES) software tool has been specifically designed with the intent to be made publicly available. In this form it can serve as a reference, or theme problem, to aid in propulsion gas path diagnostic technology development and evaluation.
The overall goal is to provide a tool that will serve as an industry standard and will truly facilitate the development and evaluation of significant Engine Health Management (EHM) capabilities. ProDiMES has been developed under a collaborative project of The Technical Cooperation Program (TTCP) based on feedback provided by individuals within the aircraft engine health management community.
Figure: ProDiMES Benchmarking Process
The ProDiMES tool is coded in MATLAB (The Mathworks, Inc.), and consists of the following functions:
- Engine Fleet Simulator (EFS): Emulates the collection of data at takeoff and cruise from a fleet of engines over their lifetime of use.
- Diagnostic Methods: User-provided methods designed to process the simulated parameter histories produced by the EFS and generate a diagnostic assessment for each engine on each flight.
- Metrics: A software program automatically evaluates and archives the performance of diagnostic solutions against established metrics.
- Blind Test Case Data: Enables the side-by-side comparison of diagnostic solutions developed by multiple users. The target false positive rate for these diagnostic solutions is less than one false positive per 1000 flights.
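As an illustration of how a false positive target like the one above might be checked, the following is a minimal sketch and not the ProDiMES metrics code. The `FlightResult` record and both function names are hypothetical, and the sketch assumes the rate is counted over fault-free flights only.

```python
from dataclasses import dataclass


@dataclass
class FlightResult:
    """One diagnostic assessment for one engine on one flight (hypothetical record)."""
    fault_present: bool   # ground truth, known to the simulator
    fault_declared: bool  # output of the diagnostic method under test


def false_positives_per_1000_flights(results):
    """False alarms per 1000 fault-free flights (assumed denominator)."""
    fault_free = [r for r in results if not r.fault_present]
    if not fault_free:
        raise ValueError("no fault-free flights to evaluate")
    false_alarms = sum(1 for r in fault_free if r.fault_declared)
    return 1000.0 * false_alarms / len(fault_free)


def true_positive_rate(results):
    """Fraction of faulty flights on which a fault was declared."""
    faulty = [r for r in results if r.fault_present]
    if not faulty:
        raise ValueError("no faulty flights to evaluate")
    return sum(1 for r in faulty if r.fault_declared) / len(faulty)
```

A candidate method would meet the stated target in this sketch when `false_positives_per_1000_flights(results)` is below 1.0.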
Requesting Access to ProDiMES
ProDiMES is available through the NASA Software Repository.
Optimal Tuner Selection for Self-Tuning Engine Models
An emerging approach within the aircraft engine community is the inclusion of adaptive on-board engine models embedded within the engine control computer. These models typically include a Kalman filter-based tracking filter that tunes the model to match the physical engine performance based on available sensor measurements. The benefits of self-tuning on-board engine models include:
- Continuous, real-time engine condition monitoring
- Estimation of unmeasured engine parameters that can be used for controls purposes
- Diagnostics, Prognostics, and Controls applicability
Aircraft engine performance estimation is an underdetermined problem: there are more unknowns than available sensor measurements.
To address this, NASA has developed an “optimal tuner” selection methodology that has been shown to significantly improve on-board engine performance estimation accuracy in the presence of turbomachinery deterioration. This methodology constructs an optimal tuning parameter vector that is:
- Reflective of the effects of turbomachinery performance deterioration
- Of appropriate dimension for application within a Kalman filter
- Selected to minimize Kalman filter mean square error
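To make the tracking-filter idea concrete, the following is a minimal sketch of a scalar Kalman filter that tunes a single parameter so that a model matches a sensor. It is not NASA's optimal tuner selection methodology: it assumes one tuner modeled as a random walk, one sensor, and a constant, hypothetical sensitivity between them. The function name and all parameters are illustrative.

```python
def track_tuner(residuals, sensitivity, q_var, r_var, x0=0.0, p0=1.0):
    """Estimate one tuning parameter from sensor residuals.

    residuals   -- measured sensor value minus nominal model output, per sample
    sensitivity -- assumed constant d(sensor)/d(tuner)
    q_var       -- process noise variance (how fast the tuner may drift)
    r_var       -- sensor noise variance
    """
    x, p = x0, p0
    estimates = []
    for z in residuals:
        # Predict: a random-walk tuner keeps its value, but uncertainty grows.
        p = p + q_var
        # Update: blend the prediction with the new residual.
        gain = p * sensitivity / (sensitivity * p * sensitivity + r_var)
        x = x + gain * (z - sensitivity * x)
        p = (1.0 - gain * sensitivity) * p
        estimates.append(x)
    return estimates
```

With a hypothetical sensitivity of 50 sensor units per unit of tuner and a true shift of -0.02, feeding residuals near -1.0 drives the estimate toward -0.02. The real problem is multivariate and underdetermined, which is what the tuner selection step addresses and what this sketch leaves out.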
Integrated Architecture for Aircraft Engine Performance Monitoring and Fault Diagnostics
Conventional aircraft gas turbine engine gas path health management approaches:
- Process “snapshot” measurements post-flight
- Enable estimation and trending of engine performance and gas path fault diagnostics
- Can struggle to provide early diagnosis of incipient fault conditions with minimal latency
Emerging Diagnostics Approach
- Advances in on-board processing and flight data recording capabilities are enabling new diagnostic approaches
- Acquisition of full-flight streaming/continuous measurement data is now possible
- Requires new approaches to analyze expanded quantity and format of data
NASA’s Integrated Architecture for Aircraft Engine Performance Monitoring and Fault Diagnostics
Provides an architecture that analyzes streaming measurement data and performs combined performance trend monitoring and gas path diagnostics.
- Real-time self-tuning model produces engine performance parameter estimates that can be used for controls purposes
- Performance baseline model provides a baseline of recent past engine performance that can be referenced for fault diagnostic purposes
- Consistent with “Digital Twin” philosophy
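The interplay between a real-time estimate and a baseline of recent past performance can be sketched as follows. This is an illustration under stated assumptions, not NASA's architecture: the baseline here is a simple exponential average, and the function name, gain, and threshold are hypothetical.

```python
def detect_shifts(estimates, baseline_gain=0.02, threshold=0.01):
    """Flag samples where the real-time estimate departs from the baseline.

    estimates -- stream of real-time performance parameter estimates
    Returns a list of (sample_index, residual) for flagged samples.
    """
    if not estimates:
        return []
    baseline = estimates[0]
    flagged = []
    for i, value in enumerate(estimates):
        residual = value - baseline
        if abs(residual) > threshold:
            # Abrupt departure from recent performance: candidate fault.
            flagged.append((i, residual))
        else:
            # Gradual change: fold it into the baseline as normal deterioration.
            baseline += baseline_gain * residual
    return flagged
```

The design choice in this sketch is that flagged samples do not update the baseline, so an abrupt shift is not absorbed as if it were gradual deterioration.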
Information Fusion
Information fusion approaches leverage information available from multiple sources to yield improved accuracy and confidence in engine health management inferences. NASA has worked to develop and apply information fusion approaches in past partnerships with Pratt & Whitney and Honeywell. Elements of these approaches include:
- Modular hierarchical architectures, which accommodate the inclusion of multiple information sources
- Data alignment strategies to transform disparate data sources to a common format and sample rate
- Bayesian inference strategies to incorporate event occurrence and observation probabilities and to provide confidence levels in diagnostic inferences
Information sources that can be fused include:
- Health information from various subsystems (e.g., gas path, vibration, lubrication)
- Recent maintenance actions
- Opposite engine health information
- Control information—fault codes, limit activation
- Fleet-wide engine statistics
- Domain expert knowledge / heuristics
- Negative information (the absence of information can be significant)
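The Bayesian inference strategy mentioned above can be sketched as a simple posterior update over competing health hypotheses. This is a minimal illustration, not the method used in the NASA partnerships: it assumes the sources are conditionally independent given the hypothesis, and every name and probability is hypothetical. A source that stays silent still contributes, which is one way to use negative information.

```python
def fuse(priors, likelihoods, observations):
    """Combine alerts from several sources into posterior probabilities.

    priors       -- {hypothesis: P(hypothesis)}
    likelihoods  -- {source: {hypothesis: P(source alerts | hypothesis)}}
    observations -- {source: True if it alerted, False if it stayed silent}
    """
    posterior = dict(priors)
    for source, alerted in observations.items():
        for hypothesis in posterior:
            p_alert = likelihoods[source][hypothesis]
            # A silent source multiplies in P(no alert | hypothesis).
            posterior[hypothesis] *= p_alert if alerted else (1.0 - p_alert)
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observations are impossible under every hypothesis")
    return {h: p / total for h, p in posterior.items()}
```

For example, with a 1% prior on a fault, agreement between a gas path alert and a vibration alert raises the fault posterior far more than the gas path alert alone does when the vibration monitor stays silent.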
Impact of Environmental Particulate Ingestion on Aircraft Engine Performance
A number of in-flight aircraft engine power loss events have occurred due to the ingestion of ice crystals or volcanic ash.
- Engine icing occurs when ice crystals enter the engine’s core, accrete, and grow, which can cause a loss of thrust, engine stall, surge, and potential damage to turbomachinery due to ice shedding
- Flying through volcanic ash clouds exposes the engine to ash ingestion. The ash is highly erosive to engine components and can also melt and form glass on hot section components, restricting air flow
The NASA Glenn Intelligent Control and Autonomy Branch has partnered in the modeling and analysis of engine system level performance effects caused by engine icing and volcanic ash ingestion.
Space Propulsion Health Management
The primary objective of Space Propulsion Health Management (SPHM) is to give vehicle propulsion systems the capabilities needed to preserve the vehicle’s ability to achieve mission goals and to ensure safe operation by:
- Protecting the Crew
- Protecting the Mission
- Protecting the Vehicle
SPHM spans the spectrum of a vehicle’s lifecycle from development to operations, and may include:
- Design-time prevention of failures through design margins and quality assurance
- Operational prevention and mitigation (Fault Management)
GRC has more than 30 years of experience developing, implementing, and deploying system health management technologies for launch vehicle flight and ground systems. These technologies provide a broad scope of capabilities, including:
- Flight Computer Algorithm Development
- Sensor Data Qualification and Consolidation
- Functional Fault Modeling
- Analyses to support the design and verification of SPHM algorithms for flight and ground applications
Application of these capabilities helps to ensure that NASA systems operate safely, reliably, and with greater availability in support of mission success.
Flight Computer Algorithm Development
GRC-developed Space Propulsion Health Management (SPHM) algorithms for flight computer applications provide the following mission and fault management (M&FM) capabilities for space vehicles:
- Control of nominal vehicle operation during pre-launch and in flight
- Fault management to detect, identify, and isolate off-nominal conditions (vehicle faults) and provide the following predetermined operational responses:
  - Redundancy Management (RM) to maintain flight critical functions
  - Caution and Warning (C&W) to provide crew and ground with situational awareness of vehicle conditions related to key failures
  - Safing Actions:
    - Pre-launch – prevent launching the vehicle with failures that could propagate to a loss of mission and a need to abort
    - Inflight – prevent failures from propagating to an uncontained failure and loss of vehicle
Sensor Data Qualification and Consolidation
Avionics hardware redundancies provide fault tolerant flight-critical sensor measurements to assure sufficient functionality in the event of component failure. These measurements may be characterized as follows.
- Redundant sensors measure the same physical property
- Redundant avionics boxes process and digitize sensor signals
- Redundant data busses transmit each sensor measurement on multiple data paths
Sensor Data Qualification and Consolidation (SDQC) algorithms are implemented to:
- Identify and flag anomalous sensor data so that it does not negatively influence higher-level vehicle control and decision algorithms.
- Reduce the redundant sensor data for each given measurement to a single value for use by higher-level vehicle control and decision algorithms.
The design, implementation, and verification of SDQC algorithms present a number of significant challenges.
- Nominal data from redundant sensors may differ for a number of reasons (e.g., noise, bias, drift)
- Faults can originate from anywhere in the signal path from the sensor to the flight computer. As a result, the failure signatures can be difficult to predict.
- Analyses must show that the false positive and false negative qualification rates are far below the failure rates associated with the sensor data hardware and software.
GRC has developed Sensor Data Qualification and Consolidation (SDQC) algorithms for the Space Launch System (SLS) flight computers. These are real-time algorithms that process flight-critical sensor data prior to use of the data for onboard vehicle control and decision-making. The algorithms do this by:
- Monitoring and evaluating sensor data to detect the occurrence of faults and anomalies at any point along the data transmission and processing path;
- Providing higher-level flight computer functions with Data Quality Indicators (DQI) that capture analysis results for each sensor;
- Systematically reducing redundant sensor data streams that arise from hardware redundancies built into vehicle avionics.
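The qualify-then-consolidate pattern described above can be sketched for one measurement with several redundant channels. This is a minimal illustration and not the SLS flight algorithm: the function name, the range check, the median-based consistency check, and both thresholds are assumptions made for the example.

```python
from statistics import median


def qualify_and_consolidate(readings, valid_range, max_deviation):
    """Qualify redundant readings and reduce them to one value.

    readings      -- one value per redundant channel, None if data is missing
    valid_range   -- (low, high) physically plausible limits (assumed)
    max_deviation -- allowed distance from the channel median (assumed)
    Returns (consolidated_value_or_None, data_quality_indicators).
    """
    low, high = valid_range
    # Check 1: each channel must be present and within plausible limits.
    in_range = [r is not None and low <= r <= high for r in readings]
    candidates = [r for r, ok in zip(readings, in_range) if ok]
    if not candidates:
        return None, [False] * len(readings)
    # Check 2: each surviving channel must agree with the others.
    center = median(candidates)
    dqi = [ok and abs(r - center) <= max_deviation
           for r, ok in zip(readings, in_range)]
    qualified = [r for r, good in zip(readings, dqi) if good]
    # Consolidate: the median of qualified channels is robust to one outlier.
    consolidated = median(qualified) if qualified else None
    return consolidated, dqi
```

The per-channel flags play the role of the Data Quality Indicators, and the caller receives `None` when no channel can be trusted, so higher-level logic can decide how to respond.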
Functional Fault Models
Functional Fault Models (FFM) are directed graph-based models designed to provide a qualitative representation of failure effect propagation paths within a given system architecture. Failures can be propagated from their source to the sensors available to detect the failure effects. Unique features of these models include the following.
- These models can span the spectrum of abstraction from qualitative to detailed quantitative simulations.
- Early in the design process, quantitative data needed to support failure detection assessment are not typically available or have significant uncertainty. Qualitative FFMs can support early failure detection designs with detection and isolation metrics and can evolve in detail and complexity as the monitored system and the monitoring processes mature.
- These models can also be used to support design assessment and can evolve to provide operational support with real-time or off-line diagnostics and fault isolation.
The figure below provides an example of a directed graph model. Failure mode effects FM1, FM2, and FM3 in the electrical component block on the left are transformed as they propagate through various components to the sensor element block on the right. Along the propagation path, tests TP1, TP2, and TP3 are used to detect the failure mode effects. The completed graph can be followed in reverse to identify the failure mode or modes from a specific set of tests.
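The forward propagation and reverse lookup described above can be sketched with a small directed graph. The topology below is hypothetical and is not the graph in the figure: only the labels FM1 to FM3 and TP1 to TP3 come from the text, and the intermediate node names are illustrative. The sketch also assumes a single active failure and that a test fails whenever a failure effect reaches it.

```python
from collections import deque

# Hypothetical propagation paths: node -> nodes its failure effects reach next.
EDGES = {
    "FM1": ["power_bus"],
    "FM2": ["power_bus", "valve_driver"],
    "FM3": ["valve_driver"],
    "power_bus": ["TP1", "controller"],
    "controller": ["TP2"],
    "valve_driver": ["valve"],
    "valve": ["TP3"],
}
TESTS = {"TP1", "TP2", "TP3"}
FAILURE_MODES = ["FM1", "FM2", "FM3"]


def reachable_tests(start, edges=EDGES, tests=TESTS):
    """Follow the graph forward and collect every test the effect reaches."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return frozenset(seen & tests)


def signatures(failure_modes=FAILURE_MODES):
    """Map each failure mode to the set of tests that would detect it."""
    return {fm: reachable_tests(fm) for fm in failure_modes}


def isolate(failed_tests, sigs):
    """Reverse lookup: failure modes whose signature matches the failed tests."""
    failed = frozenset(failed_tests)
    return sorted(fm for fm, sig in sigs.items() if sig == failed)


def detection_coverage(sigs):
    """Fraction of failure modes that reach at least one test."""
    return sum(1 for sig in sigs.values() if sig) / len(sigs)
```

Two failure modes with identical signatures would both be returned by `isolate`, which is how an ambiguity group shows up in this kind of qualitative analysis.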
Challenges associated with FFMs include the following.
- Model development can be tedious and time consuming. Automated development/update tools and techniques need to be established to import data from design, operation, and failure documentation.
- The development of an ontology and modeling practices can be a challenge due to the multidisciplinary nature of FFMs. Well-defined, consistently-used ontologies and modeling practices are essential to facilitate model integration and reuse.
- Scalability issues for large systems and long operational timelines can impact model development, model verification and validation, and diagnostic processing. Large models can contain a significant amount of qualitative information, so development and testing environments are needed that enable efficient model maintenance and support regression and verification/validation testing.
FFMs are developed using the following information:
- Design documents (schematics and design descriptions)
- Nominal and off-nominal operational timelines
- Failure Modes and Effects Analysis (FMEA)
Recent NASA projects have developed qualitative FFMs using a commercial modeling tool that includes both a graphical user interface to create FFMs and built-in analysis processing to perform diagnostic assessments (i.e., detection coverage and failure isolation). For the NASA Space Launch System (SLS) Mission and Fault Management (M&FM) project, initial FFMs were developed at the subsystem level and then integrated into the larger vehicle-level FFM. A recent version of this FFM, the SLS Integrated Vehicle Failure Model (IVFM), contained over 40,000 failure modes.
Applications & Tools
FFMs have been used to support the analysis of failure management systems, verification of failure detection coverage, and online or off-line failure detection and classification.
More recent NASA FFM applications/demonstrations:
- SLS M&FM IVFM – Subsystem models (Avionics; Main Propulsion System, RS-25 engines; Booster Solid Rocket Motors and supporting systems; Core Stage Thrust Vector Control; Flight Termination System; Electrical Power System)
- Command, Control, Communications & Range (C3R) Project Advanced Ground Systems Maintenance (AGSM) Element – Ground Based demonstration FFM of the Universal Propellant Servicing System (UPSS)
FFM Support Tools
The following tools are available to support the development and use of FFMs.
- Extended Testability Analysis (ETA) Tool [LEW-19241] – Extends/enhances analysis reporting from FFM commercial development package. (U.S. release only through software.nasa.gov)
- GENeric Model INstantIator (GEMINI) Tool [LEW-19329] – Interface architecture for instantiation of Generic FFM models. (NASA release only)
- Verification Analysis (VERA) Tool [LEW-19312] – FFM compliance verification. (NASA release only)
- SWsetter – Efficient modification of system mode configuration across large FFMs. (NASA release only)
- Test Description File (TDF) Creator – Qualitative FFM interface development tool for real-time data processing. (NASA release only)
Goal Tree/Success Tree
Goal Tree/Success Tree (GT/ST) is a functional decomposition framework for modeling complex physical systems.
- GT/ST extends the classical function decomposition, breaking down top-level system goals into sub-goals and functional requirements and identifying state variables to monitor along the tree.
- GT/STs can be used to assess the monitoring coverage of the top-level goals, determining coverage of loss of critical function, rather than coverage of critical failures which could impact multiple functions.
- GT/ST provides traceability between system goals or requirements and the low-level functions required to achieve those goals.
Challenges in applying GT/ST include the following.
- Establishing a consistent ontology and model representation that supports the functional decomposition logic and transitions natural-language information into model-based format.
- Converting the GT/ST model into a dynamic knowledge base for real-time operator assistance.
A GT/ST is developed through the following steps.
- Establish a limited set of top-level goals, such as protect the crew and achieve trans-lunar orbit.
- Determine, for each goal or sub-goal, the complete set of functions required to achieve it. Continue this process until the functions reach a level that is no longer monitored.
- At each function level, define any state variables that would indicate functional success, and the success ranges for those variables.
- Identify system redundancies and properly capture feedback loops within the tree.
- Finally, if the system allows off-nominal operation, identify transition points within the tree where not achieving a function or goal would transition the system to an alternate state. This could be an abort of operation or an alternate operating state to achieve a new set of goals. These transition points could ultimately be portals to additional GT/STs.
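The decomposition and monitoring ideas above can be sketched as a small tree structure. This is a minimal illustration under stated assumptions: every child is treated as required (AND logic), redundancies, feedback loops, and state transitions are left out, and the node names, state variable, and success range are hypothetical apart from the "protect the crew" goal taken from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A goal, sub-goal, or function in the tree."""
    name: str
    children: list = field(default_factory=list)
    variable: Optional[str] = None         # monitored state variable, if any
    success_range: Optional[tuple] = None  # (low, high) for that variable


def leaves(node):
    """Return the lowest-level functions under a node."""
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def monitoring_coverage(goal):
    """Fraction of lowest-level functions that have a monitored variable."""
    leaf_nodes = leaves(goal)
    monitored = [n for n in leaf_nodes if n.variable is not None]
    return len(monitored) / len(leaf_nodes)


def is_achieved(node, state):
    """True or False when determinable, None when monitoring is missing."""
    if not node.children:
        if node.variable is None or node.variable not in state:
            return None
        low, high = node.success_range
        return low <= state[node.variable] <= high
    results = [is_achieved(child, state) for child in node.children]
    if any(r is False for r in results):
        return False  # one required function failed, so the goal is lost
    if any(r is None for r in results):
        return None   # nothing failed, but success cannot be confirmed
    return True


# Hypothetical example tree.
pressure = Node("Maintain cabin pressure", variable="cabin_pressure_kpa",
                success_range=(95.0, 105.0))
temperature = Node("Maintain cabin temperature")  # no monitored variable
protect_crew = Node("Protect the crew", children=[pressure, temperature])
```

In this example `monitoring_coverage(protect_crew)` reports that half of the lowest-level functions are monitored, which is the kind of gap a coverage assessment against top-level goals is meant to expose.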