HSG80 Array Controller ACS Version 8.6
Troubleshooting Reference Guide

First Edition (June 2001)
Part Number: EK-G80TR-SA. A01
Compaq Computer Corporation

© 2001 Compaq Computer Corporation.

Compaq, the Compaq logo, and StorageWorks Registered in U.S. Patent and Trademark Office. OpenVMS is a trademark of Compaq Information Technologies Group, L.P. in the United States and other countries. Intel is a trademark of Intel Corporation in the United States and other countries. UNIX is a trademark of The Open Group in the United States and other countries. All other product names mentioned herein may be trademarks of their respective companies.

Confidential computer software. Valid license from Compaq required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information in this document is provided "as is" without warranty of any kind and is subject to change without notice. The warranties for Compaq products are set forth in the express limited warranty statements accompanying such products. Nothing herein should be construed as constituting an additional warranty.

Compaq service tool software, including associated documentation, is the property of and contains confidential technology of Compaq Computer Corporation. Service customer is hereby licensed to use the software only for activities directly relating to the delivery of, and only during the term of, the applicable services delivered by Compaq or its authorized service provider. Customer may not modify or reverse engineer, remove, or transfer the software or make the software or any resultant diagnosis or system management data available to other parties without Compaq's or its service provider's consent.
Upon termination of the services, customer will, at Compaq's or its service provider's option, destroy or return the software and associated documentation in its possession.

Printed in the U.S.A.

Contents

About This Guide
  Text Conventions
  Symbols in Text
  Symbols on Equipment
  Rack Stability
  Getting Help
  Compaq Technical Support
  Compaq Website
  Compaq Authorized Reseller
Chapter 1 Troubleshooting Information
  Typical Installation Troubleshooting Checklist
  Troubleshooting Table
  Significant Event Reporting
  Reporting Events That Cause Controller Operation to Halt
  Flashing OCP Pattern Display Reporting
  Solid OCP Pattern Display Reporting
  Last Failure Reporting
  Reporting Events That Allow Controller Operation to Continue
  Spontaneous Event Log
  CLI Event Reporting
  Running the Controller Diagnostic Test
  ECB Charging Diagnostics
  Battery Hysteresis
  Caching Techniques
  Read Caching
  Read-Ahead Caching
  Write-Through Caching
  Write-Back Caching
  Fault-Tolerance for Write-Back Caching
  Nonvolatile Memory
  Cache Policies Resulting from Cache Module Failures
  Enabling Mirrored Write-Back Cache
Chapter 2 Utilities and Exercisers
  Fault Management Utility (FMU)
  Displaying Failure Entries
  Translating Event Codes
  Controlling the Display of Significant Events and Failures
  Video Terminal Display (VTDPY) Utility
  Restrictions with VTDPY
  Running VTDPY
  VTDPY Help
  VTDPY Display Screens
  Default Screen
  Controller Status Screen
  Cache Performance Screen
  Device Performance Screen
  Host Ports Statistics Screen
  Resource Statistics Screen
  Remote Status Screen
  Interpreting VTDPY Screen Information
  Screen Header
  Common Data Fields
  Unit Performance Data Fields
  Device Performance Data Fields
  Device Port Performance Data Fields
  Host Port Configuration
  TACHYON Chip Status
  Runtime Status of Remote Copy Sets
  Device Port Configuration
  Controller/Processor Utilization
  Resource Performance Statistics
  Disk Inline Exerciser (DILX)
  Checking for Unit Problems
  Finding a Unit in the Subsystem
  Testing the Read Capability of a Unit
  Testing the Read and Write Capabilities of a Unit
  DILX Error Codes
  Format and Device Code Load Utility (HSUTIL)
  Configuration (CONFIG) Utility
  Code Load and Code Patch (CLCP) Utility
  Clone (CLONE) Utility
  Field Replacement Utility (FRUTIL)
  Change Volume Serial Number (CHVSN) Utility
Chapter 3 Event Reporting Templates
  Passthrough Device Reset Event Sense Data Response
  Last Failure Event Sense Data Response (Template 01)
  Multiple-Bus Failover Event Sense Data Response (Template 04)
  Failover Event Sense Data Response (Template 05)
  Nonvolatile Parameter Memory Component Event Sense Data Response (Template 11)
  Backup Battery Failure Event Sense Data Response (Template 12)
  Subsystem Built-In Self Test Failure Event Sense Data Response (Template 13)
  Memory System Failure Event Sense Data Response (Template 14)
  Device Services Nontransfer Error Event Sense Data Response (Template 41)
  Disk Transfer Error Event Sense Data Response (Template 51)
  Data Replication Manager Services Event Sense Response (Template 90)
Chapter 4 ASC/ASCQ, Repair Action, and Component Identifier Codes
  Vendor Specific SCSI ASC/ASCQ Codes
  Recommended Repair Action Codes
  Component ID Codes
Chapter 5 Instance Codes
  Instance Code Structure
  Instance Codes and FMU
  Notification/Recovery Threshold
  Repair Action
  Event Number
  Component ID
Chapter 6 Last Failure Codes
  Last Failure Code Structure
  Last Failure Codes and FMU
  Parameter Count
  Restart Code
  Hardware/Software Flag
  Repair Action
  Error Number
  Component ID Code
Glossary
Index

Figures

  Figure 2-1. VTDPY commands and shortcuts generated from the Help command
  Figure 2-2. Sample of the VTDPY default screen
  Figure 2-3. Sample of the VTDPY status screen
  Figure 2-4. Sample of the VTDPY cache screen
  Figure 2-5. Sample of regions on the VTDPY device screen
  Figure 2-6. Sample of the VTDPY host screen
  Figure 2-7. Sample of the VTDPY resource screen
  Figure 2-8. Sample of the VTDPY remote status screen (ACS version 8.6P only)
  Figure 5-1. Structure of an Instance Code
  Figure 6-1. Structure of a Last Failure Code

Tables

  Table 1-1 Troubleshooting Guidelines
  Table 1-2 Flashing OCP Pattern Displays and Repair Actions
  Table 1-3 Solid OCP Pattern Displays and Repair Actions
  Table 1-4 ECB Capacity Based On Memory Size
  Table 1-5 Cache Policies--Cache Module Status
  Table 1-6 Resulting Cache Policies--ECB Status
  Table 2-1 Event Code Types
  Table 2-2 FMU SET Commands
  Table 2-3 VTDPY Key Sequences and Commands
  Table 2-4 VTDPY--Common Data Fields Column Definitions: Part 1
  Table 2-5 VTDPY--Common Data Fields Column Definitions: Part 2
  Table 2-6 VTDPY--Unit Performance Data Fields Column Definitions
  Table 2-7 VTDPY--Device Performance Data Fields Column Definitions
  Table 2-8 VTDPY--Device Port Performance Data Fields Column Definitions
  Table 2-9 Fibre Channel Host Status Display--Known Host Connections
  Table 2-10 Fibre Channel Host Status Display--Port Status
  Table 2-11 Fibre Channel Host Status Display--Link Error Counters
  Table 2-12 First Digit on the TACHYON Chip
  Table 2-13 Second Digit on the TACHYON Chip
  Table 2-14 Remote Display Column Definitions--ACS Version 8.6P Only
  Table 2-15 Device Map Column Definitions
  Table 2-16 Controller/Processor Utilization Definitions
  Table 2-17 VTDPY Thread Descriptions
  Table 2-18 Resource Performance Statistics Definitions
  Table 2-19 DILX Control Sequences
  Table 2-20 Data Patterns for Phase 1: Write Test
  Table 2-21 DILX Error Codes
  Table 2-22 HSUTIL Messages and Inquiries
  Table 3-1 Passthrough Device Reset Event Sense Data Response Format
  Table 3-2 Template 01--Last Failure Event Sense Data Response Format
  Table 3-3 Template 04--Multiple-Bus Failover Event Sense Data Response Format
  Table 3-4 Template 05--Failover Event Sense Data Response Format
  Table 3-5 Template 11--Nonvolatile Parameter Memory Component Event Sense Data Response Format
  Table 3-6 Template 12--Backup Battery Failure Event Sense Data Response Format
  Table 3-7 Template 13--Subsystem Built-In Self Test Failure Event Sense Data Response Format
  Table 3-8 Template 14--Memory System Failure Event Sense Data Response Format
  Table 3-9 Template 41--Device Services Non-Transfer Error Event Sense Data Response Format
  Table 3-10 Template 51--Disk Transfer Error Event Sense Data Response Format
  Table 3-11 Template 90--Data Replication Manager Services Event Sense Data Response Format (ACS Version 8.6P Only)
  Table 4-1 ASC and ASCQ Code Descriptions
  Table 4-2 Recommended Repair Action Codes
  Table 4-3 Component ID Codes
  Table 5-1 Instance Code Format
  Table 5-2 Event Notification/Recovery (NR) Threshold Classifications
  Table 5-3 Instance Codes and Repair Action Codes
  Table 6-1 Last Failure Code Format
  Table 6-2 Controller Restart Codes
  Table 6-3 Last Failure Codes and Repair Action Codes

About This Guide

This guide is a troubleshooting resource for HSG80 array controllers running array controller software (ACS) versions 8.6F, 8.6G, 8.6P, and 8.6S.
It contains information on various utilities, software templates, and event reporting codes.

Text Conventions

This document uses the following conventions to distinguish elements of text:

Keys
    Keys appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously.

USER INPUT
    User input appears in a different typeface and in uppercase.

FILENAMES
    File names appear in uppercase italics.

Menu Options, Command Names, Dialog Box Names
    These elements appear in initial capital letters.

COMMANDS, DIRECTORY NAMES, and DRIVE NAMES
    These elements appear in uppercase.
    NOTE: UNIX commands are case sensitive and will not appear in uppercase.

Type
    When you are instructed to type information, type the information without pressing the Enter key.

Enter
    When you are instructed to enter information, type the information and then press the Enter key.

"this controller"
    The controller serving the current CLI session through a local or remote terminal.

"other controller"
    The controller in a dual-redundant pair that is connected to the controller serving the current CLI session.

Symbols in Text

These symbols may be found in the text of this guide. They have the following meanings.

WARNING: Text set off in this manner indicates that failure to follow directions in the warning could result in bodily harm or loss of life.

CAUTION: Text set off in this manner indicates that failure to follow directions could result in damage to equipment or loss of information.

IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.

NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of information.

Symbols on Equipment

These icons may be located on equipment in areas where hazardous conditions may exist.

Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards.
The enclosed area contains no operator serviceable parts.
WARNING: To reduce the risk of injury from electrical shock hazards, do not open this enclosure.

Any RJ-45 receptacle marked with these symbols indicates a Network Interface Connection.
WARNING: To reduce the risk of electrical shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.

Any surface or area of the equipment marked with these symbols indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.

Power Supplies or Systems marked with these symbols indicate the equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electrical shock, remove all power cords to completely disconnect power from the system.

Any product or assembly marked with these symbols indicates that the component exceeds the recommended weight for one individual to handle safely.
WARNING: To reduce the risk of personal injury or damage to the equipment, observe local occupational health and safety requirements and guidelines for manual material handling.

Rack Stability

WARNING: To reduce the risk of personal injury or damage to the equipment, be sure that:
- The leveling jacks are extended to the floor.
- The full weight of the rack rests on the leveling jacks.
- The stabilizing feet are attached to the rack if it is a single rack installation.
- The racks are coupled together in multiple rack installations.
A rack may become unstable if more than one component is extended for any reason. Extend only one component at a time.

Getting Help

If you have a problem and have exhausted the information in this guide, you can get further information and other help in the locations listed in this section.

Compaq Technical Support

You are entitled to free hardware technical telephone support for your product for as long as you own the product. A technical support specialist will help diagnose the problem or guide you to the next step in the warranty process.

In North America, call the Compaq Technical Phone Support Center at 1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week.

NOTE: For continuous quality improvement, calls may be recorded or monitored.

Outside North America, call the nearest Compaq Technical Support Phone Center. Telephone numbers for worldwide Technical Support Centers are listed on the Compaq website. Access the Compaq website by logging on to the Internet at http://www.compaq.com.

Be sure to have the following information available before you call Compaq:
- Technical support registration number (if applicable)
- Product serial numbers
- Product model names and numbers
- Applicable error messages
- Add-on boards or hardware
- Third-party hardware or software
- Operating system type and revision level
- Detailed, specific questions

Compaq Website

The Compaq website has the latest information on this product as well as the latest drivers. You can access the Compaq website by logging on to the Internet at http://www.compaq.com/storage.

Compaq Authorized Reseller

For the name of your nearest Compaq Authorized Reseller:
- In the United States, call 1-800-345-1518.
- In Canada, call 1-800-263-5868.
- Elsewhere, see the Compaq website for locations and telephone numbers.

Chapter 1 Troubleshooting Information

This chapter provides guidelines for troubleshooting the controller, cache module, and external cache battery (ECB).
See enclosure documentation for information on troubleshooting enclosure hardware, such as the power supplies, cooling fans, and environmental monitoring unit (EMU).

Typical Installation Troubleshooting Checklist

The following checklist identifies many of the problems that occur in a typical installation. After identifying a problem, use Table 1-1 to confirm the diagnosis and fix the problem. If an initial diagnosis points to several possible causes, use the tools described in this chapter and then those in Chapter 2 to further refine the diagnosis. If a problem cannot be diagnosed using the checklist and tools, contact a Compaq authorized service provider for additional support.

To troubleshoot the controller and supporting modules:

1. Check the power to the enclosure and enclosure components. Are power cords connected properly? Is power within specifications?

2. Check the component cables. Are bus cables to the controllers connected properly? For BA370 enclosures, are ECB cables connected properly?

3. Check each program card to make sure the card is fully seated.

4. Check the operator control panel (OCP) and devices for LED codes. See "Flashing OCP Pattern Display Reporting" on page 1-11, and "Solid OCP Pattern Display Reporting" on page 1-13, to interpret the LED codes.

5. Connect a local terminal to the controller and check the controller configuration with the following command:

       SHOW THIS_CONTROLLER FULL

   Make sure that the ACS version loaded is correct and that pertinent patches are installed. Also, check the status of the cache module and the supporting ECB.

   In a dual-redundant configuration, check the "other controller" with the following command:

       SHOW OTHER_CONTROLLER FULL

6. Use the fault management utility (FMU) to check for Last Failure or "memory system failure" entries. Show these codes and translate the Last Failure Codes they contain.
See Chapter 2, "Displaying Failure Entries" and "Translating Event Codes" sections. If the controller failed to the extent that the controller cannot support a local terminal for FMU, check the host error log for the Instance or Last Failure Codes. See Chapter 5 and Chapter 6 to interpret the event codes. 7. Check device status with the following command: SHOW DEVICES FULL Look for errors such as "misconfigured device" or "No device at this PTL." If a device reports misconfigured or missing, check the device status with the following command: SHOW device-name 8. Check storageset status with the following command: SHOW STORAGESETS FULL Make sure that all storagesets are normal (or normalizing if the storageset is a RAIDset or mirrorset). Check again for misconfigured or missing devices using step 7. 9. Check unit status with the following command: SHOW UNITS FULL Make sure that all units are available or online. If the controller reports a unit as unavailable or offline, recheck the storageset the unit belongs to with the following command: SHOW storageset-name If the controller reports that a unit has lost data or is unwriteable, recheck the status of the devices that make up the storageset. If the devices are operating normally, recheck the status of the cache module. If the unit reports a media format error, recheck the status of the storageset and storageset devices. 14 HSG80 Array Controller ACS Version 8.6 Troubleshooting Reference Guide Troubleshooting Table After diagnosing a problem, use Table 11 to resolve the problem. Table 11 Troubleshooting Guidelines (Sheet 1 of 6) Symptom Possible Cause Investigation Remedy Reset button not lit. No power to subsystem. Check power to subsystem Replace cord or (BA370 and power supplies on enclosure only) AC input box. controller enclosure. BA370 enclosure only: Turn off power switch on AC Make sure that all cooling input box. Replace cooling fans are installed. If one or fan. 
Restore power to more fans are missing or all subsystem. are inoperative for more than 8 minutes, the EMU shuts down the subsystem. BA370 enclosure only: Press the alarm control Determine if the standby switch on the EMU. power switch on the PVA was pressed for more than 5 seconds. Failed controller. If the previous remedies fail Replace controller. to resolve the problem, check OCP LED codes. Reset button lit steadily; Various. See OCP LED Codes. Follow repair action using other LEDs also lit. Table 12. Device in error or failedset SHOW device FULL. Follow repair action using Reset button FLASHING; other LEDs also lit. on corresponding device Table 13. port with other LEDs lit. Troubleshooting Information 15 Table 11 Troubleshooting Guidelines (Sheet 2 of 6) Symptom Possible Cause Investigation Remedy Use the correct command Incorrect command See the controller CLI Cannot set failover to syntax. syntax. reference guide for the SET create dual-redundant FAILOVER command. configuration. Different software versions Check software versions on Update one or both on controllers. both controllers. controllers so that both are using the same software version. Incompatible hardware. Check hardware versions. Upgrade controllers so that they are using compatible hardware. Controller previously set Make sure that neither Use the SET NOFAILOVER for failover. controller is configured for command on both failover. controllers, then reset "this controller" for failover. Follow repair action using Failed controller. If the previous remedies fail Table 12 or Table 13. to resolve the problem, check for OCP LED codes. Node ID is all zeros. SHOW_THIS to see if node Set node ID using the node ID is all zeros. ID (bar code) that is located on the frame in which the controller sits. See SET THIS_CONTROLLER NODE_ID in the controller CLI reference guide. Also, be sure to copy in the right direction. If cabled to the new controller, use SET FAILOVER COPY= OTHER_CONTROLLER. 
    If cabled to the old controller, use SET FAILOVER COPY=THIS_CONTROLLER.

Symptom: Nonmirrored cache: controller reports failed DIMM in Cache A or B.
  Possible cause: Improperly installed DIMM.
    Investigation: Remove cache module and make sure that the DIMM is fully seated in the slot.
    Remedy: Reseat DIMM.
  Possible cause: Failed DIMM.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM.

Symptom: Mirrored cache: "this controller" reports DIMM 1 or 2 failed in Cache A or B.
  Possible cause: Improperly installed DIMM in "this controller" cache module.
    Investigation: Remove cache module and make sure that DIMMs are installed properly.
    Remedy: Reseat DIMM.
  Possible cause: Failed DIMM in "this controller" cache module.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM in "this controller" cache module.

Symptom: Mirrored cache: "this controller" reports DIMM 3 or 4 failed in Cache A or B.
  Possible cause: Improperly installed DIMM in "other controller" cache module.
    Investigation: Remove cache module and make sure that the DIMMs are installed properly.
    Remedy: Reseat DIMM.
  Possible cause: Failed DIMM in "other controller" cache module.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM in "other controller" cache module.

Symptom: Mirrored cache: controller reports battery not present.
  Possible cause: Memory module was installed before the cache module was connected to an ECB.
    Investigation: BA370 enclosure: ECB cable not connected to cache module. Model 2200 enclosure: ECB not installed or seated properly in backplane.
    Remedy: BA370 enclosure: Connect ECB cable to cache module, then restart both controllers by pushing their reset buttons simultaneously. Model 2200 enclosure: Install or reseat ECB.

Symptom: Mirrored cache: controller reports cache or mirrored cache has failed.
  Possible cause: Primary data and the mirrored copy data are not identical.
    Investigation: SHOW THIS_CONTROLLER indicates that the cache or mirrored cache has failed. Spontaneous FMU message displays: "Primary cache declared failed - data inconsistent with mirror," or "Mirrored cache declared failed - data inconsistent with primary."
    Remedy: Enter the SHUTDOWN command on controllers that report the problem. (This command flushes the cache contents to synchronize the primary and mirrored data.) Restart the controllers that were shut down.

Symptom: Invalid cache.
  Possible cause: Mirrored-cache mode discrepancy. This discrepancy might occur after installing a new controller. The existing cache module is set for mirrored caching, but the new controller is set for unmirrored caching. This discrepancy might also occur if the new controller is set for mirrored caching, but the existing cache module is not.
    Investigation: SHOW THIS_CONTROLLER indicates "invalid cache." Spontaneous FMU message displays: "Cache modules inconsistent with mirror mode."
    Remedy: Connect a terminal to the maintenance port on the controller reporting the error and clear the error with the following command (all on one line):

        CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE NODESTROY_UNFLUSHED_DATA

      See the controller CLI reference guide for more information.
  Possible cause: Cache module might erroneously contain unflushed write-back data. This might occur after installing a new controller. The existing cache module might indicate that the cache module contains unflushed write-back data, but the new controller expects to find no data in the existing cache module. This error might also occur if installing a new cache module for a controller that expects write-back data in the cache.
    Investigation: SHOW THIS_CONTROLLER indicates "invalid cache." No spontaneous FMU message.
    Remedy: Connect a terminal to the maintenance port on the controller reporting the error, and clear the error with the following command (all on one line):

        CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE DESTROY_UNFLUSHED_DATA

      See the controller CLI reference guide for more information.

Symptom: Cannot add device.
  Possible cause: Illegal device.
    Investigation: See product-specific release notes that accompanied the software release for the most recent list of supported devices.
    Remedy: Replace device.
  Possible cause: Device not properly installed in enclosure.
    Investigation: Check that the device is fully seated.
    Remedy: Firmly press the device into the bay.
  Possible cause: Failed device.
    Investigation: Check for presence of device LEDs.
    Remedy: Follow repair action in the documentation provided with the enclosure or device.
  Possible cause: Failed power supplies.
    Investigation: Check for presence of power supply LEDs.
    Remedy: Follow repair action in the documentation provided with the enclosure or power supply.
  Possible cause: Failed bus to device.
    Investigation: If the previous remedies fail to resolve the problem, check for OCP LED codes.
    Remedy: Replace enclosure.

Symptom: Cannot configure storagesets.
  Possible cause: Incorrect command syntax.
    Investigation: See the controller CLI reference guide for the ADD storageset command.
    Remedy: Reconfigure storageset with correct command syntax.
  Possible cause: Exceeded maximum number of storagesets.
    Investigation: Use the SHOW command to count the number of storagesets configured on the controller.
    Remedy: Delete unused storagesets.
  Possible cause: Failed battery on ECB. An ECB or uninterruptible power supply (UPS) is required for RAIDsets and mirrorsets.
    Investigation: Use the SHOW command to check the ECB battery status.
    Remedy: Replace the ECB if required.

Symptom: Cannot assign unit number to storageset.
  Possible cause: Incorrect command syntax.
    Investigation: See the controller CLI reference guide for correct syntax.
    Remedy: Reassign the unit number with the correct syntax.

Symptom: Unit is available but not online.
  Possible cause: This is normal. Units are "available" until the host accesses them, at which point their status is changed to "online."
    Investigation: None.
    Remedy: None.

Symptom: Host cannot see device.
  Possible cause: Broken cables.
    Investigation: Check for broken cables.
    Remedy: Replace broken cables.

Symptom: Host cannot access unit.
  Possible cause: Host files or device drivers not properly installed or configured.
    Investigation: Check for the required device special files.
    Remedy: Configure device special files as described in the installation and configuration guide that accompanied the software release.
  Possible cause: Invalid cache.
    Investigation: See the description for the invalid cache symptom.
    Remedy: See the description for the invalid cache symptom in this table.
  Possible cause: Units have lost data.
    Investigation: Issue the SHOW UNITS FULL command.
    Remedy: Clear these units with: CLEAR_ERRORS unit-number LOST_DATA.

Symptom: Host log file or maintenance terminal indicates that a forced error occurred when the controller was reconstructing a RAIDset or mirrorset.
  Possible cause: Unrecoverable read errors might have occurred when the controller was reconstructing the storageset. Errors occur if another member fails while the controller is reconstructing the storageset.
    Investigation: Conduct a read scan of the storageset using the appropriate utility from the host operating system, such as the "dd" utility for a Tru64 UNIX host.
    Remedy: Rebuild the storageset, then restore storageset data from a backup source. While the controller is reconstructing the storageset, monitor the host error log activity or spontaneous event reports on the maintenance terminal for any unrecoverable errors. If unrecoverable errors persist, note the device on which they occurred, and replace the device before proceeding.
  Possible cause: Host requested data from a normalizing storageset that did not contain the data.
    Investigation: Use the SHOW storageset-name command to see if all storageset members are "normal."
    Remedy: Wait for normalizing members to become normal, then resume I/O to them.

Significant Event Reporting

Controller fault management software reports information about significant events that occur.
These events are reported by:

- Maintenance terminal displays
- Host error logs
- OCP LEDs

Some events cause controller operation to halt; others allow the controller to remain operable. Both types of events are detailed in the following sections.

Reporting Events That Cause Controller Operation to Halt

Events that cause the controller to halt operations are reported in three possible ways:

- a FLASHING OCP pattern display
- a SOLID OCP pattern display
- Last Failure reporting

Use Table 1-2 to interpret FLASHING OCP patterns and Table 1-3 to interpret SOLID (ON) OCP patterns. In the Error column of the solid OCP patterns, there are two separate descriptions. The first denotes the actual error message that appears on the terminal, and the second provides a more detailed explanation of the designated error.

Use the following legend to interpret both tables as indicated:

Legend: s = reset button FLASHING (in Table 1-2) or ON (in Table 1-3); q = LED FLASHING (in Table 1-2) or ON (in Table 1-3). The corresponding unfilled symbols indicate reset button OFF and LED OFF.

NOTE: If the reset button is FLASHING and an LED is ON, either the devices on the bus that corresponds to the LED do not match the controller configuration, or an error occurred in one of the devices on that bus. Also, a single LED that is turned ON indicates a failure of the drive on that bus.

Flashing OCP Pattern Display Reporting

Certain events can cause a FLASHING display of the OCP LEDs. Each event and the resulting pattern are described in Table 1-2.

IMPORTANT: Remember that a solid black pattern represents a FLASHING display. A white pattern indicates OFF. All LEDs FLASH at the same time and at the same rate.

Table 1-2 FLASHING OCP Pattern Displays and Repair Actions

OCP Code 1
  Error: Program card EDC error.
  Repair action: Replace program card.
OCP Code 4
  Error: Timer zero on the processor is bad.
  Repair action: Replace controller.
OCP Code 5
  Error: Timer one on the processor is bad.
  Repair action: Replace controller.
OCP Code 6
  Error: Processor Guarded Memory Unit (GMU) is bad.
  Repair action: Replace controller.
OCP Code B
  Error: Nonvolatile Journal Memory (JSRAM) structure is bad because of a memory error or an incorrect upgrade procedure.
  Repair action: Verify the correct upgrade (see the controller release notes and cover letters, if available). If error continues, replace controller.
OCP Code D
  Error: One or more bits in the diagnostic registers did not match the expected reset value.
  Repair action: Press the reset button to restart the controller. If this does not correct the error, replace the controller.
OCP Code E
  Error: Memory error in the JSRAM.
  Repair action: Replace controller.
OCP Code F
  Error: Wrong image found on program card.
  Repair action: Replace program card or replace controller if needed.
OCP Code 10
  Error: Controller Module memory is bad.
  Repair action: Replace controller.
OCP Code 12
  Error: Controller Module memory addressing is malfunctioning.
  Repair action: Replace controller.
OCP Code 13
  Error: Controller Module memory parity is not working.
  Repair action: Replace controller.
OCP Code 14
  Error: Controller Module memory controller timer has failed.
  Repair action: Replace controller.
OCP Code 15
  Error: The Controller Module memory controller interrupt handler has failed.
  Repair action: Replace controller.
OCP Code 1E
  Error: During the diagnostic memory test, the Controller Module memory controller caused an unexpected Non-Maskable Interrupt (NMI).
  Repair action: Replace controller.
OCP Code 24
  Error: The card code image changed when the contents were copied to memory.
  Repair action: Replace controller.
OCP Code 30
  Error: The JSRAM battery is bad.
  Repair action: Replace controller.
OCP Code 32
  Error: First-half diagnostics of the Time of Year Clock failed.
  Repair action: Replace controller.
OCP Code 33
  Error: Second-half diagnostics of the Time of Year Clock failed.
  Repair action: Replace controller.
OCP Code 35
  Error: The processor bus-to-device bus bridge chip is bad.
  Repair action: Replace controller.
OCP Code 3B
  Error: An unnecessary interrupt pending.
  Repair action: Replace controller.
OCP Code 3C
  Error: An unexpected fault during initialization.
  Repair action: Replace controller.
OCP Code 3D
  Error: An unexpected maskable interrupt during initialization.
  Repair action: Replace controller.
OCP Code 3E
  Error: An unexpected NMI during initialization.
  Repair action: Replace controller.
OCP Code 3F
  Error: An invalid process ran during initialization.
  Repair action: Replace controller.

Solid OCP Pattern Display Reporting

Certain events cause the OCP LEDs to display ON or SOLID. Each event and the resulting pattern are described in Table 1-3.

Information related to the solid OCP patterns is automatically displayed on the maintenance terminal (unless disabled with the FMU) using %FLL formatting, as detailed in the following examples:

    %FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38
    Controller operation terminated.

    %FLL--HSG> --13-MAY-2001 04:32:26 (time not set)-- OCP Code: 26
    Memory module is missing.

Table 1-3 Solid OCP Pattern Displays and Repair Actions

OCP Code 0 (reset button OFF)
  Error: Catastrophic controller or power failure.
  Repair action: Check power. If good, reset controller. If problem persists, reseat controller module and reset controller. If problem is still evident, replace controller module.
OCP Code 0 (reset button ON)
  Error: No program card detected or kill asserted by other controller.
  Detail: Controller unable to read program card.
  Repair action: Make sure that the program card is properly seated while resetting the controller. If the error persists, try the card with another controller; or replace the card. Otherwise, replace the controller that reported the error.
OCP Code 25
  Error: Recursive Bugcheck detected.
  Detail: The same bugcheck has occurred three times within 10 minutes, and controller operation has halted.
  Repair action: Reset the controller. If this fault pattern is displayed repeatedly, follow the repair actions associated with the Last Failure code that is repeatedly terminating controller execution.
OCP Code 26
  Error: Indicated memory module is missing.
  Detail: Controller is unable to detect a particular memory module.
  Repair action: Insert memory module (cache board).
OCP Code 27
  Error: Memory module has insufficient usable memory.
  Repair action: Replace indicated DIMMs.
  Note: This indication is only provided when Fault LED logging is enabled.
OCP Code 28
  Error: An unexpected Machine Fault/NMI occurred during Last Failure processing.
  Detail: A machine fault was detected while a Non-Maskable Interrupt was processing.
  Repair action: Reset the controller.
OCP Code 29
  Error: EMU protocol version incompatible.
  Detail: The microcode in the EMU and the software in the controller are not compatible.
  Repair action: Upgrade either the EMU microcode or the software (refer to the release notes that accompanied the controller software).
OCP Code 2A
  Error: All enclosure I/O modules are not of the same type.
  Detail: Enclosure I/O modules are a combination of single-ended and differential.
  Repair action: Make sure that the I/O modules in an extended subsystem are either all single-ended or all differential, not both.
OCP Code 2B
  Error: Jumpers, not terminators, found on backplane.
  Detail: One or more SCSI bus terminators are either missing from the backplane or broken.
  Repair action: Make sure that enclosure SCSI bus terminators are installed and that no jumpers are installed. Replace the failed terminator if the problem continues.
OCP Code 2C
  Error: Enclosure I/O termination power out of range.
  Detail: Faulty or missing I/O module causes enclosure I/O termination power to be out of range.
  Repair action: Make sure that all of the enclosure device SCSI buses have an I/O module. If problem persists, replace the failed I/O module.
OCP Code 2D
  Error: Master enclosure SCSI buses are not all set to ID 0.
  Repair action: Set the PVA ID to 0 for the enclosure with the controllers. If the problem persists, try the following repair actions:
    1. Replace the PVA module.
    2. Replace the EMU.
    3. Remove all devices.
    4. Replace the enclosure.
OCP Code 2E
  Error: Multiple enclosures have the same SCSI ID.
  Detail: More than one enclosure has the same SCSI ID.
  Repair action: Reconfigure the PVA ID to uniquely identify each enclosure in the subsystem. The enclosure with the controllers must be set to PVA ID 0; additional enclosures must use PVA IDs 2 and 3. If the error continues after PVA settings are unique, replace each PVA module one at a time. Check the enclosure if the problem remains.
OCP Code 2F
  Error: Memory module has illegal DIMM configuration.
  Repair action: Verify that DIMMs are installed correctly.
OCP Code 30
  Error: An unexpected bugcheck occurred before subsystem initialization completed.
  Detail: An unexpected Last Failure occurred during initialization.
  Repair action: Reinsert controller. If that does not correct the problem, reset the controller. If the error persists, try resetting the controller again, and replace the controller if no change occurs.
OCP Code 31
  Error: ILF$INIT unable to allocate memory.
  Detail: Attempt to allocate memory by ILF$INIT failed.
  Repair action: Replace controller.
OCP Code 32
  Error: Code load program card write failure.
  Detail: Attempt to update program card failed.
  Repair action: Replace program card.
OCP Code 33
  Error: Nonvolatile program memory (NVPM) structure revision too low.
  Detail: NVPM structure revision number is lower than can be handled by the software version attempting to be executed.
  Repair action: Verify that the program card contains the latest software version. If the error persists, replace controller.
OCP Code 35
  Error: An unexpected bugcheck occurred during Last Failure processing.
  Detail: Last Failure processing interrupted by another Last Failure event.
  Repair action: Reset controller.
OCP Code 36
  Error: Hardware-induced controller reset expected and failed.
  Repair action: Replace controller.
OCP Code 37
  Error: Software-induced controller reset expected and failed.
  Repair action: Replace controller.
OCP Code 38
  Error: Controller operation halted.
  Detail: Last Failure event required termination of controller operation, for example: SHUTDOWN via the command line interpreter (CLI).
  Repair action: Reset controller.
OCP Code 39
  Error: NVPM configuration inconsistent.
  Detail: Device configuration within the NVPM is inconsistent.
  Repair action: Replace controller.
OCP Code 3A
  Error: An unexpected NMI occurred during Last Failure processing.
  Detail: Last Failure processing interrupted by a Non-Maskable Interrupt (NMI).
  Repair action: Replace controller.
OCP Code 3B
  Error: NVPM read loop hang.
  Detail: Attempt to read data from NVPM failed.
  Repair action: Replace controller.
OCP Code 3C
  Error: NVPM write loop hang.
  Detail: Attempt to write data to NVPM failed.
  Repair action: Replace controller.
OCP Code 3D
  Error: NVPM structure revision higher than image.
  Detail: NVPM structure revision number is higher than the one that can be handled by the software version attempting to execute.
  Repair action: Replace program card with one that contains the latest software version.
OCP Code 3F
  Error: DAEMON diagnostic failed hard in non-fault tolerant mode.
  Detail: DAEMON diagnostic detected critical hardware component failure; controller can no longer operate.
  Repair action: Verify that cache module is present. If the error persists, replace controller.
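
When %FLL reports are captured from the maintenance terminal into a log file, looking up the repair action by OCP code can be scripted. The Python sketch below is illustrative only and is not part of ACS or any Compaq tool. It assumes the report text follows the layout of the %FLL examples shown earlier in this section, and the names FLL_PATTERN, SOLID_OCP_TABLE, and parse_fll_report are hypothetical. Only a few entries from the solid OCP pattern table are included; a real lookup would need every row.

```python
import re

# Hypothetical lookup table: a small subset of the solid OCP pattern table,
# keyed by the hexadecimal OCP code. Values are (error, repair action).
SOLID_OCP_TABLE = {
    0x26: ("Indicated memory module is missing.",
           "Insert memory module (cache board)."),
    0x38: ("Controller operation halted.",
           "Reset controller."),
    0x3C: ("NVPM write loop hang.",
           "Replace controller."),
}

# Assumed report layout, based only on the two %FLL examples in this section:
#   %FLL--<prompt>--<DD-MON-YYYY HH:MM:SS> (...)-- OCP Code: <hex> <message>
FLL_PATTERN = re.compile(
    r"%FLL--(?P<prompt>.*?)--"
    r"(?P<timestamp>\d{1,2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2})"
    r"[^-]*--\s*OCP Code:\s*(?P<code>[0-9A-Fa-f]{1,2})\s+(?P<message>.*)",
    re.DOTALL,
)


def parse_fll_report(text):
    """Return a dict describing one %FLL report, or None if text does not match."""
    match = FLL_PATTERN.search(text)
    if match is None:
        return None
    code = int(match.group("code"), 16)  # OCP codes are printed in hexadecimal
    _error, repair = SOLID_OCP_TABLE.get(code, (None, None))
    return {
        "timestamp": match.group("timestamp"),
        "ocp_code": code,
        "message": match.group("message").strip(),
        "repair_action": repair,  # None when the code is not in the subset above
    }


if __name__ == "__main__":
    sample = ("%FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38\n"
              "Controller operation terminated.")
    print(parse_fll_report(sample))
```

OCP code 0 is deliberately left out of the sketch because the table lists two different errors under that code, distinguished only by the state of the reset button, which the report text alone does not convey.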

Last Failure Reporting

Last failures are automatically displayed on the maintenance terminal (unless disabled via the FMU) using %LFL formatting. The example below shows a Last Failure report:

    %LFL--HSG> --13-MAY-2001 04:39:45 (time not set)-- Last Failure Code: 20090010
    Power On Time: 0. Years, 14. Days, 19. Hours, 58. Minutes, 42. Seconds
    Controller Model: HSG80 Serial Number: AA12345678 Hardware Version: 0000(00)
    Software Version: V086P(FF)
    Informational Report
    Instance Code: 0102030A
    Last Failure Code: 20090010 (No Last Failure Parameters)
    Additional information is available in Last Failure Entry: 1.

In addition, Last Failures are reported to the host error log using Template 01, following a