Compaq StorageWorks
HSG80 Array Controller
ACS Version 8.6
Troubleshooting Reference Guide
First Edition (June 2001)
Part Number: EK-G80TR-SA. A01
Compaq Computer Corporation
© 2001 Compaq Computer Corporation.
Compaq, the Compaq logo, and StorageWorks Registered in U.S. Patent and Trademark Office.
OpenVMS is a trademark of Compaq Information Technologies Group, L.P. in the United States and
other countries.
Intel is a trademark of Intel Corporation in the United States and other countries.
UNIX is a trademark of The Open Group in the United States and other countries.
All other product names mentioned herein may be trademarks of their respective companies.
Confidential computer software. Valid license from Compaq required for possession, use or copying.
Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software
Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
Compaq shall not be liable for technical or editorial errors or omissions contained herein. The
information in this document is provided "as is" without warranty of any kind and is subject to change
without notice. The warranties for Compaq products are set forth in the express limited warranty
statements accompanying such products. Nothing herein should be construed as constituting an additional
warranty.
Compaq service tool software, including associated documentation, is the property of and contains
confidential technology of Compaq Computer Corporation. Service customer is hereby licensed to use
the software only for activities directly relating to the delivery of, and only during the term of, the
applicable services delivered by Compaq or its authorized service provider. Customer may not modify or
reverse engineer, remove, or transfer the software or make the software or any resultant diagnosis or
system management data available to other parties without Compaq's or its service provider's consent.
Upon termination of the services, customer will, at Compaq's or its service provider's option, destroy or
return the software and associated documentation in its possession.
Printed in the U.S.A.
Contents

About This Guide
  Text Conventions
  Symbols in Text
  Symbols on Equipment
  Rack Stability
  Getting Help
  Compaq Technical Support
  Compaq Website
  Compaq Authorized Reseller

Chapter 1
Troubleshooting Information
  Typical Installation Troubleshooting Checklist
  Troubleshooting Table
  Significant Event Reporting
  Reporting Events That Cause Controller Operation to Halt
  Flashing OCP Pattern Display Reporting
  Solid OCP Pattern Display Reporting
  Last Failure Reporting
  Reporting Events That Allow Controller Operation to Continue
  Spontaneous Event Log
  CLI Event Reporting
  Running the Controller Diagnostic Test
  ECB Charging Diagnostics
  Battery Hysteresis
  Caching Techniques
  Read Caching
  Read-Ahead Caching
  Write-Through Caching
  Write-Back Caching
  Fault-Tolerance for Write-Back Caching
  Nonvolatile Memory
  Cache Policies Resulting from Cache Module Failures
  Enabling Mirrored Write-Back Cache

Chapter 2
Utilities and Exercisers
  Fault Management Utility (FMU)
  Displaying Failure Entries
  Translating Event Codes
  Controlling the Display of Significant Events and Failures
  Video Terminal Display (VTDPY) Utility
  Restrictions with VTDPY
  Running VTDPY
  VTDPY Help
  VTDPY Display Screens
  Default Screen
  Controller Status Screen
  Cache Performance Screen
  Device Performance Screen
  Host Ports Statistics Screen
  Resource Statistics Screen
  Remote Status Screen
  Interpreting VTDPY Screen Information
  Screen Header
  Common Data Fields
  Unit Performance Data Fields
  Device Performance Data Fields
  Device Port Performance Data Fields
  Host Port Configuration
  TACHYON Chip Status
  Runtime Status of Remote Copy Sets
  Device Port Configuration
  Controller/Processor Utilization
  Resource Performance Statistics
  Disk Inline Exerciser (DILX)
  Checking for Unit Problems
  Finding a Unit in the Subsystem
  Testing the Read Capability of a Unit
  Testing the Read and Write Capabilities of a Unit
  DILX Error Codes
  Format and Device Code Load Utility (HSUTIL)
  Configuration (CONFIG) Utility
  Code Load and Code Patch (CLCP) Utility
  Clone (CLONE) Utility
  Field Replacement Utility (FRUTIL)
  Change Volume Serial Number (CHVSN) Utility

Chapter 3
Event Reporting Templates
  Passthrough Device Reset Event Sense Data Response
  Last Failure Event Sense Data Response (Template 01)
  Multiple-Bus Failover Event Sense Data Response (Template 04)
  Failover Event Sense Data Response (Template 05)
  Nonvolatile Parameter Memory Component Event Sense Data Response (Template 11)
  Backup Battery Failure Event Sense Data Response (Template 12)
  Subsystem Built-In Self Test Failure Event Sense Data Response (Template 13)
  Memory System Failure Event Sense Data Response (Template 14)
  Device Services Nontransfer Error Event Sense Data Response (Template 41)
  Disk Transfer Error Event Sense Data Response (Template 51)
  Data Replication Manager Services Event Sense Response (Template 90)

Chapter 4
ASC/ASCQ, Repair Action, and Component Identifier Codes
  Vendor Specific SCSI ASC/ASCQ Codes
  Recommended Repair Action Codes
  Component ID Codes

Chapter 5
Instance Codes
  Instance Code Structure
  Instance Codes and FMU
  Notification/Recovery Threshold
  Repair Action
  Event Number
  Component ID

Chapter 6
Last Failure Codes
  Last Failure Code Structure
  Last Failure Codes and FMU
  Parameter Count
  Restart Code
  Hardware/Software Flag
  Repair Action
  Error Number
  Component ID Code

Glossary
Index
Figures
Figure 2-1. VTDPY commands and shortcuts generated from the Help command
Figure 2-2. Sample of the VTDPY default screen
Figure 2-3. Sample of the VTDPY status screen
Figure 2-4. Sample of the VTDPY cache screen
Figure 2-5. Sample of regions on the VTDPY device screen
Figure 2-6. Sample of the VTDPY host screen
Figure 2-7. Sample of the VTDPY resource screen
Figure 2-8. Sample of the VTDPY remote status screen (ACS version 8.6P only)
Figure 5-1. Structure of an Instance Code
Figure 6-1. Structure of a Last Failure Code
Tables
Table 1-1 Troubleshooting Guidelines
Table 1-2 Flashing OCP Pattern Displays and Repair Actions
Table 1-3 Solid OCP Pattern Displays and Repair Actions
Table 1-4 ECB Capacity Based On Memory Size
Table 1-5 Cache Policies--Cache Module Status
Table 1-6 Resulting Cache Policies--ECB Status
Table 2-1 Event Code Types
Table 2-2 FMU SET Commands
Table 2-3 VTDPY Key Sequences and Commands
Table 2-4 VTDPY--Common Data Fields Column Definitions: Part 1
Table 2-5 VTDPY--Common Data Fields Column Definitions: Part 2
Table 2-6 VTDPY--Unit Performance Data Fields Column Definitions
Table 2-7 VTDPY--Device Performance Data Fields Column Definitions
Table 2-8 VTDPY--Device Port Performance Data Fields Column Definitions
Table 2-9 Fibre Channel Host Status Display--Known Host Connections
Table 2-10 Fibre Channel Host Status Display--Port Status
Table 2-11 Fibre Channel Host Status Display--Link Error Counters
Table 2-12 First Digit on the TACHYON Chip
Table 2-13 Second Digit on the TACHYON Chip
Table 2-14 Remote Display Column Definitions--ACS Version 8.6P Only
Table 2-15 Device Map Column Definitions
Table 2-16 Controller/Processor Utilization Definitions
Table 2-17 VTDPY Thread Descriptions
Table 2-18 Resource Performance Statistics Definitions
Table 2-19 DILX Control Sequences
Table 2-20 Data Patterns for Phase 1: Write Test
Table 2-21 DILX Error Codes
Table 2-22 HSUTIL Messages and Inquiries
Table 3-1 Passthrough Device Reset Event Sense Data Response Format
Table 3-2 Template 01--Last Failure Event Sense Data Response Format
Table 3-3 Template 04--Multiple-Bus Failover Event Sense Data Response Format
Table 3-4 Template 05--Failover Event Sense Data Response Format
Table 3-5 Template 11--Nonvolatile Parameter Memory Component Event Sense Data Response Format
Table 3-6 Template 12--Backup Battery Failure Event Sense Data Response Format
Table 3-7 Template 13--Subsystem Built-In Self Test Failure Event Sense Data Response Format
Table 3-8 Template 14--Memory System Failure Event Sense Data Response Format
Table 3-9 Template 41--Device Services Non-Transfer Error Event Sense Data Response Format
Table 3-10 Template 51--Disk Transfer Error Event Sense Data Response Format
Table 3-11 Template 90--Data Replication Manager Services Event Sense Data Response Format (ACS Version 8.6P Only)
Table 4-1 ASC and ASCQ Code Descriptions
Table 4-2 Recommended Repair Action Codes
Table 4-3 Component ID Codes
Table 5-1 Instance Code Format
Table 5-2 Event Notification/Recovery (NR) Threshold Classifications
Table 5-3 Instance Codes and Repair Action Codes
Table 6-1 Last Failure Code Format
Table 6-2 Controller Restart Codes
Table 6-3 Last Failure Codes and Repair Action Codes
About This Guide
This guide is a troubleshooting resource for HSG80 array controllers running array
controller software (ACS) versions 8.6F, 8.6G, 8.6P, and 8.6S. It contains information on
various utilities, software templates, and event reporting codes.
Text Conventions
This document uses the following conventions to distinguish elements of text:
Keys                     Keys appear in boldface. A plus sign (+) between two
                         keys indicates that they should be pressed
                         simultaneously.

USER INPUT               User input appears in a different typeface and in
                         uppercase.

FILENAMES                File names appear in uppercase italics.

Menu Options,            These elements appear in initial capital letters.
Command Names,
Dialog Box Names

COMMANDS,                These elements appear in uppercase.
DIRECTORY NAMES,         NOTE: UNIX commands are case sensitive and will not
and DRIVE NAMES          appear in uppercase.

Type                     When you are instructed to type information, type the
                         information without pressing the Enter key.

Enter                    When you are instructed to enter information, type the
                         information and then press the Enter key.

"this controller"        The controller serving the current CLI session through a
                         local or remote terminal.

"other controller"       The controller in a dual-redundant pair that is connected
                         to the controller serving the current CLI session.
Symbols in Text
These symbols may be found in the text of this guide. They have the following meanings.
WARNING: Text set off in this manner indicates that failure to follow directions in the
warning could result in bodily harm or loss of life.
CAUTION: Text set off in this manner indicates that failure to follow directions could
result in damage to equipment or loss of information.
IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.
NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of
information.
Symbols on Equipment
These icons may be located on equipment in areas where hazardous conditions may exist.
Any surface or area of the equipment marked with these symbols indicates
the presence of electrical shock hazards. The enclosed area contains no
operator serviceable parts.
WARNING: To reduce the risk of injury from electrical shock hazards, do not
open this enclosure.
Any RJ-45 receptacle marked with these symbols indicates a Network
Interface Connection.
WARNING: To reduce the risk of electrical shock, fire, or damage to the
equipment, do not plug telephone or telecommunications connectors into
this receptacle.
Any surface or area of the equipment marked with these symbols indicates
the presence of a hot surface or hot component. If this surface is contacted,
the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the
surface to cool before touching.
Power Supplies or Systems marked with these symbols indicate the
equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electrical shock,
remove all power cords to completely disconnect power from the
system.
Any product or assembly marked with these symbols indicates that the
component exceeds the recommended weight for one individual to handle
safely.
WARNING: To reduce the risk of personal injury or damage to the
equipment, observe local occupational health and safety requirements and
guidelines for manual material handling.
Rack Stability
WARNING: To reduce the risk of personal injury or damage to the equipment, be sure
that:
• The leveling jacks are extended to the floor.
• The full weight of the rack rests on the leveling jacks.
• The stabilizing feet are attached to the rack if it is a single rack installation.
• The racks are coupled together in multiple rack installations.
• A rack may become unstable if more than one component is extended for any
reason. Extend only one component at a time.
Getting Help
If you have a problem and have exhausted the information in this guide, you can get
further information and other help in the locations listed in this section.
Compaq Technical Support
You are entitled to free hardware technical telephone support for your product for as long
as you own the product. A technical support specialist will help diagnose the problem or
guide you to the next step in the warranty process.
In North America, call the Compaq Technical Phone Support Center at
1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week.
NOTE: For continuous quality improvement, calls may be recorded or monitored.
Outside North America, call the nearest Compaq Technical Support Phone Center.
Telephone numbers for worldwide Technical Support Centers are listed on the Compaq
website. Access the Compaq website by logging on to the Internet at
http://www.compaq.com.
Be sure to have the following information available before you call Compaq:
• Technical support registration number (if applicable)
• Product serial numbers
• Product model names and numbers
• Applicable error messages
• Add-on boards or hardware
• Third-party hardware or software
• Operating system type and revision level
• Detailed, specific questions
Compaq Website
The Compaq website has the latest information on this product as well as the latest drivers.
You can access the Compaq website by logging on to the Internet at
http://www.compaq.com/storage.
Compaq Authorized Reseller
For the name of your nearest Compaq Authorized Reseller:
• In the United States, call 1-800-345-1518.
• In Canada, call 1-800-263-5868.
• Elsewhere, see the Compaq website for locations and telephone numbers.
Chapter 1
Troubleshooting Information
This chapter provides guidelines for troubleshooting the controller, cache module, and
external cache battery (ECB). See enclosure documentation for information on
troubleshooting enclosure hardware, such as the power supplies, cooling fans, and
environmental monitoring unit (EMU).
Typical Installation Troubleshooting Checklist
The following checklist identifies many of the problems that occur in a typical installation.
After identifying a problem, use Table 1-1 to confirm the diagnosis and fix the problem.
If an initial diagnosis points to several possible causes, use the tools described in this
chapter and then those in Chapter 2 to further refine the diagnosis. If a problem cannot be
diagnosed using the checklist and tools, contact a Compaq authorized service provider for
additional support.
To troubleshoot the controller and supporting modules:
1. Check the power to the enclosure and enclosure components.
Are power cords connected properly?
Is power within specifications?
2. Check the component cables.
Are bus cables to the controllers connected properly?
For BA370 enclosures, are ECB cables connected properly?
3. Check each program card to make sure the card is fully seated.
4. Check the operator control panel (OCP) and devices for LED codes.
See "Flashing OCP Pattern Display Reporting" on page 1-11 and "Solid OCP Pattern
Display Reporting" on page 1-13 to interpret the LED codes.
5. Connect a local terminal to the controller and check the controller configuration with
the following command:
SHOW THIS_CONTROLLER FULL
Make sure that the ACS version loaded is correct and that pertinent patches are
installed. Also, check the status of the cache module and the supporting ECB.
In a dual-redundant configuration, check the "other controller" with the following
command:
SHOW OTHER_CONTROLLER FULL
6. Use the fault management utility (FMU) to check for Last Failure or "memory system
failure" entries.
Show these entries and translate the Last Failure Codes they contain. See the
"Displaying Failure Entries" and "Translating Event Codes" sections in Chapter 2.
If the controller failed to the extent that the controller cannot support a local terminal
for FMU, check the host error log for the Instance or Last Failure Codes. See
Chapter 5 and Chapter 6 to interpret the event codes.
7. Check device status with the following command:
SHOW DEVICES FULL
Look for errors such as "misconfigured device" or "No device at this PTL." If a device
reports misconfigured or missing, check the device status with the following
command:
SHOW device-name
8. Check storageset status with the following command:
SHOW STORAGESETS FULL
Make sure that all storagesets are normal (or normalizing if the storageset is a RAIDset
or mirrorset). Check again for misconfigured or missing devices using step 7.
9. Check unit status with the following command:
SHOW UNITS FULL
Make sure that all units are available or online. If the controller reports a unit as
unavailable or offline, recheck the storageset the unit belongs to with the following
command:
SHOW storageset-name
If the controller reports that a unit has lost data or is unwriteable, recheck the status of
the devices that make up the storageset. If the devices are operating normally, recheck
the status of the cache module. If the unit reports a media format error, recheck the
status of the storageset and storageset devices.
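The command sequence in steps 5 through 9 can be sketched as a small helper. This is only an illustrative sketch, not a Compaq tool: the function names, the condition strings, and the sample names DISK10000 and R1 are assumptions made for the example. Only the CLI commands themselves come from the checklist. Step 6 (FMU) is omitted because the checklist gives no command text for it.

```python
def checklist_commands(dual_redundant=False, device_names=(), storageset_names=()):
    """Return the CLI commands from checklist steps 5, 7, 8, and 9, in order.

    device_names and storageset_names are optional follow-up targets for the
    per-item SHOW commands; the values used below are hypothetical.
    """
    commands = ["SHOW THIS_CONTROLLER FULL"]              # step 5
    if dual_redundant:
        commands.append("SHOW OTHER_CONTROLLER FULL")     # step 5, dual-redundant only
    commands.append("SHOW DEVICES FULL")                  # step 7
    commands.extend(f"SHOW {name}" for name in device_names)
    commands.append("SHOW STORAGESETS FULL")              # step 8
    commands.append("SHOW UNITS FULL")                    # step 9
    commands.extend(f"SHOW {name}" for name in storageset_names)
    return commands


def unit_follow_up(condition):
    """Map a unit condition named in step 9 to the items to recheck, in order."""
    follow_ups = {
        "unavailable": ["storageset"],
        "offline": ["storageset"],
        "lost data": ["storageset devices", "cache module"],
        "unwriteable": ["storageset devices", "cache module"],
        "media format error": ["storageset", "storageset devices"],
    }
    # Unknown conditions (including healthy ones) need no recheck in this sketch.
    return follow_ups.get(condition.lower(), [])


if __name__ == "__main__":
    # DISK10000 and R1 are placeholder names used only for illustration.
    for command in checklist_commands(True, ["DISK10000"], ["R1"]):
        print(command)
    print(unit_follow_up("media format error"))
```

The helper only builds the command text. Entering the commands at a local terminal and reading the results remain manual steps.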
Troubleshooting Table
After diagnosing a problem, use Table 1-1 to resolve the problem.
Table 1-1 Troubleshooting Guidelines (Sheet 1 of 6)

Symptom: Reset button not lit.
  Possible Cause: No power to subsystem.
    Investigation: Check power to subsystem and power supplies on controller enclosure.
    Remedy: Replace cord or (BA370 enclosure only) AC input box.
    Investigation: BA370 enclosure only: Make sure that all cooling fans are installed. If one or more fans are missing or all are inoperative for more than 8 minutes, the EMU shuts down the subsystem.
    Remedy: Turn off power switch on AC input box. Replace cooling fan. Restore power to subsystem.
    Investigation: BA370 enclosure only: Determine if the standby power switch on the PVA was pressed for more than 5 seconds.
    Remedy: Press the alarm control switch on the EMU.
  Possible Cause: Failed controller.
    Investigation: If the previous remedies fail to resolve the problem, check OCP LED codes.
    Remedy: Replace controller.

Symptom: Reset button lit steadily; other LEDs also lit.
  Possible Cause: Various.
    Investigation: See OCP LED Codes.
    Remedy: Follow repair action using Table 1-2.

Symptom: Reset button FLASHING; other LEDs also lit.
  Possible Cause: Device in error or failedset on corresponding device port with other LEDs lit.
    Investigation: SHOW device FULL.
    Remedy: Follow repair action using Table 1-3.
Troubleshooting Information 15
Table 11 Troubleshooting Guidelines (Sheet 2 of 6)
Symptom: Cannot set failover to create dual-redundant configuration.
  Possible Cause: Incorrect command syntax.
    Investigation: See the controller CLI reference guide for the SET FAILOVER command.
    Remedy: Use the correct command syntax.
  Possible Cause: Different software versions on controllers.
    Investigation: Check software versions on both controllers.
    Remedy: Update one or both controllers so that both are using the same software version.
  Possible Cause: Incompatible hardware.
    Investigation: Check hardware versions.
    Remedy: Upgrade controllers so that they are using compatible hardware.
  Possible Cause: Controller previously set for failover.
    Investigation: Make sure that neither controller is configured for failover.
    Remedy: Use the SET NOFAILOVER command on both controllers, then reset "this controller" for failover.
  Possible Cause: Failed controller.
    Investigation: If the previous remedies fail to resolve the problem, check for OCP LED codes.
    Remedy: Follow repair action using Table 12 or Table 13.
  Possible Cause: Node ID is all zeros.
    Investigation: Enter SHOW THIS_CONTROLLER to see if the node ID is all zeros.
    Remedy: Set node ID using the node ID (bar code) that is located on the frame in which the controller sits. See SET THIS_CONTROLLER NODE_ID in the controller CLI reference guide. Also, be sure to copy in the right direction. If cabled to the new controller, use SET FAILOVER COPY=OTHER_CONTROLLER. If cabled to old controller, use SET FAILOVER COPY=THIS_CONTROLLER.
Table 11 Troubleshooting Guidelines (Sheet 3 of 6)
Symptom: Nonmirrored cache: controller reports failed DIMM in Cache A or B.
  Possible Cause: Improperly installed DIMM.
    Investigation: Remove cache module and make sure that the DIMM is fully seated in the slot.
    Remedy: Reseat DIMM.
  Possible Cause: Failed DIMM.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM.

Symptom: Mirrored cache: "this controller" reports DIMM 1 or 2 failed in Cache A or B.
  Possible Cause: Improperly installed DIMM in "this controller" cache module.
    Investigation: Remove cache module and make sure that DIMMs are installed properly.
    Remedy: Reseat DIMM.
  Possible Cause: Failed DIMM in "this controller" cache module.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM in "this controller" cache module.

Symptom: Mirrored cache: "this controller" reports DIMM 3 or 4 failed in Cache A or B.
  Possible Cause: Improperly installed DIMM in "other controller" cache module.
    Investigation: Remove cache module and make sure that the DIMMs are installed properly.
    Remedy: Reseat DIMM.
  Possible Cause: Failed DIMM in "other controller" cache module.
    Investigation: If the previous remedy fails to resolve the problem, check for OCP LED codes.
    Remedy: Replace DIMM in "other controller" cache module.

Symptom: Mirrored cache: controller reports battery not present.
  Possible Cause: Memory module was installed before the cache module was connected to an ECB.
    Investigation: BA370 enclosure: ECB cable not connected to cache module. Model 2200 enclosure: ECB not installed or seated properly in backplane.
    Remedy: BA370 enclosure: Connect ECB cable to cache module, then restart both controllers by pushing their reset buttons simultaneously. Model 2200 enclosure: install or reseat ECB.

Symptom: Mirrored cache: controller reports cache or mirrored cache has failed.
  Possible Cause: Primary data and the mirrored copy data are not identical.
    Investigation: SHOW THIS_CONTROLLER indicates that the cache or mirrored cache has failed. Spontaneous FMU message displays: "Primary cache declared failed - data inconsistent with mirror," or "Mirrored cache declared failed - data inconsistent with primary."
    Remedy: Enter the SHUTDOWN command on controllers that report the problem. (This command flushes the cache contents to synchronize the primary and mirrored data.) Restart the controllers that were shut down.
Table 11 Troubleshooting Guidelines (Sheet 4 of 6)
Symptom: Invalid cache.
  Possible Cause: Mirrored-cache mode discrepancy. This discrepancy might occur after installing a new controller. The existing cache module is set for mirrored caching, but the new controller is set for unmirrored caching. This discrepancy might also occur if the new controller is set for mirrored caching, but the existing cache module is not.
    Investigation: SHOW THIS_CONTROLLER indicates "invalid cache." Spontaneous FMU message displays: "Cache modules inconsistent with mirror mode."
    Remedy: Connect a terminal to the maintenance port on the controller reporting the error and clear the error with the following command--all on one line: CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE NODESTROY_UNFLUSHED_DATA. See the controller CLI reference guide for more information.
  Possible Cause: Cache module might erroneously contain unflushed write-back data. This might occur after installing a new controller. The existing cache module might indicate that the cache module contains unflushed write-back data, but the new controller expects to find no data in the existing cache module. This error might also occur if installing a new cache module for a controller that expects write-back data in the cache.
    Investigation: SHOW THIS_CONTROLLER indicates "invalid cache." No spontaneous FMU message.
    Remedy: Connect a terminal to the maintenance port on the controller reporting the error, and clear the error with the following command--all on one line: CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE DESTROY_UNFLUSHED_DATA. See the controller CLI reference guide for more information.
Table 11 Troubleshooting Guidelines (Sheet 5 of 6)
Symptom: Cannot add device.
  Possible Cause: Illegal device.
    Investigation: See product-specific release notes that accompanied the software release for the most recent list of supported devices.
    Remedy: Replace device.
  Possible Cause: Device not properly installed in enclosure.
    Investigation: Check that the device is fully seated.
    Remedy: Firmly press the device into the bay.
  Possible Cause: Failed device.
    Investigation: Check for presence of device LEDs.
    Remedy: Follow repair action in the documentation provided with the enclosure or device.
  Possible Cause: Failed power supplies.
    Investigation: Check for presence of power supply LEDs.
    Remedy: Follow repair action in the documentation provided with the enclosure or power supply.
  Possible Cause: Failed bus to device.
    Investigation: If the previous remedies fail to resolve the problem, check for OCP LED codes.
    Remedy: Replace enclosure.

Symptom: Cannot configure storagesets.
  Possible Cause: Incorrect command syntax.
    Investigation: See the controller CLI reference guide for the ADD storageset command.
    Remedy: Reconfigure storageset with correct command syntax.
  Possible Cause: Exceeded maximum number of storagesets.
    Investigation: Use the SHOW command to count the number of storagesets configured on the controller.
    Remedy: Delete unused storagesets.
  Possible Cause: Failed battery on ECB. An ECB or uninterruptible power supply (UPS) is required for RAIDsets and mirrorsets.
    Investigation: Use the SHOW command to check the ECB battery status.
    Remedy: Replace the ECB if required.

Symptom: Cannot assign unit number to storageset.
  Possible Cause: Incorrect command syntax.
    Investigation: See the controller CLI reference guide for correct syntax.
    Remedy: Reassign the unit number with the correct syntax.
Table 11 Troubleshooting Guidelines (Sheet 6 of 6)
Symptom: Unit is available but not online.
  Possible Cause: This is normal. Units are "available" until the host accesses them, at which point their status is changed to "online."
    Investigation: None
    Remedy: None

Symptom: Host cannot see device.
  Possible Cause: Broken cables.
    Investigation: Check for broken cables.
    Remedy: Replace broken cables.

Symptom: Host cannot access unit.
  Possible Cause: Host files or device drivers not properly installed or configured.
    Investigation: Check for the required device special files.
    Remedy: Configure device special files as described in the installation and configuration guide that accompanied the software release.
  Possible Cause: Invalid Cache
    Investigation: See the description for the invalid cache symptom in Table 11 (Sheet 4 of 6).
    Remedy: See the description for the invalid cache symptom.
  Possible Cause: Units have lost data.
    Investigation: Issue the SHOW UNITS FULL command.
    Remedy: Clear these units with: CLEAR_ERRORS unit-number LOST_DATA.

Symptom: Host log file or maintenance terminal indicates that a forced error occurred when the controller was reconstructing a RAIDset or mirrorset.
  Possible Cause: Unrecoverable read errors might have occurred when the controller was reconstructing the storageset. Errors occur if another member fails while the controller is reconstructing the storageset.
    Investigation: Conduct a read scan of the storageset using the appropriate utility from the host operating system, such as the "dd" utility for a Tru64 UNIX host.
    Remedy: Rebuild the storageset, then restore storageset data from a backup source. While the controller is reconstructing the storageset, monitor the host error log activity or spontaneous event reports on the maintenance terminal for any unrecoverable errors. If unrecoverable errors persist, note the device on which they occurred, and replace the device before proceeding.
  Possible Cause: Host requested data from a normalizing storageset that did not contain the data.
    Investigation: Use the SHOW storageset-name command to see if all storageset members are "normal."
    Remedy: Wait for normalizing members to become normal, then resume I/O to them.
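The two "invalid cache" entries in Table 11 differ only in whether the spontaneous FMU message "Cache modules inconsistent with mirror mode" was displayed, and each is paired with a different CLEAR_ERRORS option. The sketch below captures that pairing. The function and parameter names are hypothetical; the command strings are the ones given in the table.

```python
def clear_invalid_cache_command(fmu_mirror_mode_message_seen):
    """Return the CLEAR_ERRORS command that Table 11 pairs with each case.

    fmu_mirror_mode_message_seen: True if the spontaneous FMU message
    "Cache modules inconsistent with mirror mode" was displayed
    (mirrored-cache mode discrepancy); False if no spontaneous FMU
    message appeared (cache might erroneously contain unflushed data).
    """
    if fmu_mirror_mode_message_seen:
        option = "NODESTROY_UNFLUSHED_DATA"
    else:
        option = "DESTROY_UNFLUSHED_DATA"
    # The table requires the whole command to be entered on one line.
    return "CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE " + option


if __name__ == "__main__":
    print(clear_invalid_cache_command(True))
```

This only selects the text of the command. Confirm the choice against the controller CLI reference guide before entering it at the maintenance terminal.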
Significant Event Reporting
Controller fault management software reports information about significant events that
occur. These events are reported by:
• Maintenance terminal displays
• Host error logs
• OCP LEDs
Some events cause controller operation to halt; others allow the controller to remain
operable. Both types of events are detailed in the following sections.
Reporting Events That Cause Controller Operation to Halt
Events that cause the controller to halt operations are reported in one of three ways:
• a FLASHING OCP pattern display
• a SOLID OCP pattern display
• Last Failure reporting
Use Table 12 to interpret FLASHING OCP patterns and Table 13 to interpret SOLID (ON) OCP
patterns. In the Error column of the solid OCP patterns, there are two separate
descriptions. The first denotes the actual error message that appears on the terminal, and
the second provides a more detailed explanation of the designated error.
Use the following legend to interpret both tables as indicated:
s = reset button FLASHING (in Table 12) or ON (in Table 13)
q = LED FLASHING (in Table 12) or ON (in Table 13)
(blank) = reset button OFF or LED OFF
NOTE: If the reset button is FLASHING and an LED is ON, either the devices on the bus that
corresponds to the LED do not match the controller configuration, or an error occurred in one of
the devices on that bus.
Also, a single LED that is turned ON indicates a failure of the drive on that bus.
Flashing OCP Pattern Display Reporting
Certain events can cause a FLASHING display of the OCP LEDs. Each event and the resulting
pattern are described in Table 12.
IMPORTANT: Remember that a solid black pattern represents a FLASHING display. A white
pattern indicates OFF.
All LEDs FLASH at the same time and at the same rate.
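In Table 12 and Table 13, the number of lit LEDs shown for each pattern equals the number of 1 bits in the hexadecimal OCP code (for example, code B is binary 001011 and its pattern shows three LEDs). The sketch below decodes a code into six LED states on that basis. The function name is hypothetical, and the bit order (most significant bit listed first) is an assumption made for this example; check it against the printed patterns before relying on it.

```python
def ocp_led_states(code, leds=6):
    """Decode a hexadecimal OCP code into per-LED states.

    Returns a list of booleans, True meaning the LED is lit (FLASHING in
    Table 12, ON in Table 13). Assumption: the most significant of the
    six bits is listed first.
    """
    value = int(code, 16)
    if not 0 <= value < (1 << leds):
        raise ValueError("OCP code %r does not fit in %d LEDs" % (code, leds))
    return [bool((value >> bit) & 1) for bit in range(leds - 1, -1, -1)]


if __name__ == "__main__":
    states = ocp_led_states("B")
    print(states, "lit LEDs:", sum(states))
```

The count of lit LEDs does not depend on the bit-order assumption, so it can be compared directly with the number of LED symbols in a table row.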
Table 12 FLASHING OCP Pattern Displays and Repair Actions
Pattern       OCP Code  Error and Repair Action
sq            1         Error: Program card EDC error.
                        Repair Action: Replace program card.
sq            4         Error: Timer zero on the processor is bad.
                        Repair Action: Replace controller.
sq q          5         Error: Timer one on the processor is bad.
                        Repair Action: Replace controller.
sq q          6         Error: Processor Guarded Memory Unit (GMU) is bad.
                        Repair Action: Replace controller.
sq q q        B         Error: Nonvolatile Journal Memory (JSRAM) structure is bad because of a memory error or an incorrect upgrade procedure.
                        Repair Action: Verify the correct upgrade (see the controller release notes and cover letters, if available). If error continues, replace controller.
sq q q        D         Error: One or more bits in the diagnostic registers did not match the expected reset value.
                        Repair Action: Press the reset button to restart the controller. If this does not correct the error, replace the controller.
sq q q        E         Error: Memory error in the JSRAM.
                        Repair Action: Replace controller.
sq q q q      F         Error: Wrong image found on program card.
                        Repair Action: Replace program card or replace controller if needed.
sq            10        Error: Controller Module memory is bad.
                        Repair Action: Replace controller.
sq q          12        Error: Controller Module memory addressing is malfunctioning.
                        Repair Action: Replace controller.
sq q q        13        Error: Controller Module memory parity is not working.
                        Repair Action: Replace controller.
sq q          14        Error: Controller Module memory controller timer has failed.
                        Repair Action: Replace controller.
Legend:
s = reset button FLASHING; q = LED FLASHING
(blank) = reset button OFF or LED OFF
Table 12 FLASHING OCP Pattern Displays and Repair Actions (Continued)
Pattern       OCP Code  Error and Repair Action
sq q q        15        Error: The Controller Module memory controller interrupt handler has failed.
                        Repair Action: Replace controller.
sq q q q      1E        Error: During the diagnostic memory test, the Controller Module memory controller caused an unexpected Non-Maskable Interrupt (NMI).
                        Repair Action: Replace controller.
sq q          24        Error: The card code image changed when the contents were copied to memory.
                        Repair Action: Replace controller.
sq q          30        Error: The JSRAM battery is bad.
                        Repair Action: Replace controller.
sq q q        32        Error: First-half diagnostics of the Time of Year Clock failed.
                        Repair Action: Replace controller.
sq q q q      33        Error: Second-half diagnostics of the Time of Year Clock failed.
                        Repair Action: Replace controller.
sq q q q      35        Error: The processor bus-to-device bus bridge chip is bad.
                        Repair Action: Replace controller.
sq q q q q    3B        Error: An unnecessary interrupt pending.
                        Repair Action: Replace controller.
sq q q q      3C        Error: An unexpected fault during initialization.
                        Repair Action: Replace controller.
sq q q q q    3D        Error: An unexpected maskable interrupt during initialization.
                        Repair Action: Replace controller.
sq q q q q    3E        Error: An unexpected NMI during initialization.
                        Repair Action: Replace controller.
sq q q q q q  3F        Error: An invalid process ran during initialization.
                        Repair Action: Replace controller.
Solid OCP Pattern Display Reporting
Certain events cause the OCP LEDs to display ON or SOLID. Each event and the resulting
pattern are described in Table 13.
Information related to the solid OCP patterns is automatically displayed on the
maintenance terminal (unless disabled with the FMU) using %FLL formatting, as detailed
in the following examples:
%FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38
Controller operation terminated.
%FLL--HSG> --13-MAY-2001 04:32:26 (time not set)-- OCP Code: 26
Memory module is missing.
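When maintenance terminal output is captured to a log, the first line of each %FLL report can be picked apart with a regular expression. The sketch below is based only on the two example lines above; the function name is hypothetical, and other report layouts are not covered.

```python
import re
from datetime import datetime

# Pattern derived from the two %FLL example lines; the text between
# "%FLL--" and the date (the prompt) is skipped rather than interpreted.
FLL_RE = re.compile(
    r"%FLL--.*?--(?P<stamp>\d{1,2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})"
    r".*?--\s*OCP Code:\s*(?P<code>[0-9A-Fa-f]+)"
)


def parse_fll(line):
    """Return (timestamp, ocp_code) for a %FLL line, or None if no match.

    The timestamp is whatever the controller printed; "(time not set)"
    in the report means it may not reflect the real date and time.
    """
    match = FLL_RE.search(line)
    if match is None:
        return None
    # Normalize the month to "May" style before parsing with %b.
    stamp = datetime.strptime(match.group("stamp").title(), "%d-%b-%Y %H:%M:%S")
    return stamp, match.group("code").upper()


if __name__ == "__main__":
    sample = "%FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38"
    print(parse_fll(sample))
```

The returned code can then be looked up in Table 13. The descriptive second line of the report (for example, "Controller operation terminated.") is not parsed here.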
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 1 of 5)
Pattern       OCP Code  Error and Repair Action
(all OFF)     0         Error: Catastrophic controller or power failure.
                        Repair Action: Check power. If good, reset controller. If problem persists, reseat controller module and reset controller. If problem is still evident, replace controller module.
s             0         Error: No program card detected or kill asserted by other controller.
                        Detail: Controller unable to read program card.
                        Repair Action: Make sure that the program card is properly seated while resetting the controller. If the error persists, try the card with another controller; or replace the card. Otherwise, replace the controller that reported the error.
sq q q        25        Error: Recursive Bugcheck detected.
                        Detail: The same bugcheck has occurred three times within 10 minutes, and controller operation has halted.
                        Repair Action: Reset the controller. If this fault pattern is displayed repeatedly, follow the repair actions associated with the Last Failure code that is repeatedly terminating controller execution.
sq q q        26        Error: Indicated memory module is missing.
                        Detail: Controller is unable to detect a particular memory module.
                        Repair Action: Insert memory module (cache board).
Legend:
s = reset button ON; q = LED ON
(blank) = reset button OFF or LED OFF
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 2 of 5)
Pattern       OCP Code  Error and Repair Action
sq q q q      27        Error: Memory module has insufficient usable memory.
                        Repair Action: Replace indicated DIMMs. This indication is only provided when Fault LED logging is enabled.
sq q          28        Error: An unexpected Machine Fault/NMI occurred during Last Failure processing.
                        Detail: A machine fault was detected while a Non-Maskable Interrupt was processing.
                        Repair Action: Reset the controller.
sq q q        29        Error: EMU protocol version incompatible.
                        Detail: The microcode in the EMU and the software in the controller are not compatible.
                        Repair Action: Upgrade either the EMU microcode or the software (refer to the release notes that accompanied the controller software).
sq q q        2A        Error: All enclosure I/O modules are not of the same type.
                        Detail: Enclosure I/O modules are a combination of single-ended and differential.
                        Repair Action: Make sure that the I/O modules in an extended subsystem are either all single-ended or all differential, not both.
sq q q q      2B        Error: Jumpers, not terminators, found on backplane.
                        Detail: One or more SCSI bus terminators are either missing from the backplane or broken.
                        Repair Action: Make sure that enclosure SCSI bus terminators are installed and that no jumpers are installed. Replace the failed terminator if the problem continues.
sq q q        2C        Error: Enclosure I/O termination power out of range.
                        Detail: Faulty or missing I/O module causes enclosure I/O termination power to be out of range.
                        Repair Action: Make sure that all of the enclosure device SCSI buses have an I/O module. If problem persists, replace the failed I/O module.
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 3 of 5)
Pattern       OCP Code  Error and Repair Action
sq q q q      2D        Error: Master enclosure SCSI buses are not all set to ID 0.
                        Repair Action: Set the PVA ID to 0 for the enclosure with the controllers. If the problem persists, try the following repair actions:
                          1. Replace the PVA module.
                          2. Replace the EMU.
                          3. Remove all devices.
                          4. Replace the enclosure.
sq q q q      2E        Error: Multiple enclosures have the same SCSI ID.
                        Detail: More than one enclosure has the same SCSI ID.
                        Repair Action: Reconfigure the PVA ID to uniquely identify each enclosure in the subsystem. The enclosure with the controllers must be set to PVA ID 0; additional enclosures must use PVA IDs 2 and 3. If the error continues after PVA settings are unique, replace each PVA module one at a time. Check the enclosure if the problem remains.
sq q q q q    2F        Error: Memory module has illegal DIMM configuration.
                        Repair Action: Verify that DIMMs are installed correctly.
sq q          30        Error: An unexpected bugcheck occurred before subsystem initialization completed.
                        Detail: An unexpected Last Failure occurred during initialization.
                        Repair Action: Reinsert controller. If that does not correct the problem, reset the controller. If the error persists, try resetting the controller again, and replace the controller if no change occurs.
sq q q        31        Error: ILF$INIT unable to allocate memory.
                        Detail: Attempt to allocate memory by ILF$INIT failed.
                        Repair Action: Replace controller.
sq q q        32        Error: Code load program card write failure.
                        Detail: Attempt to update program card failed.
                        Repair Action: Replace program card.
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 4 of 5)
Pattern       OCP Code  Error and Repair Action
sq q q q      33        Error: Nonvolatile program memory (NVPM) structure revision too low.
                        Detail: NVPM structure revision number is lower than can be handled by the software version attempting to be executed.
                        Repair Action: Verify that the program card contains the latest software version. If the error persists, replace controller.
sq q q q      35        Error: An unexpected bugcheck occurred during Last Failure processing.
                        Detail: Last Failure processing interrupted by another Last Failure event.
                        Repair Action: Reset controller.
sq q q q      36        Error: Hardware-induced controller reset expected and failed.
                        Repair Action: Replace controller.
sq q q q q    37        Error: Software-induced controller reset expected and failed.
                        Repair Action: Replace controller.
sq q q        38        Error: Controller operation halted.
                        Detail: Last Failure event required termination of controller operation, for example: SHUTDOWN via the command line interpreter (CLI).
                        Repair Action: Reset controller.
sq q q q      39        Error: NVPM configuration inconsistent.
                        Detail: Device configuration within the NVPM is inconsistent.
                        Repair Action: Replace controller.
sq q q q      3A        Error: An unexpected NMI occurred during Last Failure processing.
                        Detail: Last Failure processing interrupted by a Non-Maskable Interrupt (NMI).
                        Repair Action: Replace controller.
sq q q q q    3B        Error: NVPM read loop hang.
                        Detail: Attempt to read data from NVPM failed.
                        Repair Action: Replace controller.
sq q q q      3C        Error: NVPM write loop hang.
                        Detail: Attempt to write data to NVPM failed.
                        Repair Action: Replace controller.
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 5 of 5)
Pattern       OCP Code  Error and Repair Action
sq q q q q    3D        Error: NVPM structure revision higher than image.
                        Detail: NVPM structure revision number is higher than the one that can be handled by the software version attempting to execute.
                        Repair Action: Replace program card with one that contains the latest software version.
sq q q q q q  3F        Error: DAEMON diagnostic failed hard in non-fault tolerant mode.
                        Detail: DAEMON diagnostic detected critical hardware component failure; controller can no longer operate.
                        Repair Action: Verify that cache module is present. If the error persists, replace controller.
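The PVA ID rules stated for OCP codes 2D and 2E in Table 13 (the enclosure with the controllers uses PVA ID 0, additional enclosures use PVA IDs 2 and 3, and every ID is unique) can be expressed as a simple pre-check. The sketch below is a planning aid with hypothetical names; it does not read any settings from the hardware.

```python
def check_pva_ids(master_id, expansion_ids):
    """Return a list of problems with planned PVA ID settings.

    master_id: PVA ID of the enclosure that holds the controllers.
    expansion_ids: PVA IDs of any additional enclosures.
    An empty list means no rule from Table 13 (codes 2D, 2E) is violated.
    """
    problems = []
    if master_id != 0:
        problems.append("enclosure with the controllers must use PVA ID 0")
    for pva_id in expansion_ids:
        if pva_id not in (2, 3):
            problems.append(
                "additional enclosure uses PVA ID %d; expected 2 or 3" % pva_id
            )
    all_ids = [master_id] + list(expansion_ids)
    if len(set(all_ids)) != len(all_ids):
        problems.append("PVA IDs are not unique")
    return problems


if __name__ == "__main__":
    print(check_pva_ids(0, [2, 2]))
```

If the settings pass this check and the error continues, the remaining repair actions in Table 13 (replacing PVA modules, checking the enclosure) still apply.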
Last Failure Reporting
Last failures are automatically displayed on the maintenance terminal (unless disabled via
the FMU) using %LFL formatting. The example below shows a Last Failure report:
%LFL--HSG> --13-MAY-2001 04:39:45 (time not set)-- Last Failure Code: 20090010
Power On Time: 0. Years, 14. Days, 19. Hours, 58. Minutes, 42. Seconds
Controller Model: HSG80
Serial Number: AA12345678 Hardware Version: 0000(00)
Software Version: V086P(FF)
Informational Report
Instance Code: 0102030A
Last Failure Code: 20090010 (No Last Failure Parameters)
Additional information is available in Last Failure Entry: 1.
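A captured Last Failure report can be reduced to a few key fields for record keeping. The sketch below relies only on the field labels visible in the example report above. The function name and dictionary keys are hypothetical, and the eight-hexadecimal-digit code length is an assumption taken from the example values.

```python
import re

# Field labels are copied from the example report; the 8-hex-digit length
# of the two codes is an assumption based on the values shown there.
FIELD_PATTERNS = (
    ("last_failure_code", r"Last Failure Code:\s*([0-9A-Fa-f]{8})"),
    ("instance_code", r"Instance Code:\s*([0-9A-Fa-f]{8})"),
    ("controller_model", r"Controller Model:\s*(\S+)"),
    ("software_version", r"Software Version:\s*(\S+)"),
)


def parse_last_failure(report):
    """Extract selected fields from a %LFL report; missing fields are omitted."""
    fields = {}
    for key, pattern in FIELD_PATTERNS:
        match = re.search(pattern, report)
        if match:
            fields[key] = match.group(1)
    return fields


if __name__ == "__main__":
    sample = (
        "%LFL--HSG> --13-MAY-2001 04:39:45 (time not set)-- "
        "Last Failure Code: 20090010\n"
        "Controller Model: HSG80\n"
        "Software Version: V086P(FF)\n"
        "Instance Code: 0102030A\n"
    )
    print(parse_last_failure(sample))
```

The Last Failure code appears twice in a report; this sketch keeps the first occurrence.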
In addition, Last Failures are reported to the host error log using Template 01, following a