Parallel Database Cluster Model PDC/O5000 for Oracle 8.0.5/8.0.6 and 8.1.5/8.1.6 Administrator Guide
To ensure that an array controller is connected to the correct FC-AL path, open
a CLI Window and issue the CLI command show this_controller. If the port 1
topology is in an Offline state, refer to the StorageWorks documentation about
the HSG80 Array Controller ACS and CLI for more information about how to
configure an MA8000/EMA12000 Storage Subsystem.
The topology for Port 1 on each array controller should be:
• Loop Up state (not the Standby state)
Compaq Confidential Need to Know Required
Writer: Vaughn Hasslein Project: Parallel Database Cluster Model PDC/O5000 for Oracle 8.0.5/8.0.6 and 8.1.5/8.1.6 Administrator Guide Comments:
Part Number: 187244-001 File Name: k-ch10 Cluster Troubleshooting .doc Last Saved On: 6/28/00 1:40 PM
Cluster Troubleshooting With MA8000/EMA12000 Storage Subsystem for Oracle8 Release 8.0.5/8.0.6 10-15
• LOOP_HARD
Server Node Connectivity to FC-AL Connection Paths
To verify that your servers are connected to all FC-AL paths, open a CLI
Window and issue the CLI command show connection. The output of this
command shows status information about present and previous connections to
all FC-AL paths including connection name, controller, controller port, adapter
ID address, status, and unit offset. The status entry indicates the port is Online
(connected) or Offline (disconnected).
Rename the connection-name field to reflect the configuration of the array
controllers (for example, SVR1TOP1 for node 1, top controller, port 1). This
will improve your ability to view the FC-AL connections. For further
information about controller properties, see "Setting Up and Configuring a
MA8000/EMA12000 Storage Subsystem" in Chapter 6, "Installation and
Configuration for Oracle8 Release 8.0.5/8.0.6."
A Cluster Node Cannot Connect to the Shared Drives
Verify connections and accessibility between the servers, the storage hubs or
storage switches, and the MA8000/EMA12000 Storage Subsystem controller
enclosures. See "Verifying Cluster Communication" in Chapter 6, "Installation
and Configuration for Oracle8 Release 8.0.5/8.0.6."
Windows NT Disk Administrator Shows Storagesets With the Same Label (Dual Image)
To enhance I/O throughput, both ports (1 and 2) on an HSG80 Array
Controller can be configured to different storagesets in the
MA8000/EMA12000 Storage Subsystem. If a unit_offset value has not been
assigned to port 2 on the array controller, then both port 1 and port 2 will see
the same storagesets, as will Windows NT Disk Administrator.
To correct this problem, do the following:
• Ensure that the affected array controller is set to multibus_failover
  mode. Refer to "Verifying Array Controller Properties" in Chapter 6,
  "Installation and Configuration for Oracle8 Release 8.0.5/8.0.6."
• Ensure that port 2 on the affected array controller has been assigned a
  unit_offset value. This value ensures that port 2 accesses a different
  range of storage devices than port 1. Refer to "Verifying Array
  Controller Properties" in Chapter 6, "Installation and Configuration for
  Oracle8 Release 8.0.5/8.0.6."
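The dual-image condition described above can be checked mechanically once the unit_offset of each port has been noted from the controller. The following is a minimal sketch only: the function name, the port labels, and the offset values are hypothetical and are not taken from the HSG80 CLI.

```python
from collections import defaultdict

def ports_sharing_offset(unit_offsets):
    """Group controller ports that were given the same unit_offset.

    unit_offsets maps a port label (hypothetical, e.g. "port1") to the
    unit_offset value noted for that port. Ports that share a value would
    see the same storagesets, which is the dual-image symptom.
    """
    groups = defaultdict(list)
    for port, offset in unit_offsets.items():
        groups[offset].append(port)
    # Only offsets used by more than one port indicate a problem.
    return {off: sorted(ports) for off, ports in groups.items() if len(ports) > 1}

# Illustrative values only.
print(ports_sharing_offset({"port1": 0, "port2": 0}))    # {0: ['port1', 'port2']}
print(ports_sharing_offset({"port1": 0, "port2": 100}))  # {}
```

An empty result suggests that each port has its own range of storage devices.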
Device or Devices Were Not Found by KGPSA-BC Device Driver
All of the I/O path devices (storagesets) present on one of the I/O paths should
be viewable from the SCSI driver entry for the associated host bus adapter. If
one or more of these devices cannot be found by the device driver, then you
must identify the cause of the problem and correct it.
To determine whether or not the storagesets can be viewed by a KGPSA-BC
device driver, do the following:
1. From a cluster node, select Settings from the Windows NT Start menu.
Open the Control Panel.
2. Select SCSI Adapters.
3. View the list of installed SCSI controllers. If the KGPSA-BC device
driver was installed correctly, you should see two entries, one for each
host bus adapter in the server: Emulex LP6000/LP7000/LP8000, PCI
Fibre Channel Adapter.
4. Double-click on the entry for each host bus adapter. The I/O path
devices (storagesets) associated with the host bus adapter are displayed.
5. Confirm that the displayed number of storagesets configured to each
host bus adapter is correct.
If one or more of the storage subsystem's storagesets is not displayed, then
check the integrity of the I/O path from the host bus adapter to its storage hub
or storage switch and out to its array controllers. Do the following:
• Confirm that the KGPSA-BC device driver was properly installed. Refer
  to "Installing Secure Path for Windows NT" in Chapter 6, "Installation
  and Configuration for Oracle8 Release 8.0.5/8.0.6."
• Check the status LEDs on the host bus adapter for error indications.
• Verify that the Fibre Channel cable has been properly installed between
  the host bus adapter and its storage hub or storage switch.
• Check the status LEDs on the storage hub or storage switch ports for
  error indications.
• Verify that the Fibre Channel cables have been properly installed from
  the storage hub or storage switch to both ports on the array controller.
• Use the CLI to obtain status information about the array controller.
  Verify that both port 1 and port 2 are active (for example, that
  connections are online and that each port was negotiated with its
  designated FC-AL or Fibre Channel Fabric device ID).
Devices on One I/O Connection Path Cannot Be Seen by the Cluster Nodes
For a Redundant Fibre Channel Fabric
If the cluster node cannot see all of the storagesets that are configured to the
Fibre Channel Fabric paths, do the following:
• Verify that the array controllers on the unseen path are set for
  Multibus_Failover mode. If an array controller has not been set to
  Multibus_Failover mode, then the cluster nodes will only see the
  storagesets that are configured to the other array controller in the storage
  subsystem.
• If the array controllers have already been set to Multibus_Failover
  mode, then check the integrity of the components along the unseen Fibre
  Channel Fabric path. Do the following:
  ◦ Confirm that the KGPSA-BC device driver was properly installed.
    Refer to "Installing Secure Path Under Windows NT" in Chapter 6,
    "Installation and Configuration for Oracle8 Release 8.0.5/8.0.6."
  ◦ Check the status LEDs on the host bus adapter for error indications.
  ◦ Verify that the Fibre Channel cable has been properly installed
    between the host bus adapter and its storage switch.
  ◦ Check the status LEDs on the storage switch ports for error
    indications.
  ◦ Verify that the Fibre Channel cables have been properly installed
    from the storage switch to both ports on the array controller.
  ◦ Use the CLI to obtain status information about the array controller.
    Verify that both port 1 and port 2 on each array controller are active.
For a Redundant Fibre Channel Arbitrated Loop
If the cluster node cannot see all of the storagesets that are configured to the
FC-AL paths, do the following:
• Verify that the array controllers on the unseen path are set for
  Multibus_Failover mode. If an array controller has not been set to
  Multibus_Failover mode, then the cluster nodes will only see the
  storagesets that are configured to the other array controller in the storage
  subsystem.
• If the array controllers have already been set to Multibus_Failover
  mode, then check the integrity of the components along the unseen
  FC-AL path. Do the following:
  ◦ Confirm that the KGPSA-BC device driver was properly installed.
    Refer to "Installing Secure Path Under Windows NT" in Chapter 6,
    "Installation and Configuration for Oracle8 Release 8.0.5/8.0.6."
  ◦ Check the status LEDs on the host bus adapter for error indications.
  ◦ Verify that the Fibre Channel cable has been properly installed
    between the host bus adapter and its storage hub.
  ◦ Check the status LEDs on the storage hub ports for error indications.
  ◦ Verify that the Fibre Channel cables have been properly installed
    from the storage hub to both ports on the array controller.
  ◦ Use the CLI to obtain status information about the array controller.
    Verify that both port 1 and port 2 on each array controller are active.
Troubleshooting Secure Path
Secure Path Manager Shows Reversed Location for Top and Bottom Array Controllers
The Secure Path Manager GUI displays the array controllers in each storage
subsystem according to their serial numbers, not according to their physical
order: the array controller with the lower serial number is displayed as
Controller A and the array controller with the higher serial number is
displayed as Controller B, regardless of their actual physical order in the
controller enclosure. If Secure Path Manager shows a controller failure, be
sure you match the displayed failed controller with the correct physical
controller. You can display a controller's serial number by using the CLI show
this_controller command.
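The serial-number ordering described above can be sketched as a small lookup that maps the Secure Path Manager labels back to physical positions. This is an illustration only: the function name and the serial numbers are hypothetical, and the sketch assumes that serial numbers can be compared as plain strings.

```python
def secure_path_labels(serials_by_position):
    """Map Secure Path Manager labels to physical controller positions.

    serials_by_position maps a physical position ("top", "bottom") to the
    serial number noted from the CLI on that controller.
    Assumption: serial numbers are compared as plain strings.
    """
    # Lower serial number is shown as Controller A, higher as Controller B.
    ordered = sorted(serials_by_position.items(), key=lambda item: item[1])
    labels = ["Controller A", "Controller B"]
    return {label: position for label, (position, _serial) in zip(labels, ordered)}

# Hypothetical serial numbers, for illustration only.
print(secure_path_labels({"top": "ZG90001234", "bottom": "ZG80005678"}))
```

In this example the bottom controller has the lower serial number, so it would be the one displayed as Controller A.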
Chapter 11
Cluster Troubleshooting With the MA8000/EMA12000 Storage Subsystem for Oracle8i Release 8.1.5/8.1.6
Basic Troubleshooting Tips
Power
Verify that the cluster nodes, storage subsystems, and storage hubs or storage
switches for the storage area network (SAN) and the switches or hubs for the
cluster interconnect are all powered on.
The correct power-on sequence is:
• Storage subsystems
• Storage hubs or storage switches (Power is applied to the storage hubs
  when the AC cord is plugged in)
• Ethernet switches/hubs or Compaq ServerNet Switches
• ProLiant servers
NOTE: The storage hubs and some network hub or switch devices do not have on/off
buttons; they receive power when the power cord is plugged in. The storage switches
have on/off buttons.
Physical Connections
Verify that all cable connections between the following components are
properly made. Where redundant connections exist, be sure that the redundant
paths are cabled correctly between:
• Cluster nodes and the cluster interconnect switches or hubs
• Cluster nodes and the client LAN switches or hubs
• Cluster nodes and the storage switches or storage hubs
• Storage hubs or storage switches and the array controllers installed in
  each storage subsystem
Access to Cluster Components
Once the physical connections have been checked, ensure that communication
is occurring between all the cluster components.
• To test the redundant Ethernet cluster interconnect and client LAN
  connections, use the ping utility. From each cluster node, ping the
  Ethernet cluster interconnect adapter or port located in the other nodes.
  Similarly, from each cluster node, ping the name, then the address, of
  the client LAN Ethernet adapter or port located in the other nodes. A
  correct ping response from each node confirms that the Ethernet
  connections are correct. For information about using the ping utility, see
  "Verifying the Hardware and Software Installation" in Chapter 7,
  "Installation and Configuration for Oracle8i Release 8.1.5/8.1.6."
• To test the redundant ServerNet cluster interconnect, use the viping
  utility. From each cluster node, ping the ServerNet PCI Adapter located
  in every other node. For information about using the viping utility to test
  ServerNet cluster interconnect connections, see "Verifying the
  ServerNet Cluster Interconnect" in Chapter 7, "Installation and
  Configuration for Oracle8i Release 8.1.5/8.1.6."
• To confirm that the shared storage is accessible to each cluster node, run
  Secure Path Manager or Windows NT Server Disk Administrator from
  each node. Confirm that the same shared disk volumes can be seen from
  each node. If the shared storage resources are not consistent across all
  nodes, refer to Appendix B, "Diagnosing and Resolving Shared Disk
  Problems with the MA8000/EMA12000 Storage Subsystem Under
  Windows NT," for more details.
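The ping checks in the list above can be scripted when there are many nodes. The following is a sketch, not a supported tool: the function names are hypothetical, and it assumes that an exit code of 0 from the Windows ping command means a reply was received (the output should still be inspected, because some error replies can also end with exit code 0).

```python
import subprocess

def ping_command(target, count=1):
    # The Windows ping utility uses -n for the number of echo requests.
    return ["ping", "-n", str(count), target]

def check_targets(targets, run=None):
    """Return {target: True/False} for each name or address pinged.

    `run` can be replaced so the logic can be exercised without a network.
    By default it calls the real ping and treats exit code 0 as a reply
    (an assumption, see the note above).
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, capture_output=True).returncode == 0
    return {target: run(ping_command(target)) for target in targets}
```

For the client LAN check, pass the node name first and then its address (for example, a hypothetical `["node2", "192.168.1.2"]`), so that a failure on the name alone points to name resolution rather than cabling.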
Software Revisions
The following software components must have the same revision level on each
of the cluster nodes:
• Oracle8i Server Release 8.1.5/8.1.6 with Oracle8i Parallel Server Option
  Release 8.1.5/8.1.6
• MA8000/EMA12000 Storage Subsystem drivers
• Compaq Insight Manager
• Compaq Array Configuration Utility
• Secure Path
• OSDs
• Cluster interconnect drivers
• Microsoft Windows NT Server 4.0, including Service Pack 5 or later
For information about currently supported software revisions for the
PDC/O5000, refer to the Compaq Parallel Database Cluster Model
PDC/O5000 Certification Matrix. This document is available at
www.compaq.com/highavailability
Depending on your specific configuration, other software (for example,
customer applications such as packet utilities) might exist that should be of the
same revision level. If the same software is used on all the cluster nodes and
either assists with the cluster's operation or takes advantage of clustering, the
revision level should be the same.
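If revision levels are recorded per node, the comparison described above can be automated. The sketch below is one possible approach; the function name, the inventory layout, and the revision strings are hypothetical, and the revisions themselves would have to be collected by hand or by another tool.

```python
def revision_mismatches(inventory):
    """Return components whose revision differs between cluster nodes.

    inventory maps node name -> {component name: revision string}.
    A component that is missing from a node is reported as None.
    """
    components = set()
    for revisions in inventory.values():
        components.update(revisions)
    mismatches = {}
    for component in sorted(components):
        seen = {node: revs.get(component) for node, revs in inventory.items()}
        if len(set(seen.values())) > 1:
            mismatches[component] = seen
    return mismatches

# Illustrative revision strings only.
inventory = {
    "node1": {"Secure Path": "3.0", "OSDs": "1.1"},
    "node2": {"Secure Path": "3.0", "OSDs": "1.0"},
}
print(revision_mismatches(inventory))
```

The same comparison applies to any other software that must match across the cluster nodes.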
Firmware Revisions
The firmware level must be identical for each of the following cluster
components:
• The system ROM on each server
• Firmware for the host bus adapters
• Firmware for the storage subsystems
For information about currently supported firmware revisions for the
PDC/O5000, refer to the Compaq Parallel Database Cluster Model
PDC/O5000 Certification Matrix. This document is available at
www.compaq.com/highavailability
Depending on your specific configuration, other firmware might exist that
must be of the same revision level. A determining factor is whether the
hardware device is part of a path that is accessible by more than one cluster
node, more than one storage subsystem, or more than one client computer. If
so, the firmware level of those hardware devices must be identical.
Troubleshooting Oracle8i and OSD Installation Problems and Error Messages
Potential Difficulties Installing the OSDs With the Oracle Universal Installer
• The Oracle Universal Installer (OUI) uses an inventory file to track
  which components of the OSDs have been installed on each of the
  cluster nodes. A single inventory file is maintained on the node from
  which the OUI was initially run. All subsequent uses of the OUI must be
  made from that same installing node.
• Installing onto a cluster node that already contains some or all of the
  OSD components will likely result in an incomplete installation. If
  difficulties persist, contact your support representative for assistance.
• The OUI uses the network connections to transfer files from the primary
  node to the other cluster nodes. Use the ping utility to ensure that proper
  node-to-node connectivity exists.
• The SNMP service should be installed but not running when you use the
  OUI to install the OSDs. If the service is running, you must stop it until
  the installation is finished.
• The OUI requires administrator rights on all cluster nodes. To verify
  connectivity and access rights for each node, issue the following from
  the command line on the primary node:
C: net use \\machine_name\C$ * /User:
in which the "*" results in the system prompting you for a password,
and the password is not echoed.
If successful, the net use command returns:
"The command completed successfully."
• Determine whether you need to deinstall a partial OSD installation. If
  the OUI does not run to completion, it can leave a partial installation
  of the OSDs on the cluster. If this occurs, you must remove the partial
  installation before you can successfully run the OUI again. An
  incomplete OSD installation can result for various reasons, for example:
  ◦ A cable is no longer seated tightly into its connector
  ◦ The node from which you are running the OUI cannot communicate
    with all other nodes in the cluster
  ◦ The user running the OUI does not have administrator privileges on
    all cluster nodes
  Refer to "Deinstalling a Partial OSD Installation" in Chapter 9, "Cluster
  Management for Oracle8i Release 8.1.5/8.1.6."
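The access-rights check with net use, described earlier in this list, can be repeated for every node. The sketch below only builds the command and tests the returned text; the function names and the account name used in the example are hypothetical, and the user name still has to be supplied by the administrator.

```python
SUCCESS_TEXT = "The command completed successfully."

def net_use_command(machine_name, user):
    # "*" makes net use prompt for the password instead of echoing it.
    return ["net", "use", rf"\\{machine_name}\C$", "*", f"/User:{user}"]

def access_confirmed(command_output):
    # This guide gives this message as the indication of success.
    return SUCCESS_TEXT in command_output

# Hypothetical node and account names, for illustration only.
print(" ".join(net_use_command("node2", r"DOMAIN\administrator")))
```

Because the password is requested interactively, the command is best run from a console window on the primary node, once for each of the other cluster nodes.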
Unable to Start OracleCMService
• Verify that the cluster interconnect is properly configured.
• Make sure the OSDs have been properly installed on the nodes by
  running the OUI on the cluster's primary node. Click Installed
  Products and verify that an entry for the OSDs exists.
• Verify that the OracleCMService has properly started by checking its
  error log. The default location for the error log is
  C:\Compaq\ops\cmsrvr.log. If the log is not found in the default
  location, look in the directory in which the OSD modules were installed.
• Verify that the Node Manager Service, OracleNMService, is running. If
  it is not, see the error logs in C:\Compaq\ops\nmsrvr.log or
  C:\Compaq\ops\nm.log. If OracleNMService is running, check the CM
  Server logs in C:\Compaq\ops\cmsrvr.log. Attempt to start the
  OracleCMService again.
• Check the Windows NT Server event log for errors. Be sure to check
  both the system and application logs.
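The log locations named in the list above can be gathered into one lookup so that the default path is tried first and the OSD installation directory second. This is a sketch under stated assumptions: the function name is hypothetical, and the OSD directory in the example is invented for illustration.

```python
from pathlib import PureWindowsPath

# Default locations named in this guide.
DEFAULT_LOGS = {
    "OracleCMService": [r"C:\Compaq\ops\cmsrvr.log"],
    "OracleNMService": [r"C:\Compaq\ops\nmsrvr.log", r"C:\Compaq\ops\nm.log"],
}

def candidate_logs(service, osd_dir=None):
    """List the log paths to check for a service, default locations first.

    osd_dir is the directory in which the OSD modules were installed; it
    is the fallback location when the log is not in the default location.
    """
    defaults = [PureWindowsPath(p) for p in DEFAULT_LOGS[service]]
    paths = list(defaults)
    if osd_dir is not None:
        paths += [PureWindowsPath(osd_dir) / p.name for p in defaults]
    return [str(p) for p in paths]

# Hypothetical OSD directory, for illustration only.
for path in candidate_logs("OracleNMService", r"D:\osd"):
    print(path)
```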
Unable to Start OracleNMService
• Verify that the OracleNMService has properly started by checking the
  error log. The default location for the error log is
  C:\Compaq\ops\nmsrvr.log or C:\Compaq\ops\nm.log. If not found in
  the default location, look in the same directory in which the OSD
  modules were installed.
• Verify that Oracle8i software is correctly installed on all the nodes.
  Refer to "Installing Oracle Software" in Chapter 7, "Installation and
  Configuration for Oracle8i Release 8.1.5/8.1.6."
• Verify the connections and accessibility of the cluster components.
  Refer to Chapter 7, "Installation and Configuration for Oracle8i Release
  8.1.5/8.1.6."
• Check the Windows NT Server event log for errors. Be sure to check
  both the system and application logs.
Unable to Start the Database
• Verify that Object Link Manager is properly installed and running on
  each node. Also, verify that Object Link Manager accurately sees all of
  the Oracle partitions and their symbolic link associations. Refer to
  Appendix B, "Diagnosing and Resolving Shared Disk Problems with the
  MA8000/EMA12000 Storage Subsystem Under Windows NT" for more
  details.
• Verify that all files created by Net8 Config Assistant were transferred
  properly to each respective node.
• Verify the connections and accessibility of the cluster components.
  Refer to Chapter 7, "Installation and Configuration for Oracle8i Release
  8.1.5/8.1.6."
• Verify that the OracleService is running. Run the Services applet located
  in the Windows NT Control Panel. Scan the list of services and check
  whether the OracleService is started.
• Verify that the database links are properly set up on all nodes and that
  correct entries exist for all datafiles, including OPS_CMDISK. Refer to
  the Oracle8 Parallel Server Setup and Configuration Guide for further
  details.
Initialization of the Dynamic Link Library NM.DLL Failed
If this failure occurs, stop the currently running OracleCMService and
OracleNMService. Restart the OracleNMService, then the OracleCMService.
Do this by running the Services applet located in the Windows NT Control
Panel. Locate each service in the displayed list, highlight it, then stop it.
Finally, highlight the service and restart it.
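As an alternative to the Control Panel procedure above, the same stop and start order can be expressed with the Windows net stop and net start commands. This is a sketch and not the documented method: the function name is hypothetical, and it assumes that the services are stopped in the order in which the text lists them.

```python
def restart_commands():
    """Return net stop / net start commands in the order described above.

    Assumption: the services are stopped in the order listed in the text
    (OracleCMService, then OracleNMService). The start order, NM before
    CM, is the order this guide specifies.
    """
    stop_order = ["OracleCMService", "OracleNMService"]
    start_order = ["OracleNMService", "OracleCMService"]
    return ([["net", "stop", name] for name in stop_order]
            + [["net", "start", name] for name in start_order])

for command in restart_commands():
    print(" ".join(command))
```

Each command would be run from a command prompt by a user with administrator rights, waiting for one to finish before issuing the next.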
Troubleshooting Node-to-Node Connectivity Problems
Nodes Are Unable to Communicate With Each Other
• Check the Windows NT Server event logs for errors.
• Check the OracleCMService error log for information. The default
  location for the error log is C:\Compaq\ops\cmsrvr.log. If the log is not
  found in the default location, look in the directory in which the OSDs
  were installed.
• Verify the connections and accessibility of the cluster components. See
  "Verifying Cluster Communication" in Chapter 7, "Installation and
  Configuration for Oracle8i Release 8.1.5/8.1.6."
• The cluster interconnect must be on a different subnet from the client
  LAN. Check this by viewing the IP addresses assigned to all
  interconnect adapters and to all client LAN adapters.
• If Compaq Insight Manager is loaded and configured, use it to check the
  operation statistics and possible error messages for the interconnect
  adapter installed in each cluster node.
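The subnet requirement in the list above can be checked from the addresses and subnet masks assigned to the adapters. The sketch below uses the Python standard ipaddress module; the function name and the addresses are hypothetical. It compares networks for equality only and does not detect subnets that overlap under different masks.

```python
import ipaddress

def shared_subnets(interconnect, client_lan):
    """Return subnets that appear on both the interconnect and client LAN.

    Each argument is a list of "address/netmask" strings, one per adapter.
    An empty result means the two networks use different subnets.
    """
    def subnets(addresses):
        return {ipaddress.ip_interface(a).network for a in addresses}
    return sorted(subnets(interconnect) & subnets(client_lan))

# Illustrative addresses only.
print(shared_subnets(["10.0.0.1/255.255.255.0", "10.0.0.2/255.255.255.0"],
                     ["192.168.1.1/255.255.255.0"]))  # []
```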
When Using the Redundant Ethernet Cluster Interconnect
• Make sure that the redundant Ethernet cluster interconnect components
  (adapters, cables, switches or hubs) are properly installed. See
  "Installing the Hardware" in Chapter 7, "Installation and Configuration
  for Oracle8i Release 8.1.5/8.1.6."
• Make sure that the Ethernet cluster interconnect and the client LAN are
  connected to different Ethernet switches or hubs.
• Use the ping utility to test the redundant Ethernet cluster interconnect
  connections. For information about the ping utility, see "Verifying the
  Hardware and Software Installation" in Chapter 7, "Installation and
  Configuration for Oracle8i Release 8.1.5/8.1.6."
• Make sure that all Ethernet traffic not related to the Oracle data is using
  the client LAN, not the cluster interconnect.
When Using the Redundant ServerNet Cluster Interconnect
• Make sure that the ServerNet cluster interconnect components (adapters,
  cables, and switches) are properly installed as specified under "Installing
  the Hardware" in Chapter 7, "Installation and Configuration for Oracle8i
  Release 8.1.5/8.1.6." Verify that the cables are installed to the correct
  ServerNet path (X or Y).
• Use the viping utility to verify that all nodes can communicate with each
  other over the redundant ServerNet cluster interconnect. For information
  about using the viping utility to test ServerNet cluster interconnect
  connections, refer to "Verifying the ServerNet Cluster Interconnect" in
  Chapter 7, "Installation and Configuration for Oracle8i Release
  8.1.5/8.1.6."
• Verify that the ServerNet PCI Adapter is installed in the "optimum" PCI
  slot in each node. Depending on the server model, the location of the
  ServerNet PCI Adapter can affect its performance. For information on
  locating the ServerNet PCI Adapter for optimal performance, see the
  Compaq Parallel Database Cluster Model PDC/O5000 Certification
  Matrix. This document is available at
  www.compaq.com/highavailability
viping Does Not Complete Successfully
Sometimes, when using the viping utility to ping the node name (machine host
name), the utility is not successful in communicating with the node specified.
Before concluding that cluster nodes cannot communicate through a Compaq
ServerNet cluster interconnect, try pinging the ServerNet ID of the node in
question.
Get the ServerNet ID of a node by entering a viping command from that node
using the local node name as the operand for viping. For example, if the
machine host name of a node is node1, enter viping node1 to get the
ServerNet ID of node1.
If on node1 the command viping node2 does not return successfully:
1. Get the ServerNet ID of node2 by entering viping node2 at node2. The
viping utility responds in the following way. (0xF0080 is an example of
a ServerNet ID.)
Pinging node2 [0xF0080] with 12 bytes of data:
Reply from 0xF0080: bytes = 12 time = 12ms
2. Test node1 to node2 ServerNet connectivity using the ServerNet ID by
entering viping 0xF0080 at node1. The viping utility responds in the
following way.
Pinging 0xF0080 with 12 bytes of data:
Reply from 0xF0080: bytes = 12 time = 12ms
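The two steps above rely on reading the ServerNet ID out of the viping response. The following sketch extracts it from text in the format shown in the examples; the function names are hypothetical, and the sketch assumes that the real output matches those example lines.

```python
import re

# Matches the ID in lines such as "Pinging node2 [0xF0080] with 12 bytes of data:"
_ID_PATTERN = re.compile(r"^Pinging\s+\S+\s+\[(0x[0-9A-Fa-f]+)\]", re.MULTILINE)

def servernet_id(viping_output):
    """Return the ServerNet ID found in viping output, or None."""
    match = _ID_PATTERN.search(viping_output)
    return match.group(1) if match else None

def got_reply(viping_output, id_value):
    # A "Reply from <id>:" line indicates that the target answered.
    return f"Reply from {id_value}:" in viping_output

sample = ("Pinging node2 [0xF0080] with 12 bytes of data:\n"
          "Reply from 0xF0080: bytes = 12 time = 12ms\n")
print(servernet_id(sample))  # 0xF0080
```

The ID returned from the step 1 output is the value to pass to viping in step 2.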
If the viping utility does not complete successfully using either the ServerNet
ID or node name, verify that the ServerNet cluster interconnect hardware
(ServerNet PCI Adapters, ServerNet cables, ServerNet Switches) is properly
installed and connected. If the connections are good, you might need to
deinstall and reinstall the ServerNet OSDs. See "Deinstalling the OSDs" in
Chapter 9, "Cluster Management With the MA8000/EMA12000 Storage
Subsystem for Oracle8i Release 8.1.5/8.1.6."
For information about viping options and error diagnostics, see Appendix A,
"viping Utility."
Unable to Ping the Cluster Interconnect or the Client LAN
If you are using the recommended method of using static IP addresses, ensure
that the hosts and lmhosts files are properly set up.
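One way to confirm that the hosts file is complete is to parse it and look for each node name. The sketch below handles the standard hosts file layout (address, then one or more names, with # starting a comment); the function names, node names, and addresses are hypothetical. The lmhosts file uses # for directives as well, so it would need separate handling.

```python
def parse_hosts(text):
    """Map each host name or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table

def missing_entries(text, required_names):
    """Return the required names that have no entry in the hosts text."""
    table = parse_hosts(text)
    return [name for name in required_names if name.lower() not in table]

# Illustrative content only.
sample = "10.0.0.1 node1  # interconnect\n10.0.0.2 node2\n"
print(missing_entries(sample, ["node1", "node2", "node3"]))  # ['node3']
```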
When Using the Redundant Ethernet Cluster Interconnect
• Make sure that the Ethernet cables are properly connected and that the
  Ethernet switches or hubs are powered on. Verify that all Ethernet
  adapters used as the primary cluster interconnect path are connected to
  the same switch and that all adapters used as the redundant cluster
  interconnect path are connected to the redundant switch.
• Make sure that the Ethernet drivers are properly installed and configured
  on the cluster nodes. Verify that the Ethernet drivers for the interconnect
  adapters are the same revision level on all cluster nodes.
• If you are using two dual-port Ethernet adapters in each node to connect
  to both the client LAN and the Ethernet cluster interconnect, make sure
  that the cables are installed between the correct adapter ports and
  switches.
When Using the Redundant ServerNet Cluster Interconnect
• In two-node clusters, ServerNet cables can be installed directly between
  the two ServerNet PCI Adapters. Make sure that the ServerNet cables
  are properly connected between the X ports and Y ports on both
  ServerNet PCI Adapters.
• In clusters with three or more nodes, two ServerNet Switches must be
  used to create the X-path and Y-path for ServerNet system area network
  communication. Make sure that the ServerNet cables are properly
  connected and the ServerNet Switches are powered on. The ServerNet
  cables connected to the X-port on the ServerNet PCI Adapters must all
  be connected to the same X-path ServerNet Switch. Likewise, the
  ServerNet cables connected to the Y-port on the ServerNet PCI
  Adapters must all be connected to the same Y-path ServerNet Switch.
• Make sure that the ServerNet drivers are properly installed and
  configured on the cluster nodes. Verify that the ServerNet drivers for
  the ServerNet cluster interconnect are the same revision level on all
  cluster nodes.
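For clusters with three or more nodes, the X-path and Y-path cabling rule in the list above can be checked against a record of which switch each adapter port is cabled to. This is a sketch only: the function name, node names, and switch names are hypothetical, and the cabling record has to be compiled by inspecting the hardware.

```python
def miscabled_ports(cabling):
    """Find X or Y ports that are not all cabled to a single switch.

    cabling maps (node, port) -> switch name, where port is "X" or "Y".
    Returns {port: set of switches} for any port type cabled to more than
    one switch; an empty dict means each path uses exactly one switch.
    """
    switches = {}
    for (_node, port), switch in cabling.items():
        switches.setdefault(port, set()).add(switch)
    return {port: used for port, used in switches.items() if len(used) > 1}

# Illustrative record: node3 has its X port cabled to the Y-path switch.
cabling = {
    ("node1", "X"): "switchX", ("node1", "Y"): "switchY",
    ("node2", "X"): "switchX", ("node2", "Y"): "switchY",
    ("node3", "X"): "switchY", ("node3", "Y"): "switchY",
}
print(miscabled_ports(cabling))
```

The sketch does not verify that the X-path and Y-path use two different switches; that still needs to be confirmed separately.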
Node or Nodes Unable to Rejoin the Cluster
• If you have a redundant Ethernet cluster interconnect, use ping to ensure
  that the rejoined node or nodes can communicate with other nodes in the
  cluster. For information about using the ping utility, see "Verifying
  Cluster Communication" in Chapter 7, "Installation and Configuration
  for Oracle8i Release 8.1.5/8.1.6."
• If you have a redundant ServerNet cluster interconnect, use the viping
  utility to ensure that the rejoined node or nodes can communicate with
  the other nodes in the cluster. For information about using the viping
  utility to test ServerNet cluster interconnect connections, see "Verifying
  the ServerNet Cluster Interconnect" in Chapter 7, "Installation and
  Configuration for Oracle8i Release 8.1.5/8.1.6."
• Make sure the OSDs have been properly installed on the affected node
  or nodes by running the OUI on the cluster's primary node. Click
  Installed Products and verify that an entry for the OSDs exists.
• Make sure the OracleCMService and all other Oracle services have
  started without error. Run the Services applet in the Windows NT
  Control Panel and check whether the services have started. Use the
  Windows NT Event Viewer to check for error messages about the Oracle
  services.
• Run the Object Link Manager GUI from each node to verify that the
Oracle partitions and symbolic link associations are accurately
displayed.
• Verify that the node or nodes have access to the shared storage. Run
Secure Path Manager or Disk Administrator and make sure that all the
shared disk resources appear as expected. If the view is not consistent
across all cluster nodes, refer to Appendix B, "Diagnosing and
Resolving Shared Disk Problems With the MA8000/EMA12000 Storage
Subsystem Under Windows NT," for more details.
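The ping check in the list above can be repeated for every node in one pass. The following is a minimal sketch, not part of the product: it assumes Python is available on an administrative workstation that can reach the cluster interconnect, and the helper names and the node names in the usage note are hypothetical. It only wraps the operating system's own ping command.

```python
import platform
import subprocess


def ping_once(host, timeout_s=2):
    """Return True if a single echo request to `host` gets a reply."""
    # Windows ping takes -n for the request count; most other pings use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def unreachable_nodes(nodes, ping=ping_once):
    """Return the entries of `nodes` that did not answer a ping."""
    return [node for node in nodes if not ping(node)]
```

For example, unreachable_nodes(["node1", "node2"]) returns the names that failed to reply. An empty list suggests that basic IP connectivity is in place; it does not replace the viping test for a ServerNet interconnect.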
Troubleshooting Client-to-Cluster
Connectivity Problems
A Network Client Cannot Communicate With the
Cluster
• Verify that the user ID has sufficient rights to perform the desired
action.
• Make sure proper network connectivity exists by verifying that the
connections from the network client to the cluster are cabled correctly.
Additionally, verify that the network routers/switches/hubs used in the
client LAN are powered on and properly cabled.
• Make sure that pinging an IP address over the client LAN from a client
to the cluster issues a correct reply. Perform ping testing as described
under "Verifying Cluster Communication" in Chapter 7, "Installation
and Configuration for Oracle8i Release 8.1.5/8.1.6."
• Make sure that the TNSNames.ora file is properly configured on client
and cluster nodes as defined by the Oracle Net8 Config Assistant utility.
• Make sure that the Oracle TNSListener is running on the cluster nodes.
Run the Services applet in the Windows NT Control Panel to verify that
it is started.
• Use the Oracle TNSPing utility to make sure that the Oracle Listener
files are set up correctly on the cluster nodes.
• Run the Object Link Manager GUI from each node to verify correct
Oracle partitions and symbolic link associations. Refer to Appendix B,
"Diagnosing and Resolving Shared Disk Problems with the
MA8000/EMA12000 Storage Subsystem Under Windows NT," for
more details.
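As a quick supplement to the listener checks above, a client can test whether a TCP connection to the listener port opens at all. This is a rough sketch under stated assumptions: the function name is hypothetical, and port 1521 is only the usual Oracle listener default, so substitute the port from your own listener configuration. A successful connection shows that something is listening on the port; TNSPing remains the tool for confirming that it is the Oracle listener.

```python
import socket


def listener_reachable(host, port=1521, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Calling listener_reachable("node1") from a client, where node1 is a placeholder host name, returns False if the port is closed, filtered, or the name cannot be resolved.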
Troubleshooting Shared Storage
Subsystem Problems
Verifying Host Bus Adapter Device Driver
Installation
This section addresses potential problems arising from the use of the
MA8000/EMA12000 Storage Subsystem as the shared storage device. This
section does not address storage subsystem problems that are specific to the
MA8000/EMA12000 Storage Subsystem itself or to the use of the storage
subsystem in a standalone server configuration. For detailed coverage of those
issues, refer to the Compaq StorageWorks documentation provided with the
MA8000/EMA12000 Storage Subsystem.
Verifying KGPSA-BC Device Driver Initialization
A problem with the Windows NT HAL (Hardware Abstraction Layer) may
prevent the KGPSA-BC Host Bus Adapter device driver (LP6NDS35.SYS)
from initializing during system boot. The consequences of this problem are:
• None of the shared storage subsystem devices connected to the affected
KGPSA-BC Host Bus Adapter will be available.
• An entry is written to the Windows NT system event log, and the generic
message "At least one service or driver failed during system startup" is
displayed.
To fix this problem, first do the following:
1. Locate the Windows NT event log entry marked "LP6NDS35" to find
the host bus adapter device driver errors.
2. Restart the server.
3. If the driver error message persists, reinstall the KGPSA-BC Host Bus
Adapter device driver.
If the problem is caused by the Windows NT HAL reconfiguring the
KGPSA-BC Host Bus Adapter with conflicting resources, perform the
following workaround. This workaround causes the HAL to use the BIOS
assigned defaults and not to reassign PCI resources.
1. Use Notepad or Edit to edit the BOOT.INI file as follows:
a. Remove the read-only file attribute (for example, attrib -r -h -s
c:\boot.ini).
b. Add the /PCILOCK option to the system boot entry.
c. Reapply the read-only file attribute that was undone in Step 1a.
Example:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(2)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows NT Server, Enterprise Edition Version 4.00" /pcilock
2. Restart the affected server.
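The BOOT.INI edit in Step 1b can also be expressed as a small text transformation, which is convenient when the same change must be made on several nodes. The sketch below is illustrative only: the function name is hypothetical, it operates on the file contents as a string, and it assumes the system boot entry is the first entry under [operating systems] whose path matches the default= line. The attrib steps in 1a and 1c are still required around any write to the file.

```python
def add_pcilock(boot_ini_text):
    """Return BOOT.INI text with /PCILOCK on the default boot entry."""
    lines = boot_ini_text.splitlines()

    # Find the ARC path that default= points at in [boot loader].
    default_path = None
    for line in lines:
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "default":
            default_path = value.strip().lower()
            break
    if default_path is None:
        raise ValueError("no default= line found")

    in_os_section = False
    patched = False
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("["):
            in_os_section = stripped.lower() == "[operating systems]"
        elif in_os_section and not patched and "=" in stripped:
            path = stripped.split("=", 1)[0].strip().lower()
            if path == default_path:
                patched = True  # only the first matching entry is touched
                if "/pcilock" not in stripped.lower():
                    line = line.rstrip() + " /PCILOCK"
        result.append(line)
    return "\n".join(result) + "\n"
```

Running the function twice leaves the text unchanged after the first pass, so it is safe to apply to a node whose BOOT.INI already carries the option.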
Verifying Connectivity to a Redundant Fibre
Channel Fabric
Verify that the array controllers and the cluster nodes are properly connected
to all paths of the redundant Fibre Channel Fabric as described in this section.
Fibre Channel Fabric Connectivity Guidelines
To ensure that all connections along the Fibre Channel Fabric paths are
correct, verify the following:
• KGPSA-BC Host Bus Adapters have been installed in the same slot
locations across all cluster nodes.
• For each host bus adapter pair installed in the server, verify that the
upper (or leftmost) host bus adapter is connected by Fibre Channel cable
to one storage switch and that the bottom (or rightmost) host bus adapter
is connected to the second storage switch.
• For the array controller pair installed in each storage subsystem, verify
that the two ports on each array controller are connected to different
storage switches.
• If a storage subsystem has three or fewer disk enclosures, verify that a
dual-bus I/O module with two SCSI cables is installed in each disk
enclosure. Also verify that 14 or fewer disk drive units are installed in
each disk enclosure.
• If a storage subsystem has four to six disk enclosures, verify that a
single-bus I/O module with one SCSI cable is installed in each disk
enclosure. Also verify that 12 or fewer disk drive units are installed in
each disk enclosure.
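The two disk enclosure rules above lend themselves to a simple check against an inventory of each storage subsystem. The following sketch encodes only those two rules; the function name and its argument format are hypothetical, and it treats "three or fewer" as one to three enclosures.

```python
def check_enclosures(enclosure_count, drives_per_enclosure, io_module):
    """Return a list of guideline violations for one storage subsystem.

    io_module is "dual-bus" or "single-bus"; drives_per_enclosure is a
    list with the number of disk drive units in each enclosure.
    """
    if 1 <= enclosure_count <= 3:
        expected_module, max_drives = "dual-bus", 14
    elif 4 <= enclosure_count <= 6:
        expected_module, max_drives = "single-bus", 12
    else:
        return ["%d enclosures is outside the range the guidelines cover"
                % enclosure_count]

    problems = []
    if io_module != expected_module:
        problems.append("expected a %s I/O module, found %s"
                        % (expected_module, io_module))
    for index, drives in enumerate(drives_per_enclosure, start=1):
        if drives > max_drives:
            problems.append("enclosure %d has %d drives (limit %d)"
                            % (index, drives, max_drives))
    return problems
```

An empty list means the subsystem satisfies both rules. For example, a four-enclosure subsystem with 13 drives in one enclosure yields a single message naming that enclosure.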
MA8000/EMA12000 Storage Subsystem
Connectivity to Fibre Channel Fabric
Connection Paths
To ensure that an array controller is connected to the correct Fibre Channel
Fabric path, open a CLI Window and issue the CLI command show
this_controller. If the port 1 topology is in an Offline state, refer to the
StorageWorks documentation about the HSG80 Array Controller ACS and
CLI for more information about how to configure an MA8000/EMA12000
Storage Subsystem.
The topology for Port 1 on each array controller should be:
• Loop Up state (not the Standby state)
• LOOP_HARD
Server Node Connectivity to Fibre Channel
Fabric Connection Paths
To verify that your servers are connected to all Fibre Channel Fabric paths,
open a CLI Window and issue the CLI command show connection. The
output of this command shows status information about present and previous
connections to all Fibre Channel Fabric paths, including connection name,
controller, controller port, adapter ID address, status, and unit offset. The
status entry indicates the port is Online (connected) or Offline (disconnected).
Rename the connection-name field to reflect the configuration of the array
controllers (for example, SVR1TOP1 for node 1, top controller, port 1). This
will improve your ability to view the Fibre Channel Fabric connections. For
further information about controller properties, see "Setting Up and
Configuring a MA8000/EMA12000 Storage Subsystem" in Chapter 7,
"Installation and Configuration for Oracle8i Release 8.1.5/8.1.6."
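The naming convention suggested above (SVR1TOP1 for node 1, top controller, port 1) can be generated consistently for every node, controller, and port combination. This is a small sketch with a hypothetical function name; the abbreviation BOT for the bottom controller is an assumption, since the guide shows only the TOP form.

```python
def connection_name(node, controller, port):
    """Build a connection name such as SVR1TOP1."""
    # "BOT" is an assumed abbreviation; only "TOP" appears in the guide.
    positions = {"top": "TOP", "bottom": "BOT"}
    if controller not in positions:
        raise ValueError("controller must be 'top' or 'bottom'")
    if port not in (1, 2):
        raise ValueError("each array controller has ports 1 and 2")
    return "SVR%d%s%d" % (node, positions[controller], port)
```

For example, connection_name(1, "top", 1) returns "SVR1TOP1", matching the example in the text.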
Verifying Connectivity to a Redundant Fibre
Channel Arbitrated Loop
Verify that the array controllers and the cluster nodes are properly connected
to all paths of the redundant Fibre Channel Arbitrated Loop (FC-AL) as
described in this section.
Fibre Channel Arbitrated Loop Connectivity
Guidelines
To ensure that all connections along the FC-AL paths are correct, verify the
following:
• KGPSA-BC Host Bus Adapters have been installed in the same slot
locations across all cluster nodes.
• For each host bus adapter pair installed in the server, verify that the
upper (or leftmost) host bus adapter is connected by Fibre Channel cable
to one storage hub and that the bottom (or rightmost) host bus adapter is
connected to the second storage hub.
• For the array controller pair installed in each storage subsystem, verify
that the two ports on each array controller are connected to different
storage hubs.
• If a storage subsystem has three or fewer disk enclosures, verify that a
dual-bus I/O module with two SCSI cables is installed in each disk
enclosure. Also verify that 14 or fewer disk drive units are installed in
each disk enclosure.
• If a storage subsystem has four to six disk enclosures, verify that a
single-bus I/O module with one SCSI cable is installed in each disk
enclosure. Also verify that 12 or fewer disk drive units are installed in
each disk enclosure.
MA8000/EMA12000 Storage Subsystem
Connectivity to FC-AL Connection Paths
To ensure that an array controller is connected to the correct FC-AL path, open
a CLI Window and issue the CLI command show this_controller. If the port 1
topology is in an Offline state, refer to the StorageWorks documentation about
the HSG80 Array Controller ACS and CLI for more information about how to
configure an MA8000/EMA12000 Storage Subsystem.
The topology for Port 1 on each array controller should be:
• Loop Up state (not the Standby state)
• LOOP_HARD
Server Node Connectivity to FC-AL Connection
Paths
To verify that your servers are connected to all FC-AL paths, open a CLI
Window and issue the CLI command show connection. The output of this
command shows status information about present and previous connections to
all FC-AL paths including connection name, controller, controller port, adapter
ID address, status, and unit offset. The status entry indicates the port is Online
(connected) or Offline (disconnected).
Rename the connection-name field to reflect the configuration of the array
controllers (for example, SVR1TOP1 for node 1, top controller, port 1). This
will improve your ability to view the FC-AL connections. For further
information about controller properties, see "Setting Up and Configuring a
MA8000/EMA12000 Storage Subsystem" in Chapter 7, "Installation and
Configuration for Oracle8i Release 8.1.5/8.1.6."
A Cluster Node Cannot Connect to the Shared
Drives
Verify connections and accessibility between the servers, the storage hubs or
storage switches, and the MA8000/EMA12000 Storage Subsystem controller
enclosures. See "Verifying Cluster Communication" in Chapter 7, "Installation
and Configuration for Oracle8i Release 8.1.5/8.1.6."
Windows NT Disk Administrator Shows
Storagesets With the Same Label (Dual Image)
To enhance I/O throughput, both ports (1 and 2) on an HSG80 Array
Controller can be configured to different storagesets in the
MA8000/EMA12000 Storage Subsystem. If a unit_offset value has not been
assigned to port 2 on the array controller, then both port 1 and port 2 will see
the same storagesets, as will Windows NT Disk Administrator.
To correct this problem, do the following:
• Ensure that the affected array controller is set to multibus_failover
mode. Refer to "Verifying Array Controller Properties" in Chapter 7,
"Installation and Configuration for Oracle8i Release 8.1.5/8.1.6."
• Ensure that port 2 on the affected array controller has been assigned a
unit_offset value. This value ensures that port 2 accesses a different
range of storage devices than port 1. Refer to "Verifying Array
Controller Properties" in Chapter 7, "Installation and Configuration for
Oracle8i Release 8.1.5/8.1.6."
Device or Devices Were Not Found by KGPSA-BC
Device Driver
All of the I/O path devices (storagesets) present on one of the I/O paths should
be viewable from the SCSI driver entry for the associated host bus adapter. If
one or more of these devices cannot be found by the device driver, then you
must identify the cause of the problem and correct it.
To determine whether or not the storagesets can be viewed by a KGPSA-BC
device driver, do the following:
1. From a cluster node, select Settings from the Windows NT Start menu.
Open the Control Panel.
2. Select SCSI Adapters.
3. View the list of installed SCSI controllers. If the KGPSA-BC device
driver was installed correctly, you should see two entries, one for each
host bus adapter in the server: Emulex LP6000/LP7000/LP8000, PCI
Fibre Channel Adapter.
4. Double-click the entry for each host bus adapter. The I/O path devices
(storagesets) associated with the host bus adapter are displayed.
5. Confirm that the displayed number of storagesets configured to each
host bus adapter is correct.
If one or more of the storage subsystem's storagesets is not displayed, then
check the integrity of the I/O path from the host bus adapter to its storage hub
or storage switch and out to its array controllers. Do the following:
• Confirm that the KGPSA-BC device driver was properly installed. Refer
to "Installing Secure Path for Windows NT" in Chapter 7, "Installation
and Configuration for Oracle8i Release 8.1.5/8.1.6."
• Check the status LEDs on the host bus adapter for error indications.
• Verify that the Fibre Channel cable has been properly installed between
the host bus adapter and its storage hub or storage switch.
• Check the status LEDs on the storage hub or storage switch ports for
error indications.
• Verify that the Fibre Channel cables have been properly installed from
the storage hub or storage switch to both ports on the array controller.
• Use CLI to obtain status information about the array controller. Verify
that both port 1 and port 2 are active (for example, that connections are
online and that each port was negotiated with its designated FC-AL or
Fibre Channel Fabric device ID).
Devices on One I/O Connection Path Cannot Be
Seen by the Cluster Nodes
For a Redundant Fibre Channel Fabric
If the cluster node cannot see all of the storagesets that are configured to the
Fibre Channel Fabric paths, do the following:
• Verify that the array controllers on the unseen path are set for
Multibus_Failover mode. If an array controller has not been set to
Multibus_Failover mode, then the cluster nodes will only see the
storagesets that are configured to the other array controller in the storage
subsystem.
• If the array controllers have already been set to Multibus_Failover
mode, then check the integrity of the components along the unseen Fibre
Channel Fabric path. Do the following:
  - Confirm that the KGPSA-BC device driver was properly installed.
    Refer to "Installing Secure Path Under Windows NT" in Chapter 7,
    "Installation and Configuration for Oracle8i Release 8.1.5/8.1.6."
  - Check the status LEDs on the host bus adapter for error indications.
  - Verify that the Fibre Channel cable has been properly installed
    between the host bus adapter and its storage switch.
  - Check the status LEDs on the storage switch ports for error
    indications.
  - Verify that the Fibre Channel cables have been properly installed
    from the storage switch to both ports on the array controller.
  - Use the CLI to obtain status information about the array controller.
    Verify that both port 1 and port 2 on each array controller are active.
For a Redundant Fibre Channel Arbitrated Loop
If the cluster node cannot see all of the storagesets that are configured to the
FC-AL paths, do the following:
• Verify that the array controllers on the unseen path are set for
Multibus_Failover mode. If an array controller has not been set to
Multibus_Failover mode, then the cluster nodes will only see the
storagesets that are configured to the other array controller in the storage
subsystem.
• If the array controllers have already been set to Multibus_Failover
mode, then check the integrity of the components along the unseen
FC-AL path. Do the following:
  - Confirm that the KGPSA-BC device driver was properly installed.
    Refer to "Installing Secure Path Under Windows NT" in Chapter 7,
    "Installation and Configuration for Oracle8i Release 8.1.5/8.1.6."
  - Check the status LEDs on the host bus adapter for error indications.
  - Verify that the Fibre Channel cable has been properly installed
    between the host bus adapter and its storage hub.
  - Check the status LEDs on the storage hub ports for error indications.
  - Verify that the Fibre Channel cables have been properly installed
    from the storage hub to both ports on the array controller.
  - Use the CLI to obtain status information about the array controller.
    Verify that both port 1 and port 2 on each array controller are active.
Troubleshooting Secure Path
Secure Path Manager Shows Reversed Location for
Top and Bottom Array Controllers
The Secure Path Manager GUI displays the array controllers in each storage
subsystem according to their serial numbers, not according to their physical
order: the array controller with the lower serial number is displayed as
Controller A and the array controller with the higher serial number is
displayed as Controller B, regardless of their actual physical order in the
controller enclosure. If Secure Path Manager shows a controller failure, be
sure you match the displayed failed controller with the correct physical
controller. You can display a controller's serial number by using the CLI show
this_controller command.
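The labeling rule described above can be written down as a short lookup that maps the Secure Path Manager labels back to physical positions once both serial numbers are known. This is a sketch only: the function name is hypothetical, and it assumes that comparing the serial numbers as plain strings reproduces the "lower" and "higher" ordering that Secure Path Manager uses.

```python
def secure_path_labels(top_serial, bottom_serial):
    """Map Secure Path Manager labels to physical controller positions."""
    if top_serial == bottom_serial:
        raise ValueError("the two controllers must have different serials")
    # Assumption: string comparison matches Secure Path Manager's ordering.
    if top_serial < bottom_serial:
        return {"Controller A": "top", "Controller B": "bottom"}
    return {"Controller A": "bottom", "Controller B": "top"}
```

If the top controller has the higher serial number, the function reports Controller A as the bottom controller, which is the reversed display this section describes.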
Appendix A
viping Utility
The viping utility verifies that cluster nodes can communicate with each other
using the ServerNet cluster interconnect. The following shows the syntax and
option summary of the viping command.
Syntax and Option Summary
viping <node | snid> [-t] [-b size] [-v] [-?]

node | snid   Node name or ServerNet ID. The node name is the machine
              host name, or local node name. 0xF0080 is an example of a
              ServerNet ID. A node can determine its own ServerNet ID
              by using its machine host name as the operand for viping.
-t            Retry the viping command until interrupted. This option is
              useful if, for example, you are changing ServerNet cabling
              and you want viping to continue trying to establish
              communication while making cable changes.
-b size       Size of the viping buffer. The default size is 12 bytes. The
              maximum size is 256K bytes, which is useful when
              diagnosing problems with large data transfers.
-v            Enable verbose logging output. This option provides
              extended tracing information, which can be useful in
              debugging situations.
-?            Display viping command syntax and option summary.
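When viping is driven from a script, the option summary above can be turned into a small argument builder that rejects an oversized buffer before the command is run. The sketch below is illustrative: the function and constant names are hypothetical, and it reads "256K bytes" as 256 * 1024 bytes, which is an assumption.

```python
# Assumption: "256K bytes" in the option summary means 256 * 1024 bytes.
MAX_BUFFER_BYTES = 256 * 1024


def viping_command(target, retry=False, buffer_size=None, verbose=False):
    """Return the viping argument list for a node name or ServerNet ID."""
    if not target:
        raise ValueError("a node name or ServerNet ID is required")
    args = ["viping", target]
    if retry:
        args.append("-t")  # retry until interrupted
    if buffer_size is not None:
        if not 1 <= buffer_size <= MAX_BUFFER_BYTES:
            raise ValueError("buffer size must be 1..%d bytes"
                             % MAX_BUFFER_BYTES)
        args += ["-b", str(buffer_size)]
    if verbose:
        args.append("-v")  # extended tracing output
    return args
```

The returned list can be passed to a process launcher such as subprocess.run on a node where viping is installed. Omitting buffer_size leaves the default of 12 bytes in effect.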
Compaq Confidential Need to Know Required
Writer: John Blackburn Project: Parallel Database Cluster Model PDC/O5000 for Oracle 8.0.5/8.0.6 and 8.1.5/8.1.6 Administrator Guide Comments:
Part Number: 187244-001 File Name: m-appa viping Utility.doc Last Saved On: 6/28/00 5:03 PM
A-2 Parallel Database Cluster Model PDC/O5000 for Oracle 8.0.5/8.0.6 and 8.1.5/8.1.6 Administrator Guide
Example
To test ServerNet connectivity from one node to another in which the machine
host names are node1 and node2, enter the following command at node1:
C:\>viping node2
If successful, viping returns the following sample output. (0xF0080 is an
example of a ServerNet ID.)
Pinging node2 [0xF0080] with 12 bytes of data:
Reply from 0xF0080: bytes = 12 time = 12ms
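If viping output is captured to a log, the reply line shown above can be parsed to extract the ServerNet ID, buffer size, and round-trip time. This is a minimal sketch with hypothetical names; the pattern is based solely on the sample reply line in this appendix, so other viping versions may format the line differently.

```python
import re

# Pattern derived from the sample line:
#   Reply from 0xF0080: bytes = 12 time = 12ms
REPLY_RE = re.compile(
    r"Reply from (0x[0-9A-Fa-f]+):\s*bytes\s*=\s*(\d+)\s+time\s*=\s*(\d+)\s*ms"
)


def parse_viping_reply(line):
    """Return a dict for a viping reply line, or None if it is not one."""
    match = REPLY_RE.search(line)
    if match is None:
        return None
    snid, size, elapsed = match.groups()
    return {"snid": snid, "bytes": int(size), "time_ms": int(elapsed)}
```

Lines that are not replies, such as the error messages in Table A-1, return None and can be handled separately.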
Error Diagnostics
The following table shows the errors viping can produce, possible causes, and
what action to take to correct the problem.
Table A-1
viping Errors
Error                           Possible Cause/Corrective Action

Generic system resource         The system might be running low on memory or
error.                          system resources. Try the operation again. If the
                                problem persists, restart the system.

Generic vi internal error.      Check ServerNet hardware connections. If the
                                problem persists, restart the system or uninstall
                                and reinstall ServerNet OSDs.

Data integrity check failed.    Potential data corruption has been detected. Check
                                ServerNet hardware connections. If the problem
                                persists, restart the system or uninstall and
                                reinstall ServerNet OSDs.

Connection to remote node       A communications error with another node has
timed out.                      occurred. Check ServerNet hardware connections and
                                verify the ServerNet ID for the remote node. If the
                                problem persists, restart the system or uninstall
                                and reinstall ServerNet OSDs.

Request timed out.              A communications error with another node has
                                occurred. Check ServerNet hardware connections. If
                                the problem persists, restart the system or
                                uninstall and reinstall ServerNet OSDs.

Unrecognized parameter -a.      The format of viping does not conform to normal
                                usage. Correct the command syntax and try again.

Multiple node names             More than one node name has been specified. Correct
defined node and node.          the command syntax and try again.

Unable to determine SNID        viping could not communicate with the remote node
for remote node node.           using the node name. Try again using the ServerNet
                                ID as the operand for viping.
Appendix B
Diagnosing and Resolving Shared Disk
Problems With the MA8000/EMA12000
Storage Subsystem
Introduction
When one or more nodes in a Compaq Parallel Database Cluster Model
PDC/O5000 cannot properly join the cluster, you need to consider two
possible causes:
• A problem related to the shared disk storage in the cluster's
MA8000/EMA12000 Storage Subsystems is preventing one or more
nodes from accurately viewing or successfully accessing the available
shared storage resources.
• A connectivity problem exists either within the cluster interconnect or
the client LAN, preventing one or more nodes from communicating with
the other nodes or preventing a client from communicating with the
cluster.
This appendix describes the procedures you should follow to diagnose and
resolve suspected shared disk problems in the PDC/O5000.
Figure B-1 identifies, in a flowchart format, the sequence of tasks you should
perform to diagnose, isolate, and resolve shared disk problems in the
PDC/O5000. A detailed description of each task is provided in the following
sections of this appendix.
Compaq Confidential Need to Know Required
Writer: John Blackburn Project: Parallel Database Cluster Model PDC/O5000 for Oracle 8.0.5/8.0.6 and 8.1.5/8.1.6 Administrator Guide Comments:
Part Number: 187244-001 File Name: n-appb Diagnosing Shared Storage Problems.doc Last Saved On: 6/28/00 5:04 PM