olrad(1M) olrad(1M)
NAME
olrad - command for OnLine Addition/Replacement/Deletion of PCI I/O cards and Online Addition/Deletion of I/O chassis
SYNOPSIS
Adding Card Commands
Replacing Card Commands
Deleting Card Commands
I/O Chassis Add Command
I/O Chassis Delete Command
Other Commands
slot_id
|slot_hw_path
flag slot_id
slot_id
DESCRIPTION
The olrad command provides the ability to perform On-Line Addition, Replacement and Deletion of I/O cards.
olrad performs Critical Resource Analysis (CRA) of the system before performing any OLA/R/D operation. This is to ensure that the system is not
left in an inconsistent state after a PCI card is added/replaced/deleted.
The olrad command also provides the ability to perform On-Line Addition and Deletion of an I/O chassis associated with a Cell.
Only users with root privileges may use this command.
On systems with the capability to handle certain PCI hardware errors during the operation of PCI I/O cards, the olrad command provides the option
to attempt recovery from such errors. The availability of this feature is dependent on the platform and operating system environment, as
described in the at
Arguments
The following arguments are used with the olrad command.
slot_id Slot ID of an OLA/R/D capable slot. A slot ID is a list of one or more numbers separated by dashes. Each number
represents a component of the physical location of the slot. The user can use the slot ID to locate the slot. The sequence
of numbers in the slot ID is platform dependent. On some platforms, the slot ID contains only the slot number. On
other platforms, including Superdome, the format of the slot ID is:
Cabinet#-Bay#-Chassis#-Slot#
slot_hw_path Hardware path of an OLA/R/D capable slot.
interface_hw_path Hardware path of an interface under an OLA/R/D capable slot.
device_hw_path Any hardware path under an OLA/R/D capable slot.
cell_hw_path Hardware path of a Cell in the system. The user can use the command to find the hardware path. The cell_hw_path is also
equivalent to the global slot number as used with the command.
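The slot ID format described above can be illustrated with a small parsing sketch. It is not part of olrad; the function and type names (parse_slot_id, as_superdome_slot, SuperdomeSlot) are hypothetical, and the sample slot IDs are made up for illustration.

```python
from typing import List, NamedTuple


class SuperdomeSlot(NamedTuple):
    """Components of a Cabinet#-Bay#-Chassis#-Slot# slot ID (hypothetical type)."""
    cabinet: int
    bay: int
    chassis: int
    slot: int


def parse_slot_id(slot_id: str) -> List[int]:
    """Split a dash-separated slot ID into its numeric components."""
    parts = slot_id.strip().split("-")
    # Every component must be a plain decimal number; reject empty pieces.
    if any(not (p.isascii() and p.isdigit()) for p in parts):
        raise ValueError(f"not a valid slot ID: {slot_id!r}")
    return [int(p) for p in parts]


def as_superdome_slot(slot_id: str) -> SuperdomeSlot:
    """Interpret a slot ID using the four-component format quoted above."""
    numbers = parse_slot_id(slot_id)
    if len(numbers) != 4:
        raise ValueError(f"expected Cabinet#-Bay#-Chassis#-Slot#, got {slot_id!r}")
    return SuperdomeSlot(*numbers)
```

Because the sequence of numbers is platform dependent, a script would normally use parse_slot_id and apply the four-field interpretation only on platforms known to use it.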
Options
The following options are supported.
Post add phase. The slot power is turned ON, the drivers associated with
all affected slots are resumed. Then is run and if the card is claimed, the driver scripts, for the current slot and for
affected slots (if any), are run and the attention LED at the corresponding slot is turned OFF.
Configures the I/O components associated with the specified Cell.
This operation is required, because when a Cell is added to the system, the attached I/O components are not configured in
by default, so they have to be explicitly configured using this option. See the section below.
NOTE: The Cell identifier specified as an argument to the and commands has different formats. For the command, the Cell
identifier is specified either in the global slot number format or in the local (cabinet#/slot#) format. For the command,
the Cell identifier is specified in the global slot number format only. The global slot number of the Cell is also
equivalent to the hardware path of the Cell as displayed by the command. Refer to parolrad(1M) for more details about
the different formats for specifying a Cell.
Prepare to add a card to the system at the specified slot.
Critical Resource Analysis (CRA) is run to ensure that the current card addition onto the system will not cause disrup-
tion to the overall system operation. The driver scripts and for affected slots (if any) are run and the drivers associ-
ated with the affected slots are suspended. The slot power is turned OFF, and the attention LED at the corresponding
slot is set to BLINK mode.
If the option is specified, it overrides the Critical Resource Analysis (CRA) results. See the description for the option.
Runs Critical Resource Analysis (CRA) routine only on the specified
slot_id and displays the results. It checks for critical resources on all affected hardware paths associated with the
specified slot. It analyzes file systems, volumes, processes, networking, swap, dump and generates a report of affected
resources. It lists the severity levels and the meanings for each.
CRA_SUCCESS          No affected resources in use.
CRA_WARNINGS         Resources in use on affected device(s), but none are deemed critical.
CRA_DATA_CRITICAL    Probable data loss; only proceed with the user's permission.
CRA_SYS_CRITICAL     Likely to bring down the user's system.
CRA_FAILURE          Some internal CRA error encountered.
Users are advised to use this option first to check whether the intended OL* operation is safe and will not disrupt
the functioning of the system.
Displays the device information (Device_ID, Vendor_ID, Revision_ID, etc)
of all the interface devices under the specified slot. Output fields are detailed below; some descriptions are platform
dependent.
The fields Device_ID, Vendor ID, Subsystem ID, Subsystem_Vendor_ID, Revision_ID, Class, Status, Command deal with the
identification of the interface as per the PCI specification. The values for these fields are displayed in hexadecimal.
displays the hardware path of the particular Interface device being displayed.
displays the interface driver name that claimed the interface.
displays the PCI Device ID that identifies a particular interface and is allocated by the vendor.
displays the PCI Vendor ID that identifies the manufacturer of the device.
and display PCI Subsystem ID and PCI Subsystem Vendor ID which uniquely identify the various interface cards manufactured
by the same vendor.
specifies an interface specific revision identifier. The value for this field is chosen by the vendor.
displays the PCI Class that identifies the generic function of the interface.
displays the content of Status register associated with the interface.
displays the content of the Command register associated with the interface.
displays "Yes" or "No" depending on whether there are multiple interfaces of the same kind under the slot.
(For instance, Multi-func will be set to "Yes" if there are 2 SCSI ports on the I/O card, and "No" if there is a single
SCSI port and a single Ethernet port on the card.)
displays "Yes" or "No" depending on whether the device is a PCI-to-PCI bridge device.
displays "Yes" or "No" depending on whether the interface is capable of operating at 66 MHz frequency.
displays the power consumption of the device in units of 0.1 Watts or N/A (Not Available).
(For instance, if the field displays a value of 150, then the power consumption of the interface is 150 x 0.1 = 15
Watts).
displays all the bus frequencies at which the interface is capable of running.
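The unit conversion for the power consumption field can be sketched as follows. The helper name power_in_watts is hypothetical, and the sketch assumes the field is printed as a decimal integer (as in the 150 example above) or as the literal N/A.

```python
from typing import Optional


def power_in_watts(field: str) -> Optional[float]:
    """Convert a power field given in units of 0.1 Watts into Watts.

    Returns None when the field reads N/A (Not Available).
    """
    text = field.strip()
    if text.upper() == "N/A":
        return None
    # Assumption: the value is a decimal integer count of 0.1 W units.
    return int(text) / 10


print(power_in_watts("150"))  # 15.0
```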
This performs the post delete operation. This should always be
performed after an operation to complete the delete operation of a card at the slot.
Chassis Delete operation. De-Configures all the I/O interfaces
under the Cell specified by its cell_hw_path. A (cumulative) Critical Resource Analysis (CRA) is run to ensure that the
command execution will not cause any disruption to the system operation. The driver scripts and are run prior to and
after de-configuring an I/O interface respectively. Only an I/O Chassis associated with a floating Cell can be deconfig-
ured online. See the section below.
NOTE: The Cell identifier specified as an argument to the and commands has different formats. For the command, the Cell
identifier is specified either in the global slot number format or in the local (cabinet#/slot#) format. For the command,
the Cell identifier is specified in the global slot number format only. The global slot number of the Cell is also
equivalent to the hardware path of the Cell as displayed by the command. Refer to parolrad(1M) for more details about
the different formats for specifying a Cell.
If the option is specified, it overrides the Critical Resource Analysis (CRA) results. See the description for the option.
This operation may have to be performed before a Cell can be deleted from the system. See the description for the option
and the section below.
Delete a card on the system at the specified slot.
Critical Resource Analysis is run to ensure that the current card removal on the system will not cause disruption to the
system operation. The driver script associated with the current slot is run prior to the deletion. The target slot is
powered off and the driver instances and associated data structures are removed. The attention LED is set to BLINK at
the corresponding slot when the operation is in progress. When the operation completes, the driver scripts are run.
If the option is specified, it overrides the Critical Resource Analysis (CRA) results. See the description for the option.
Re-attaches the driver module to the attach chain.
This command should only be run if a previous operation failed, so as to not leave the driver in detached state. The
driver name should correspond to the name shown in the output.
This option is provided for driver developers only. It will not work as a standalone command and can only be invoked
from the "DLKM (Dynamically Loadable Kernel Module)" context. Refer to the (DDG) available at for more details on DLKM.
Detach the driver from the attach chain and delete all the active
interfaces claimed by the specified driver module. If this command fails, should be executed to re-attach the driver.
The driver name should correspond to the name shown in the output. Critical Resource Analysis is run to ensure that the
removal of the driver module will not cause any disruption to the system operation.
This option is provided for driver developers only. It will not work as a standalone command and can only be invoked
from the "DLKM (Dynamically Loadable Kernel Module)" context. Refer to the (DDG) available at for more details on DLKM.
Lists the affected slot IDs for the specified slot.
Displays the output in machine readable format.
It can be used with the following options: and
The option, if specified, overrides the "data critical" errors returned by CRA. It is important to note that will not allow
"system critical" errors to be overridden and that automatically overrides "warnings".
Irrespective of whether is specified or not, Critical Resource Analysis (CRA) routines are run before an OLA/R/D opera-
tion, to ensure that the current OLA/R/D operation does not interrupt the normal operation of the system; in other words,
to identify "critical" errors.
The "data critical" errors are typically not critical to the system, but they may be critical to the user. Hence, the
user needs to decide whether or not to use the option for overriding these types of errors.
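The override rules described above can be summarized in a short decision sketch. This is a model of the documented behavior, not olrad code: the names CraSeverity and may_proceed are hypothetical. Two points are assumptions rather than statements from this page: that warnings never block an operation regardless of the override option, and that an internal CRA failure always blocks it.

```python
from enum import Enum


class CraSeverity(Enum):
    """CRA severity levels as named in this page (enum values are arbitrary)."""
    CRA_SUCCESS = "success"
    CRA_WARNINGS = "warnings"
    CRA_DATA_CRITICAL = "data critical"
    CRA_SYS_CRITICAL = "system critical"
    CRA_FAILURE = "failure"


def may_proceed(severity: CraSeverity, force: bool = False) -> bool:
    """Decide whether an OLA/R/D operation would go ahead after CRA."""
    if severity is CraSeverity.CRA_SUCCESS:
        return True
    if severity is CraSeverity.CRA_WARNINGS:
        # Assumption: warnings are overridden automatically.
        return True
    if severity is CraSeverity.CRA_DATA_CRITICAL:
        # Only the user can accept probable data loss, via the override option.
        return force
    # "System critical" can never be overridden; CRA_FAILURE is assumed to block.
    return False
```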
Displays the slot ID for the specified device or interface hardware path.
Displays the hardware paths of the interface node(s) for the specified
slot.
Controls the state of the Attention LED for the given slot. The valid
values for this flag option are: (LED blinking) and Based on the flag value, the slot Attention LED is set to the appro-
priate state. The flags are not case-sensitive.
Verifies that all the I/O interfaces under the specified Cell are inactive
and have been de-configured from the Cell. (This is a prerequisite for performing a Cell-OnLine Delete or Cell-OLD
operation.) Refer to parolrad(1M) for more details regarding the Cell-OLD operation.
Displays the number of OLA/R/D capable slots in the system.
Controls the state of the
power indicator. Currently, the only valid value for this flag option is: The option can be used with to set the power
indicator to follow the specified slot's power state; in other words, the power indicator is turned solid ON if the slot
power is ON, or the power indicator is turned OFF, if the slot power is OFF. The flag is not case sensitive.
This option has been obsoleted in
Displays the status of all OLA/R/D capable slots in the system.
In the output, slots with the same bus number share the same PCI Bus. Output fields are detailed below; some descrip-
tions are platform dependent. N/A means Not Applicable.
On systems with OLA/R/D capable PCI-Express slots, the output fields vary slightly. See the section below for
detailed descriptions of the fields displayed for such slots.
displays the slot_id.
displays the slot_hw_path.
identifies the I/O Bus corresponding to the slot.
displays the maximum operating speed of the PCI Bus attached to the slot.
displays the current operating speed of the PCI Bus attached to the slot. The card inserted into the slot determines the
current operating speed, together with the capability of the slot's PCI Bus.
displays the slot power status.
displays whether the slot is occupied or not.
displays if the card in the slot is suspended or not.
displays the OL* capability of the interface driver/s that claimed the PCI device/s present in the slot. field displays
whether the interface driver/s are capable of OnLine Add/Replace operations. field displays whether the interface
driver/s are capable of OnLine Deletion operation.
displays the maximum operating mode of the PCI Bus attached to the slot.
displays the current operating mode of the PCI Bus attached to the slot. The card inserted into the slot determines the
current operating mode, together with the capability of the slot's PCI Bus. PCI and PCI-X are examples of different
operating modes.
On systems with OLA/R/D capable PCI-Express slots, the output fields vary slightly. The fields displayed for such
slots are described below:
o (Expressed in Gigabits / Second) indicates the maximum link speed possible for the PCI-Express Link at the slot.
o (Expressed in Gigabits / Second) indicates the negotiated link speed of the PCI-Express Link at the slot.
o indicates the maximum link width supported by the PCI-Express link at the slot.
For example: means the maximum link width supported by a PCI-Express Link at the slot is 8 lanes.
o indicates the negotiated width of the PCI-Express Link at the slot.
o indicates the current operating mode of the slot. For PCI-Express slots mode is displayed as "PCIe".
Post Replace phase. The target slot power is turned ON. The suspended
drivers are resumed and the driver scripts for the current slot and the affected slots (if any) are run. The attention
LED at the corresponding slot is set to OFF.
On systems with the capability to handle certain PCI hardware errors during the operation of PCI I/O cards, the post
replace phase can be used to attempt recovery of the PCI card and corresponding I/O slot from such errors.
Prepare to replace a card on the system at the specified slot.
Critical Resource Analysis (CRA) is run to ensure that the current card replacement on the system will not cause disrup-
tion in the functioning of the system. The driver scripts and for the affected slots (if any) and the current slot are
run. The drivers associated with the current slot and affected slots are suspended. The target slot is powered off and
the attention LED is set to BLINK at the corresponding slot.
If the option is specified, it overrides the Critical Resource Analysis (CRA) results. See the description for the option.
Displays driver information, such as current state, time-out, and so on.
Output fields are detailed below.
displays the interface driver name.
displays the interface driver state. State will be RUNNING if the driver is active. State will be SUSPENDED if the
driver is suspended. When the driver is in a transition state (say from RUNNING state to SUSPENDED state), this field
will indicate a state change in progress. For the rare occurrence of any internal errors during a driver state transi-
tion, this field will indicate an operation timed out status.
displays the approximate time required to suspend the interface driver. The value displayed accounts for worst case sce-
narios, and the time taken would normally be less than this.
displays the approximate time required to resume the interface driver. The value displayed accounts for worst case sce-
narios, and the time taken would normally be less than this.
displays the approximate time required to delete the driver instance. The value displayed accounts for worst case sce-
narios, and the time taken would normally be less than this. This field will be valid only if the target operating envi-
ronment supports OnLine Deletion.
field is for future enhancements.
During the On-Line Replace operation of a card at a slot, olrad runs and driver scripts during the pre-replace of the card and driver script in
the post-OLR phase.
During the On-Line Addition operation of a card at the slot, olrad runs the driver script in the post add phase. Note that there are no and
driver scripts for OLA.
During the On-Line Delete operation of a card at the slot, olrad runs and the driver scripts associated with the card at the slot.
For a given OL* operation on a slot, driver scripts are always run for all the affected slots (that is, slots sharing the same power
domain).
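The notion of affected slots (slots sharing a power domain with the target slot) can be sketched as a simple lookup. The function name affected_slots, the mapping, and the slot IDs and domain labels in the usage example are all hypothetical; olrad derives this information from the platform itself.

```python
from typing import Dict, List


def affected_slots(target: str, power_domain_of: Dict[str, str]) -> List[str]:
    """Return the other slots that share the target slot's power domain."""
    domain = power_domain_of[target]
    return sorted(
        slot
        for slot, slot_domain in power_domain_of.items()
        if slot_domain == domain and slot != target
    )


# Hypothetical layout: two slots on domain "A", one on domain "B".
layout = {"0-1-3-1": "A", "0-1-3-2": "A", "0-1-3-3": "B"}
print(affected_slots("0-1-3-1", layout))  # ['0-1-3-2']
```

Driver scripts would then be run for the target slot and for every slot in the returned list.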
An audit trail is logged onto log file whenever an OLA/OLR/OLD operation is initiated (see nettl(1M)). This information is also written to
standard output.
PCI Error Handling
Some systems have the capability to handle certain PCI hardware errors during the operation of PCI I/O cards. When such errors occur, the
operating system will automatically try and recover from the error. However, on certain occasions the system cannot recover from the error
automatically. In this scenario, the software states of the components in error will be marked ERROR in output. If this occurs, the fol-
lowing sequence can be tried from the command to attempt a manual recovery at the slot:
1. If the slot is not already suspended, suspend it using:
2. Try a post replace operation at the slot using:
If the card/slot is recovered from the error and the post replace operation succeeds, software states of the components recovered from
the error will be restored to CLAIMED in output. If the post replace operation fails and the error persists, one of the reasons could
be that the card has gone bad. The card in error can be replaced with another card of the same type, and a post replace operation can
be tried with the replaced card.
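The order of the manual recovery steps above can be sketched as a small driver function. It deliberately takes the actual operations as callables, because the exact commands depend on the platform; the names attempt_manual_recovery, is_suspended, suspend and post_replace are hypothetical placeholders, not olrad interfaces.

```python
from typing import Callable


def attempt_manual_recovery(
    slot_id: str,
    is_suspended: Callable[[str], bool],
    suspend: Callable[[str], None],
    post_replace: Callable[[str], bool],
) -> bool:
    """Run the two-step recovery sequence and report whether it succeeded."""
    # Step 1: suspend the slot only if it is not already suspended.
    if not is_suspended(slot_id):
        suspend(slot_id)
    # Step 2: try the post replace operation at the slot.
    return bool(post_replace(slot_id))
```

A False result corresponds to the case where the error persists. The card may then be replaced with another card of the same type and the function called again.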
A complete description of PCI Error Handling is not covered here. For more details refer to documents on available under the section at
the website. Note that the sequence mentioned here for PCI Error Handling is generic. It is subject to change depending on the
platform and operating system release.
Logging
olrad uses the subsystem to log errors and an audit trail for all OLA/R/D operations performed on slots. See nettl(1M).
olrad makes use of the subsystem formatter to format the log messages.
The following details are not logged:
o CRA report when performing OLA/R/D,
o CRA report when using the option,
o Output of view information options such as and
RETURN VALUE
olrad returns the CRA return values when invoked with the (cra-only) option. The valid values are as follows:
For all other options, olrad returns the following:
Successful completion.
Failure. olrad also logs a message to the NetTL log file and to standard error.
EXAMPLES
Adding a New Card
1. Get the information about all the OLA/R/D capable slots. Make note of the slot_id field:
2. Prepare to add:
3. Physically insert the card into the slot.
4. Post add:
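The effect of the add sequence on the slot can be modeled with a small state sketch. It only mirrors the power and attention LED transitions described for the prepare and post add phases; it does not invoke olrad. The Slot class and function names are hypothetical, and the precondition checks (an empty slot before prepare, power OFF before insertion) are assumptions made for the model.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Minimal model of an OLA/R/D capable slot (hypothetical)."""
    power_on: bool = True
    attention_led: str = "OFF"
    occupied: bool = False


def prepare_add(slot: Slot) -> None:
    """Prepare phase: slot power OFF, attention LED set to BLINK."""
    if slot.occupied:
        raise RuntimeError("slot already holds a card")
    slot.power_on = False
    slot.attention_led = "BLINK"


def insert_card(slot: Slot) -> None:
    """Physical insertion; assumed to be safe only while slot power is OFF."""
    if slot.power_on:
        raise RuntimeError("run the prepare phase before inserting a card")
    slot.occupied = True


def post_add(slot: Slot) -> None:
    """Post add phase: slot power ON, attention LED turned OFF."""
    if not slot.occupied:
        raise RuntimeError("no card in slot")
    slot.power_on = True
    slot.attention_led = "OFF"
```

Calling the three functions in the order of steps 2 to 4 leaves the slot powered ON, occupied, and with the attention LED OFF.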
Replacing a Card
1. Get information about all the OLA/R/D capable slots. Make note of the slot_id field:
2. Prepare to replace:
3. Replace the faulty card in the slot with a working card. The new card must be identical to the card being replaced.
4. Post Replace:
Deleting a Card
1. Get information about all the OLA/R/D capable slots. Make note of the slot_id field:
2. Delete the card:
3. Post Delete:
WARNINGS
Any changes made to an I/O chassis configuration by adding or removing I/O cards while the I/O chassis is in an inactive state will not
take effect by default. An I/O chassis can be in an inactive state either because it is connected to an inactive cell or because it has been
deconfigured using the chassis delete operation. The supported procedure for inserting or removing PCI I/O cards in such an inactive I/O
chassis is as follows:
1. Power down the cell attached to the I/O chassis using or the MP.
2. Add or remove I/O cards to I/O chassis, using latches to disable slot power if necessary.
3. Power up the attached cell using or the MP.
FILES
log file containing
audit trail and errors.
SEE ALSO
ioscan(1M), netfmt(1M), nettl(1M), parolrad(1M).
and available under the section at