Storwize V3700 and Power8 (S822) AIX: configuration best practice for LUNs?
Hello,
We have a Power8 system (S822) and an IBM Storwize V3700 SAN.
The OS is AIX 7.1.
From what I read, this hardware needs the special SDDPCM drivers, so I downloaded and installed them (SDDPCM version 2.6.6.0, fileset devices.sddpcm.71.rte).
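In case it helps others, this is roughly how I verified the install (the host attachment fileset name is what I believe IBM ships alongside SDDPCM on AIX; double-check it for your level):
Code:
# lslpp -l devices.sddpcm.71.rte           # SDDPCM fileset installed?
# lslpp -l devices.fcp.disk.ibm.mpio.rte   # SDDPCM host attachment fileset (verify name for your level)
# pcmpath query version                    # running SDDPCM level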
I carved my volumes on the Storwize and presented them to my single host (the storage is directly connected to the host via FC). There is no virtualization of any kind; it's a single bare-metal AIX install.
The whole process was fairly easy: carve the RAID 10 LUNs, map them to the host, then on the host (after installing the SDDPCM drivers) run "cfgmgr" so the LUNs get detected; from there I created the VGs and JFS2 filesystems.
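For reference, the host side boiled down to something like this (a sketch only; the filesystem size is an example, and a LUN this large may need the mkvg -t factor or a scalable VG):
Code:
# cfgmgr                          # discover the newly mapped LUNs
# lsdev -Cc disk                  # the V3700 LUNs show up as 2145 MPIO disks
# mkvg -y usr1 -s 512 hdisk4      # VG with 512 MB PPs
# crfs -v jfs2 -g usr1 -m /usr1 -A yes -a size=2300G
# mount /usr1
Here is one of the resulting LUNs: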
Code:
# lspv hdisk4
PHYSICAL VOLUME:    hdisk4                   VOLUME GROUP:     usr1
PV IDENTIFIER:      00f9af942979fd5c         VG IDENTIFIER     00f9af9400004c000000014b2979fdc3
PV STATE:           active
STALE PARTITIONS:   0                        ALLOCATABLE:      yes
PP SIZE:            512 megabyte(s)          LOGICAL VOLUMES:  1
TOTAL PPs:          4607 (2358784 megabytes) VG DESCRIPTORS:   2
FREE PPs:           0 (0 megabytes)          HOT SPARE:        no
USED PPs:           4607 (2358784 megabytes) MAX REQUEST:      256 kilobytes
FREE DISTRIBUTION:  00..00..00..00..00
USED DISTRIBUTION:  922..921..921..921..922
MIRROR POOL:        None
# lsvg usr1
VOLUME GROUP:       usr1                     VG IDENTIFIER:  00f9af9400004c000000014b2979fdc3
VG STATE:           active                   PP SIZE:        512 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      4607 (2358784 megabytes)
MAX LVs:            256                      FREE PPs:       0 (0 megabytes)
LVs:                1                        USED PPs:       4607 (2358784 megabytes)
OPEN LVs:           1                        QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     30480
MAX PPs per PV:     5080                     MAX PVs:        6
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none                     INFINITE RETRY: no
DISK BLOCK SIZE:    512                      CRITICAL VG:    no
Now if I look at the attributes of my "hdisk4" LUN:
Code:
# lsattr -El hdisk4
PCM             PCM/friend/sddpcm                                    PCM                                      True
PR_key_value    none                                                 Reserve Key                              True
algorithm       load_balance                                         Algorithm                                True
clr_q           no                                                   Device CLEARS its Queue on error         True
dist_err_pcnt   0                                                    Distributed Error Percentage             True
dist_tw_width   50                                                   Distributed Error Sample Time            True
flashcpy_tgtvol no                                                   Flashcopy Target Lun                     False
hcheck_interval 60                                                   Health Check Interval                    True
hcheck_mode     nonactive                                            Health Check Mode                        True
location                                                             Location Label                           True
lun_id          0x2000000000000                                      Logical Unit Number ID                   False
lun_reset_spt   yes                                                  Support SCSI LUN reset                   True
max_coalesce    0x40000                                              Maximum COALESCE size                    True
max_transfer    0x40000                                              Maximum TRANSFER Size                    True
node_name       0x500507680302218c                                   FC Node Name                             False
pvid            00f9af942979fd5c0000000000000000                     Physical volume identifier               False
q_err           yes                                                  Use QERR bit                             True
q_type          simple                                               Queuing TYPE                             True
qfull_dly       2                                                    delay in seconds for SCSI TASK SET FULL  True
queue_depth     20                                                   Queue DEPTH                              True
recoverDEDpath  no                                                   Recover DED Failed Path                  True
reserve_policy  no_reserve                                           Reserve Policy                           True
retry_timeout   120                                                  Retry Timeout                            True
rw_timeout      60                                                   READ/WRITE time out value                True
scbsy_dly       20                                                   delay in seconds for SCSI BUSY           True
scsi_id         0xab0100                                             SCSI ID                                  False
start_timeout   180                                                  START unit time out value                True
svc_sb_ttl      0                                                    IO Time to Live                          True
timeout_policy  fail_path                                            Timeout Policy                           True
unique_id       332136005076300810886380000000000000B04214503IBMfcp Device Unique Identification             False
ww_name         0x500507680306218c                                   FC World Wide Name                       False
# lspv -l hdisk4
hdisk4:
LV NAME   LPs    PPs    DISTRIBUTION             MOUNT POINT
fslv00    4607   4607   922..921..921..921..922  /usr1
Some pcmpath data:
Code:
# pcmpath query adapter
Total Dual Active and Active/Asymmetric Adapters : 2

Adpt#  Name    State   Mode    Select     Errors  Paths  Active
    0  fscsi2  NORMAL  ACTIVE  429216571       0      6       6
    1  fscsi3  NORMAL  ACTIVE   70490560       0      6       6

# pcmpath query device
Total Dual Active and Active/Asymmetric Devices : 6

DEV#:   2  DEVICE NAME: hdisk2  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 60050763008108863800000000000006
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0        fscsi2/path0  OPEN    NORMAL       71681       0
    1*       fscsi3/path1  OPEN    NORMAL          98       0

DEV#:   3  DEVICE NAME: hdisk3  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 60050763008108863800000000000007
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0*       fscsi2/path0  OPEN    NORMAL          56       0
    1        fscsi3/path1  OPEN    NORMAL       10228       0

DEV#:   4  DEVICE NAME: hdisk4  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 6005076300810886380000000000000B
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0*       fscsi2/path0  OPEN    NORMAL          78       0
    1        fscsi3/path1  OPEN    NORMAL    49789138       0

DEV#:   5  DEVICE NAME: hdisk5  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 6005076300810886380000000000000C
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0        fscsi2/path0  OPEN    NORMAL     1302771       0
    1*       fscsi3/path1  OPEN    NORMAL          70       0

DEV#:   6  DEVICE NAME: hdisk6  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 6005076300810886380000000000000D
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0*       fscsi2/path0  OPEN    NORMAL          63       0
    1        fscsi3/path1  OPEN    NORMAL    20690963       0

DEV#:   7  DEVICE NAME: hdisk7  TYPE: 2145  ALGORITHM: Load Balance
SERIAL: 6005076300810886380000000000000E
==========================================================================
Path#   Adapter/Path Name  State   Mode        Select  Errors
    0        fscsi2/path0  OPEN    NORMAL   427841922       0
    1*       fscsi3/path1  OPEN    NORMAL          63       0
My question is about the "algorithm" attribute, which currently (by default) is set to load_balance.
The host (S822) has a dual FC path to each canister on the V3700. The additional enclosures on the V3700 are also dual-pathed via SAS to each canister. The idea is that we can lose one path and continue to operate without downtime.
Now, if algorithm = load_balance and a single path fails (let's say I unplug one of the FC cables), what will happen? Will it switch into failover mode, or must I set the algorithm to "fail_over" from the start?
If the algorithm is "fail_over", is only one path active with the other on standby?
My goal is to have full redundancy.
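For reference, if fail_over turned out to be the right setting, my understanding is it would be changed roughly like this (the disk must be closed for an immediate change, otherwise -P defers it to the next reboot; the pcmpath set syntax is per the SDDPCM manual as I read it):
Code:
# lsattr -El hdisk4 -a algorithm             # confirm current algorithm
# chdev -l hdisk4 -a algorithm=fail_over -P  # deferred change, applied at reboot
# pcmpath set device 4 algorithm fo          # on-the-fly change while the disk is open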
Any experts able to give some feedback?
---------- Post updated at 06:31 PM ---------- Previous update was at 06:20 PM ----------
Oh, I almost forgot to mention: rootvg is not on the SAN; the host has two internal disks that I've set up for rootvg.
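(The internal disks were set up the standard way; a sketch, with hdisk0/hdisk1 as example names:)
Code:
# extendvg rootvg hdisk1             # add the second internal disk
# mirrorvg rootvg hdisk1             # mirror all rootvg LVs onto it
# bosboot -ad /dev/hdisk1            # make the mirror bootable
# bootlist -m normal hdisk0 hdisk1   # allow booting from either disk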
Moderator's Comments:
edit by bakunin: this is hardly "Unix for Dummies" and it is very AIX-specific. I transfer this thread over to the specialized AIX board.
---------- Post updated 03-04-15 at 12:50 PM ---------- Previous update was 03-03-15 at 06:31 PM ----------
It seems that, according to the article Guide to selecting a multipathing path control module for AIX or VIOS, I can leave this as "load_balance" and lose a path: the PCM auto-detects dead paths and recovers them when they come back up, without user interaction. Time to test.
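My rough test plan (the pcmpath set commands are per the SDDPCM manual; adapter 0 is fscsi2 in my output above):
Code:
# pcmpath query adapter           # baseline: both adapters NORMAL/ACTIVE
# pcmpath set adapter 0 offline   # simulate losing fscsi2 (or just pull the cable)
# pcmpath query device            # fscsi2 paths should fail; I/O continues on fscsi3
# errpt | more                    # expect path failure entries
# pcmpath set adapter 0 online    # bring the path back
# pcmpath query adapter           # verify both adapters are NORMAL/ACTIVE again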
Just to clarify, since this is not OS-specific: load_balance is an algorithm used to distribute I/O requests to the disks across the FC paths.
It keeps track of which path has the least load and issues the I/O over that path.
If only one path is active due to a fault, no balancing algorithm is applied, since it is not needed: there is only one path left to issue I/O commands over.
Given the mess SDDPCM plus its host attachment script can make when they're forgotten during AIX updates, or especially during AIX migrations, and given that the SDDPCM code is already being integrated into AIX native MPIO, I'd have to think long and hard about whether it's even worth installing these days.
Either way, native AIX MPIO and SDDPCM cope with lost paths in much the same way as each other; as long as there is an available path, you should have nothing to worry about.
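If you want to see which PCM is driving the 2145 devices, or move them to the native AIX PCM later, there's a tool for that; a sketch, assuming a 7.1 level where manage_disk_drivers knows the SVC/Storwize family (listed as IBMSVC, as I recall):
Code:
# lsattr -El hdisk4 -a PCM                    # shows PCM/friend/sddpcm while SDDPCM owns the disk
# manage_disk_drivers -l                      # list device families and their current driver
# manage_disk_drivers -d IBMSVC -o AIX_AAPCM  # switch to the native AIX PCM (reboot required)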
Oh, and look into things like dynamic tracking and fast fail for the fibre channel interface attributes, and the way the two interact: IBM Knowledge Center
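The usual way to set those looks like this (a sketch; -P defers the change to the next reboot, since the adapters are in use):
Code:
# lsattr -El fscsi2 -a dyntrk -a fc_err_recov                  # current values
# chdev -l fscsi2 -a dyntrk=yes -a fc_err_recov=fast_fail -P   # apply at next reboot
# chdev -l fscsi3 -a dyntrk=yes -a fc_err_recov=fast_fail -P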
Lots to read and several decisions to make; please feel free to keep discussing things here if anything is unclear. I appreciate it's a bit of a minefield when you are new to this arena.
Thanks for this. I was under some time pressure, so I ended up going with the SDDPCM drivers for the SAN hdisks as suggested by IBM TS, and things are looking good so far. I left the other parameters, such as the load balancing policy, at their defaults.