01-26-2019
IBM AIX I/O Performance Tuning suggestions?
I have an IBM Power9 server attached to an NVMe-based Storwize V7000 Gen3 storage system. While running some benchmarks, I noticed that single-threaded I/O (80% read / 20% write, a common OLTP I/O profile) seems slow.
Code :
./xdisk -R0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 R -D 7177 56.1 0.090 2.58 0.118 0.116 0.001 2.97 0.216 0.212
Are there parameters in AIX we can tune to push the IO/s and MB/s higher?
STORWIZE V7000 GEN3
IBM Power9
I made sure that the V7000, which presents as an IBMSVC device, is using the recommended AIX_AAPCM driver. A 1 TB volume (hdisk2) is mapped and formatted as a JFS2 file system.
Code :
# manage_disk_drivers -l
Device Present Driver Driver Options
2810XIV AIX_AAPCM AIX_AAPCM,AIX_non_MPIO
DS4100 AIX_APPCM AIX_APPCM
DS4200 AIX_APPCM AIX_APPCM
DS4300 AIX_APPCM AIX_APPCM
DS4500 AIX_APPCM AIX_APPCM
DS4700 AIX_APPCM AIX_APPCM
DS4800 AIX_APPCM AIX_APPCM
DS3950 AIX_APPCM AIX_APPCM
DS5020 AIX_APPCM AIX_APPCM
DCS3700 AIX_APPCM AIX_APPCM
DCS3860 AIX_APPCM AIX_APPCM
DS5100/DS5300 AIX_APPCM AIX_APPCM
DS3500 AIX_APPCM AIX_APPCM
XIVCTRL MPIO_XIVCTRL MPIO_XIVCTRL,nonMPIO_XIVCTRL
2107DS8K NO_OVERRIDE NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
IBMFlash NO_OVERRIDE NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
IBMSVC AIX_AAPCM NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
# lsdev -Cc disk
hdisk0 Available 01-00 NVMe 4K Flash Disk
hdisk1 Available 02-00 NVMe 4K Flash Disk
hdisk2 Available 05-00-01 MPIO IBM 2076 FC Disk
# lsdev | grep "fw"
sfwcomm0 Available 05-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm1 Available 05-01-01-FF Fibre Channel Storage Framework Comm
sfwcomm2 Available 07-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm3 Available 07-01-01-FF Fibre Channel Storage Framework Comm
sfwcomm4 Available 0A-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm5 Available 0A-01-01-FF Fibre Channel Storage Framework Comm
# lsdev | grep "fcs"
fcs0 Available 05-00 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs1 Available 05-01 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs2 Available 07-00 PCIe2 8Gb 2-Port FC Adapter (77103225141004f3) (not used)
fcs3 Available 07-01 PCIe2 8Gb 2-Port FC Adapter (77103225141004f3) (not used)
fcs4 Available 0A-00 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs5 Available 0A-01 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
# lsattr -l fcs0 -E
DIF_enabled no DIF (T10 protection) enabled True
bus_mem_addr 0x80108000 Bus memory address False
init_link auto INIT Link flags False
intr_msi_1 46 Bus interrupt level False
intr_priority 3 Interrupt priority False
io_dma 256 IO_DMA True
lg_term_dma 0x800000 Long term DMA True
max_xfer_size 0x100000 Maximum Transfer Size True
msi_type msix MSI Interrupt type False
num_cmd_elems 1024 Maximum number of COMMANDS to queue to the adapter True
num_io_queues 8 Desired number of IO queues True
# lsattr -El hdisk2
PCM PCM/friend/fcpother Path Control Module False
PR_key_value none Persistant Reserve Key Value True+
algorithm fail_over Algorithm True+
clr_q no Device CLEARS its Queue on error True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd test_unit_rdy Health Check Command True+
hcheck_interval 60 Health Check Interval True+
hcheck_mode nonactive Health Check Mode True+
location Location Label True+
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_coalesce 0x40000 Maximum Coalesce Size True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x80000 Maximum TRANSFER Size True
node_name 0x5005076810000912 FC Node Name False
pvid 00c2f8708ab7845e0000000000000000 Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 20 Queue DEPTH True+
reassign_to 120 REASSIGN time out value True
reserve_policy single_path Reserve Policy True+
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x20101 SCSI ID False
start_timeout 60 START unit time out value True
timeout_policy fail_path Timeout Policy True+
unique_id 332136005076810818048900000000000001A04214503IBMfcp Unique device identifier False
ww_name 0x5005076810180912 FC World Wide Name False
# lspath -l hdisk2
Enabled hdisk2 fscsi0
Enabled hdisk2 fscsi1
Enabled hdisk2 fscsi4
Enabled hdisk2 fscsi5
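One thing that stands out in the lsattr output above is algorithm=fail_over, which drives all I/O down a single path even though four paths show as Enabled. A sketch of what could be evaluated (the attribute values below are examples, not definitive recommendations; changing reserve_policy matters in clustered setups, so verify against your environment first):

```shell
# Sketch only -- example values, verify before applying in production.
# fail_over sends all I/O down one path; shortest_queue load-balances
# across all enabled paths (requires reserve_policy=no_reserve).
chdev -l hdisk2 -a reserve_policy=no_reserve -a algorithm=shortest_queue -P

# queue_depth=20 caps outstanding I/Os per LUN; consider raising it along
# with max_transfer. -P defers the change until the device is reconfigured.
chdev -l hdisk2 -a queue_depth=64 -a max_transfer=0x100000 -P
```

With a single benchmark thread this changes little (only one I/O is in flight), but it should help the multi-threaded runs.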
Code :
# fcstat -D fcs1
FIBRE CHANNEL STATISTICS REPORT: fcs1
Device Type: PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103) (adapter/pciex/df1000e21410f10)
Serial Number: 1A8270057B
ZA: 11.4.415.10
World Wide Node Name: 0x200000109B4CE35E
World Wide Port Name: 0x100000109B4CE35E
FC-4 TYPES:
Supported: 0x0000010000000000000000000000000000000000000000000000000000000000
Active: 0x0000010000000000000000000000000000000000000000000000000000000000
FC-4 TYPES (ULP mappings):
Supported ULPs:
Small Computer System Interface (SCSI) Fibre Channel Protocol (FCP)
Active ULPs:
Small Computer System Interface (SCSI) Fibre Channel Protocol (FCP)
Class of Service: 3
Port Speed (supported): 16 GBIT
Port Speed (running): 16 GBIT
Port FC ID: 0x020200
Port Type: Fabric
Attention Type: Link Up
Topology: Point to Point or Fabric
Seconds Since Last Reset: 446027
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 681823195 395468348
Words: 298416592384 152800398336
LIP Count: 0
NOS Count: 0
Error Frames: 0
Dumped Frames: 0
Link Failure Count: 1
Loss of Sync Count: 6
Loss of Signal: 3
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 118
Invalid CRC Count: 0
AL_PA Address Granted: 0
Loop Source Physical Address: 0
LIP Type: L_Port Initializing
Link Down N_Port State: Active AC
Link Down N_Port Transmitter State: Reset
Link Down N_Port Receiver State: Reset
Link Down Link Speed: 0 GBIT
Link Down Transmitter Fault: 0
Link Down Unusable: 0
Current N_Port State: Active AC
Current N_Port Transmitter State: Working
Current N_Port Receiver State: Synchronization Acquired
Current Link Speed: 0 GBIT
Current Link Transmitter Fault: 0
Current Link Unusable: 0
Elastic buffer overrun count: 0
Driver Statistics
Number of interrupts: 35576060
Number of spurious interrupts: 0
Long term DMA pool size: 0x800000
I/O DMA pool size: 0
FC SCSI Adapter Driver Queue Statistics
Number of active commands: 0
High water mark of active commands: 20
Number of pending commands: 0
High water mark of pending commands: 20
Number of commands in the Adapter Driver Held off queue: 0
High water mark of number of commands in the Adapter Driver Held off queue: 0
FC SCSI Protocol Driver Queue Statistics
Number of active commands: 0
High water mark of active commands: 20
Number of pending commands: 0
High water mark of pending commands: 1
FC SCSI Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 0
No Command Resource Count: 0
FC SCSI Traffic Statistics
Input Requests: 32627778
Output Requests: 20804443
Control Requests: 2490
Input Bytes: 605283225091
Output Bytes: 1191956455792
Adapter Effective max transfer value: 0x100000
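The fcstat output above shows no resource shortages on fcs1 (No DMA Resource, No Adapter Elements, and No Command Resource are all 0, and the active-command high-water mark of 20 is far below num_cmd_elems=1024). For completeness, the same counters can be checked on every active adapter with a quick loop (adapter names taken from the lsdev output above):

```shell
# Check all active FC adapters for signs of resource exhaustion;
# non-zero counters here would suggest raising num_cmd_elems,
# lg_term_dma, or max_xfer_size on the adapter.
for a in fcs0 fcs1 fcs4 fcs5; do
  echo "== $a =="
  fcstat -D "$a" | grep -E "No DMA Resource|No Adapter Elements|No Command Resource"
done
```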
I am using XDISK 8.6 for AIX 7.2, with the -OD parameter to open the test file with O_DIRECT, bypassing OS caching so that the storage itself is benchmarked.
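While a benchmark runs, it can also help to watch the device from a second session; a minimal sketch, using hdisk2 from the output above:

```shell
# Extended device report every 5 seconds during a benchmark run.
# A growing "sqfull" count means the hdisk service queue filled up,
# i.e. queue_depth is the bottleneck rather than the storage.
iostat -D hdisk2 5
```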
Additional runs with different block/thread settings.
Code :
### 8K Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 R -D 7177 56.1 0.090 2.58 0.118 0.116 0.001 2.97 0.216 0.212
### 8K Block, 1 Thread, Sequential I/O Test
./xdisk -S0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 S -D 6461 50.5 0.001 12.1 0.133 0.116 0.001 9.88 0.238 0.213
### 16K Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 16k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
16K 1 0 80 R -D 6796 106.2 0.001 2.63 0.126 0.124 0.179 2.89 0.223 0.219
### 16M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 16M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
16M 1 0 80 R -D 70 1120 12.9 34.1 14.0 14.2 12.9 15.6 13.2 13.5
### 32M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 32M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
32M 1 0 80 R -D 39 1248 23.9 65.0 25.0 24.7 23.8 26.0 24.1 24.3
### 64M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 64M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
64M 1 0 80 R -D 20 1280 46.4 128 47.7 47.5 46.5 49.3 47.6 47.5
### 8K Block, 2 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 2 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 2 0 80 R -D 10059 78.6 0.001 3.36 0.172 0.130 0.001 3.35 0.298 0.260
### 8K Block, 4 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 4 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 4 0 80 R -D 11914 93.1 0.001 4.22 0.295 0.182 0.001 3.60 0.487 0.431
### 8K Block, 8 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 8 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 8 0 80 R -D 13081 102.2 0.001 4.76 0.568 0.478 0.001 4.18 0.775 0.898
### 8K Block, 16 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 16 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 16 0 80 R -D 13302 103.9 0.001 6.57 1.15 1.29 0.001 5.10 1.42 1.45
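The pattern across these runs suggests the single-thread case is latency-bound rather than bandwidth-bound: with O_DIRECT and one outstanding I/O at a time, IOPS is roughly the reciprocal of the average per-I/O latency. A quick sanity check using the 8K single-thread numbers above (0.118 ms read average, 0.216 ms write average, 80/20 mix):

```shell
# Single-threaded IOPS ~= 1 / (weighted average per-I/O latency).
# Weighted latency = 0.8 * rAvg + 0.2 * wAvg = 0.1376 ms,
# predicting ~7267 IOPS -- close to the measured 7177.
awk 'BEGIN {
  lat_ms = 0.8 * 0.118 + 0.2 * 0.216
  printf "predicted IOPS: %d\n", 1000 / lat_ms
}'
```

If that holds, pushing the single-thread number higher means shaving per-I/O latency (path length, adapter, fabric), while the modest scaling from 1 to 16 threads (7177 to only 13302 IOPS) points at serialization somewhere in the stack rather than the V7000 itself.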