Should I be worried about my AIX Cluster? Posted by troym72 on 08-02-2010 at 09:53 AM
Well, the server is not REALLY memory bound according to the output of the top command below. My application just does so much I/O that 8 GB of the 24 GB are allocated to I/O buffers by the OS. Are there parameters to limit the amount of memory used for I/O buffers?
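
From what I've read, the file-cache share of memory is governed by the minperm%/maxperm%/maxclient% tunables in vmo; on AIX 6.1 and later the max values are restricted tunables, so they only show up with -F. This is just my sketch of what I think the relevant commands would be -- the 50% figures are placeholders, not recommendations:

Code:
# Show the current file-cache limits (include restricted tunables)
vmo -F -a | grep -E "minperm%|maxperm%|maxclient%"

# Hypothetical example: cap file pages at 50% of RAM, persistent across reboots
# (on 6.1 these are restricted tunables, so vmo warns and may require -F)
vmo -p -o maxperm%=50 -o maxclient%=50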

The "lm" processes are database lock managers. The "hciengine" processes are the interface engines, which process transactions and route/transform/send them to their destinations. The interface engines write a copy of each transaction to a Raima database about 15 times on its way through, hence the large amount of write activity and the large amount of memory used for I/O buffers.
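
If it would help to confirm where those writes actually land, I could run a short filemon trace. This is only a sketch of how I understand the tool is used (the output file name is arbitrary):

Code:
# Trace file/LV/PV I/O for about a minute, then stop the trace
filemon -o /tmp/fmon.out -O all
sleep 60
trcstop
# /tmp/fmon.out should then list the busiest files, logical volumes, and disks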

Code:
load averages:  3.69,  3.92,  3.57;                                    08:23:34
912 processes: 837 idle, 75 running
CPU states: 90.5% idle,  0.8% user,  8.5% kernel,  0.0% wait
Memory: 24G total, 8514M buf, 5304M sys, 163M free
Swap: 12G total, 10G free
   PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
     1 root      20    0   836K  516K run     5:08  0.00% 21474836.48% init
278892 hci       20    0  1488K  912K run    20.3H  1.08%  7.57% lm
1499428 hci       20    0    73M 7524K run    96:45  1.16%  4.48% hciengine
700420 hci       20    0  1420K  824K run    25.6H  1.36%  2.72% lm
188868 hci       20    0    52M   42M run    51:46  0.62%  2.63% hciengine
1642800 hci       20    0    43M   32M run   262:27  0.27%  2.34% hciengine
1491128 hci       20    0    63M   53M run    45:35  0.55%  2.27% hciengine
1294442 hci       20    0    81M   57M run   209:38  0.19%  2.26% hciengine
4096128 hci       20    0    39M   28M run   101:58  0.16%  2.22% hciengine
1864184 hci       20    0    42M   30M run   124:56  0.19%  2.17% hciengine
692276 hci       20    0    91M   48M run   222:20  0.20%  2.16% hciengine
1860016 hci       20    0   104M   90M run   997:11  1.27%  1.82% hciengine
4485126 hci       20    0    46M   36M run    63:26  0.14%  1.74% hciengine
905354 hci       20    0    44M   24M run   286:51  0.25%  1.64% hciengine
757876 hci       20    0    43M   23M run   198:13  0.18%  1.62% hciengine
618724 hci       20    0    52M   31M run   293:01  0.26%  1.47% hciengine
291138 hci       20    0  1372K  776K run   788:54  0.70%  1.45% lm
1482756 hci       20    0  1284K 1140K run   267:00  0.24%  1.42% lm
942236 hci       20    0    41M   22M run    67:33  0.06%  1.23% hciengine
4005968 hci       20    0    82M   82M run    26.1H 18.51%  1.20% java
1237224 hci       20    0  1392K  768K run   761:26  0.68%  1.18% lm
475536 hci       20    0    37M   21M run    99:29  0.09%  1.11% hciengine
487666 hci       20    0    37M   21M run    89:10  0.08%  1.07% hciengine
639210 hci       20    0    36M   21M run    81:50  0.07%  1.07% hciengine
753806 hci       20    0    37M   21M run    85:16  0.08%  1.05% hciengine
4075716 hci       20    0    34M   24M run     8:05  0.10%  1.05% hciengine
3973140 hci       20    0    50M   40M run     6:45  0.08%  1.05% hciengine
311652 hci       20    0    36M   21M run    96:34  0.09%  1.03% hciengine
991442 hci       20    0    34M   19M run    58:00  0.05%  1.01% hciengine
455048 hci       20    0    36M   21M run   134:21  0.12%  1.00% hciengine

Here is the output from the commands suggested. I'm not familiar with most of these statistics/settings, so hopefully someone will be nice enough to explain. :-)

Code:
# vmo -a
     ame_cpus_per_pool = n/a
       ame_maxfree_mem = n/a
   ame_min_ucpool_size = n/a
       ame_minfree_mem = n/a
       ams_loan_policy = n/a
   force_relalias_lite = 0
     kernel_heap_psize = 65536
          lgpg_regions = 0
             lgpg_size = 0
       low_ps_handling = 1
               maxfree = 1088
               maxperm = 5466626
                maxpin = 5072240
               maxpin% = 80
         memory_frames = 6291456
         memplace_data = 2
  memplace_mapped_file = 2
memplace_shm_anonymous = 2
    memplace_shm_named = 2
        memplace_stack = 2
         memplace_text = 2
memplace_unmapped_file = 2
               minfree = 960
               minperm = 182218
              minperm% = 3
             nokilluid = 0
               npskill = 24576
               npswarn = 98304
             numpsblks = 3145728
       pinnable_frames = 5034858
   relalias_percentage = 0
                 scrub = 0
              v_pinshm = 0
      vmm_default_pspa = 0
    wlm_memlimit_nonpg = 1

Code:
# ioo -a
                    aio_active = 0
                   aio_maxreqs = 65536
                aio_maxservers = 30
                aio_minservers = 3
         aio_server_inactivity = 300
         j2_atimeUpdateSymlink = 0
 j2_dynamicBufferPreallocation = 256
             j2_inodeCacheSize = 400
           j2_maxPageReadAhead = 128
             j2_maxRandomWrite = 0
          j2_metadataCacheSize = 400
           j2_minPageReadAhead = 2
j2_nPagesPerWriteBehindCluster = 32
             j2_nRandomCluster = 0
              j2_syncPageCount = 0
              j2_syncPageLimit = 16
                    lvm_bufcnt = 9
                    maxpgahead = 8
                    maxrandwrt = 0
                      numclust = 1
                     numfsbufs = 1024
                     pd_npages = 65536
              posix_aio_active = 0
             posix_aio_maxreqs = 65536
          posix_aio_maxservers = 30
          posix_aio_minservers = 3
   posix_aio_server_inactivity = 300

Code:
# schedo -a
         affinity_lim = 7
        big_tick_size = 1
ded_cpu_donate_thresh = 80
     fixed_pri_global = 0
            force_grq = 0
              maxspin = 16384
             pacefork = 10
      proc_disk_stats = 1
              sched_D = 16
              sched_R = 16
        tb_balance_S0 = 2
        tb_balance_S1 = 2
         tb_threshold = 100
            timeslice = 1
      vpm_fold_policy = 1
           vpm_xvcpus = 0

Code:
# no -a
                 arpqsize = 12
               arpt_killc = 20
              arptab_bsiz = 7
                arptab_nb = 149
                bcastping = 0
      clean_partial_conns = 0
                 delayack = 0
            delayackports = {}
         dgd_packets_lost = 3
            dgd_ping_time = 5
           dgd_retry_time = 5
       directed_broadcast = 0
                 fasttimo = 200
        icmp6_errmsg_rate = 10
          icmpaddressmask = 0
ie5_old_multicast_mapping = 0
                   ifsize = 256
               ip6_defttl = 64
                ip6_prune = 1
            ip6forwarding = 1
       ip6srcrouteforward = 1
       ip_ifdelete_notify = 0
                 ip_nfrag = 200
             ipforwarding = 0
                ipfragttl = 2
        ipignoreredirects = 0
                ipqmaxlen = 100
          ipsendredirects = 1
        ipsrcrouteforward = 1
           ipsrcrouterecv = 1
           ipsrcroutesend = 1
          llsleep_timeout = 3
                  lo_perf = 1
                lowthresh = 90
                 main_if6 = 0
               main_site6 = 0
                 maxnip6q = 20
                   maxttl = 255
                medthresh = 95
               mpr_policy = 1
              multi_homed = 1
                nbc_limit = 3145728
            nbc_max_cache = 131072
            nbc_min_cache = 1
         nbc_ofile_hashsz = 12841
                 nbc_pseg = 0
           nbc_pseg_limit = 6291456
           ndd_event_name = {all}
        ndd_event_tracing = 0
            ndp_mmaxtries = 3
            ndp_umaxtries = 3
                 ndpqsize = 50
                ndpt_down = 3
                ndpt_keep = 120
               ndpt_probe = 5
           ndpt_reachable = 30
             ndpt_retrans = 1
             net_buf_size = {all}
             net_buf_type = {all}
     net_malloc_frag_mask = {0}
        netm_page_promote = 1
           nonlocsrcroute = 1
                 nstrpush = 8
              passive_dgd = 0
         pmtu_default_age = 10
              pmtu_expire = 10
 pmtu_rediscover_interval = 30
              psebufcalls = 20
                 psecache = 1
                psetimers = 20
           rfc1122addrchk = 0
                  rfc1323 = 1
                  rfc2414 = 1
             route_expire = 1
          routerevalidate = 1
                 rto_high = 64
               rto_length = 13
                rto_limit = 7
                  rto_low = 1
                     sack = 0
                   sb_max = 1310720
       send_file_duration = 300
              site6_index = 0
               sockthresh = 85
                  sodebug = 0
              sodebug_env = 0
                somaxconn = 1024
                 strctlsz = 1024
                 strmsgsz = 0
                strthresh = 85
               strturncnt = 15
          subnetsarelocal = 1
       tcp_bad_port_limit = 0
                  tcp_ecn = 0
       tcp_ephemeral_high = 65535
        tcp_ephemeral_low = 32768
             tcp_finwait2 = 1200
           tcp_icmpsecure = 0
          tcp_init_window = 0
    tcp_inpcb_hashtab_siz = 24499
              tcp_keepcnt = 8
             tcp_keepidle = 14400
             tcp_keepinit = 150
            tcp_keepintvl = 150
     tcp_limited_transmit = 1
              tcp_low_rto = 0
             tcp_maxburst = 0
              tcp_mssdflt = 1460
          tcp_nagle_limit = 65535
        tcp_nagleoverride = 0
               tcp_ndebug = 100
              tcp_newreno = 1
           tcp_nodelayack = 0
        tcp_pmtu_discover = 1
            tcp_recvspace = 655360
            tcp_sendspace = 655360
            tcp_tcpsecure = 0
             tcp_timewait = 1
                  tcp_ttl = 60
           tcprexmtthresh = 3
             tcptr_enable = 0
                  thewall = 12582912
         timer_wheel_tick = 0
                tn_filter = 1
       udp_bad_port_limit = 0
       udp_ephemeral_high = 65535
        udp_ephemeral_low = 32768
    udp_inpcb_hashtab_siz = 24499
        udp_pmtu_discover = 1
            udp_recvspace = 42080
            udp_sendspace = 9216
                  udp_ttl = 30
                 udpcksum = 1
           use_sndbufpool = 1


To me it looks like the count of I/Os blocked with no pbuf is not too high, considering this server has been running since May 16th. The one thing I'm not too sure about is the min/max tunables for pin, perm, and client; I think we left those at the default settings when AIX was installed. (See the sketch after the vmstat output below.)
Code:
# vmstat -v
              6291456 memory pages
              6074032 lruable pages
                43335 free pages
                    5 memory pools
              1256482 pinned pages
                 80.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 35.9 numperm percentage
              2181871 file pages
                  0.0 compressed percentage
                    0 compressed pages
                 35.9 numclient percentage
                 90.0 maxclient percentage
              2181871 client pages
                    0 remote pageouts scheduled
                   32 pending disk I/Os blocked with no pbuf
                80636 paging space I/Os blocked with no psbuf
                 2484 filesystem I/Os blocked with no fsbuf
                    0 client filesystem I/Os blocked with no fsbuf
                 1737 external pager filesystem I/Os blocked with no fsbuf
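
As I understand it (and I may be wrong), the blocked-I/O counters at the bottom each map to a different buffer pool with its own tunable. A sketch of where I'd look next -- rootvg is just an example volume group:

Code:
# pbufs are per volume group on AIX 5.3 and later
lvmo -v rootvg -a      # shows pv_pbuf_count / total_vg_pbufs for rootvg

# "filesystem I/Os blocked with no fsbuf" is JFS -> numfsbufs (1024 here)
# "external pager ... blocked with no fsbuf" is JFS2 ->
ioo -a | grep j2_dynamicBufferPreallocation   # already 256 on this box

# the 80636 psbuf blocks mean real paging occurred at some point; psbufs
# are pinned per paging space and are not directly tunable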

The disk activity is not all that high MOST of the time. We do have some archiving that happens four times a day; however, even then I don't see much I/O wait going on. It does seem like hdisk0 and hdisk1 are the busiest (see the sketch after the iostat output below). This server is attached to a SAN, so there are many virtual hdisks.

Code:
# iostat 5 |grep -v "0.0"
System configuration: lcpu=40 drives=42 paths=2 vdisks=0
 
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk29          0.2      39.2       9.8          0       196
hdisk40          0.2      19.2       4.8          0        96
hdisk4           0.6      36.8       9.2          0       184
hdisk6           0.2      49.6      12.4          0       248
hdisk11          0.2      56.8      14.2          0       284
hdisk13          0.2      34.4       8.6          0       172
hdisk18          0.2      32.0       5.4          0       160
hdisk24          0.4      36.0       9.0          0       180
hdisk31          0.6      43.2      10.8          0       216
hdisk0          25.2     358.3      89.6          0      1792
hdisk1          29.0     358.3      89.6          0      1792
 
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk29          0.4      19.2       4.8          0        96
hdisk35          0.2      19.2       4.6          0        96
hdisk38          0.2      24.8       6.2          0       124
hdisk40          0.2      29.6       6.2          0       148
hdisk4           0.4      27.2       6.6          0       136
hdisk6           0.6      30.4       7.6          0       152
hdisk9           0.4      17.6       4.4          0        88
hdisk11          0.2      35.2       8.6          0       176
hdisk18          0.2      16.8       4.2          0        84
hdisk26          0.6      31.2       7.8          0       156
hdisk31          0.8      30.4       7.6          0       152
hdisk0          18.8     244.6      61.2          0      1224
hdisk1          18.8     244.6      61.2          0      1224
 
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk35          0.6      24.8       5.8          0       124
hdisk40          0.8      29.6       7.0          0       148
hdisk4           0.4      32.8       8.2          0       164
hdisk6           1.2      49.6      12.4          0       248
hdisk11          0.2      60.8      15.2          0       304
hdisk15          0.2      34.4       8.2          0       172
hdisk20          0.4      28.8       6.6          0       144
hdisk24          0.2      38.4       9.6          0       192
hdisk31          0.8      62.4      15.6          0       312
hdisk0          26.0     351.1      87.8          0      1756
hdisk1          29.6     351.9      88.0          0      1760
 
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk29          0.8      51.2      12.4          0       256
hdisk35          0.2      21.6       5.0          0       108
hdisk38          0.2      32.8       8.0          0       164
hdisk40          0.2      31.2       7.6          0       156
hdisk4           0.8      39.2       9.6          0       196
hdisk6           0.6      48.8      12.2          0       244
hdisk7           0.2       0.2       0.4          1         0
hdisk9           0.4      43.2      10.6          0       216
hdisk11          0.8      57.6      13.8          0       288
hdisk13          0.2      33.6       8.0          0       168
hdisk15          0.4      28.8       7.0          0       144
hdisk18          0.2      21.6       5.2          0       108
hdisk20          0.6      29.6       7.0          0       148
hdisk26          0.4      40.8      10.2          0       204
hdisk33          0.2      38.4       5.8          0       192
hdisk24          1.2      39.2       9.8          0       196
hdisk31          0.4      45.6      11.4          0       228
hdisk0          18.8     252.6      63.1          0      1264
hdisk1          22.0     251.0      62.7          0      1256
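
Since hdisk0 and hdisk1 stand out in every sample with nearly identical write numbers, my next step would be to see what actually lives on them -- if they are a mirrored rootvg pair, that symmetry would make sense. Sketch only; I haven't run these yet:

Code:
lspv -l hdisk0      # list the logical volumes (and mount points) on hdisk0
lspv -l hdisk1      # same for hdisk1; identical Kb_wrtn suggests LV mirroring
lsvg -l rootvg      # map rootvg LVs to filesystems (jfs2log, paging, etc.)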

Again, I do not know how to read the output of this command, but here it is in case anyone can help.
Code:
# ipcs -m
IPC status from /dev/mem as of Mon Aug  2 08:33:07 CDT 2010
T        ID     KEY        MODE       OWNER    GROUP
Shared Memory:
m   1048576 0x0d000944 --rw-rw----     root   system
m   1048577 0x7800000d --rw-rw-rw-     root   system
m   1048578 0x7800000c --rw-rw-rw-     root   system
m   1048579 0x700020a1 --rw-------     root   system
m   1048580 0x680020a1 --rw-r--r--     root   system
m   1048581 0x670020a1 --rw-r--r--     root   system
m   2097158 0x2604f589 --rw-rw-rw-      hci    staff
m         7 0x4401a2a5 --rw-rw-rw-      hci    staff
m         8 0x5001f9b2 --rw-rw-rw-      hci    staff
m         9 0x4b00afb2 --rw-rw-rw-      hci    staff
m        10 0x3c01bae4 --rw-rw-rw-      hci    staff
m        11 0x3f01e32d --rw-rw-rw-      hci    staff
m        12 0x3201bcf6 --rw-rw-rw-      hci    staff
m        13 0x6a01c7d7 --rw-rw-rw-      hci    staff
m        14 0x6901c836 --rw-rw-rw-      hci    staff
m        15 0x7801c198 --rw-rw-rw-      hci    staff
m        16 0x7b01c81e --rw-rw-rw-      hci    staff
m        17 0x460198e3 --rw-rw-rw-      hci    staff
m        18 0x22019b61 --rw-rw-rw-      hci    staff
m        19 0x5c01d841 --rw-rw-rw-      hci    staff
m        20 0x4001b0a2 --rw-rw-rw-      hci    staff
m        21 0x3501d208 --rw-rw-rw-      hci    staff
m        22 0x5901e867 --rw-rw-rw-      hci    staff
m        23 0x3301eba7 --rw-rw-rw-      hci    staff
m        24 0x470193fa --rw-rw-rw-      hci    staff
m        25 0x5301d575 --rw-rw-rw-      hci    staff
m        26 0x4501e494 --rw-rw-rw-      hci    staff
m        27 0x2801f56a --rw-rw-rw-      hci    staff
m        28 0x3401e5cf --rw-rw-rw-      hci    staff
m        29 0x4501974d --rw-rw-rw-      hci    staff
m        30 0x4601f5c8 --rw-rw-rw-      hci    staff
m   2097183 0x44019293 --rw-rw-rw-      hci    staff
m   2097184 0x5001d6e0 --rw-rw-rw-      hci    staff
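
One thing I did find in the man page: extra flags add detail that might make this easier to interpret, though I still can't tell which keys belong to which application. Sketch:

Code:
ipcs -mb    # adds SEGSZ -- the size of each shared memory segment
ipcs -mp    # adds CPID/LPID -- the creating and last-attaching process IDs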

Thanks for the help and suggestions!

Troy
 
