Nearly Random, Uncorrelated Server Load Average Spikes - Page 2

Tags

advanced, bots, chinese bots, cron, iptables, lamp, load average, member_project, mqtt, myisam, mysql, mysqld, mysqldumpslow, node-red, raid1, rouge bots, server, server load, ubuntu

02-13-2020

Moderator

Join Date: Sep 2005

Last Activity: 10 February 2021, 3:50 AM EST

Location: Switzerland - GE

Posts: 6,876

Thanks Given: 594

Thanked 694 Times in 627 Posts

Code:

[5885167.576271] TCP: request_sock_TCP: Possible SYN flooding on port 443. Sending cookies.  Check SNMP counters.
[5942225.927974] r8169 0000:01:00.0 enp1s0: link down
[5942286.125907] r8169 0000:01:00.0 enp1s0: link up
[6100421.130628] TCP: request_sock_TCP: Possible SYN flooding on port 443. Sending cookies.  Check SNMP counters.   <-- Did you find anything here?
[6848807.673874] DCCP: Activated CCID 2 (TCP-like)
[6848807.681997] sctp: Hash tables configured (bind 1024/1024)
[8210127.728955] md: data-check of RAID array md0
[8210127.742698] md: delaying data-check of md1 until md0 has finished (they share one or more physical units)   <-- Due to your check?
[8210127.780876] md: delaying data-check of md2 until md1 has finished (they share one or more physical units)
[8210130.257361] md: md0: data-check done.
[8210130.260788] md: data-check of RAID array md1
[8210170.116940] md: md1: data-check done.
[8210170.121703] md: data-check of RAID array md2
[8212579.951548] md: md2: data-check done.

vbe

02-13-2020

How are the disks attached? A NAS, a SAN? What type?
But usually this sort of issue comes more from the OS side... Or the SAN is flushing and syncing its cache but is badly configured, not optimised for your usage (don't laugh, I have seen cases with the best equipment...)

vbe

02-13-2020

Administrator

Join Date: Sep 2000

Last Activity: 15 July 2022, 8:51 AM EDT

Location: Asia Pacific, Cyberspace, in the Dark Dystopia

Posts: 19,118

Thanks Given: 2,351

Thanked 3,359 Times in 1,878 Posts

Quote:

Originally Posted by vbe

Have you tried

Code:

sar -b 5 5

or a more suitable value?

That is a good idea.

I'll give some combination of sar options a try. Maybe I'll write a little script that, when the load goes over 50, kicks off something like:

Code:

sar --human -d 5 5

or some variation of the above.
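
A minimal sketch of such a script, assuming a Linux box where the 1-minute load average is the first field of /proc/loadavg. The names THRESHOLD, LOADAVG_FILE, LOG_DIR and the `--run` switch are made up for this example, and only the integer part of the load is compared, so it triggers at 51.00 and above:

```shell
#!/bin/sh
# Sketch: capture disk activity with sar when the 1-minute load average is high.
# THRESHOLD, LOADAVG_FILE and LOG_DIR are hypothetical names chosen for this example.
THRESHOLD="${THRESHOLD:-50}"
LOADAVG_FILE="${LOADAVG_FILE:-/proc/loadavg}"
LOG_DIR="${LOG_DIR:-/var/tmp}"

# Succeeds when the integer part of load $1 is greater than threshold $2.
load_exceeds() {
    [ "${1%%.*}" -gt "$2" ]
}

# Read the 1-minute load average and, if it is over the threshold, log one sar run.
capture_if_high() {
    read -r load1 _rest < "$LOADAVG_FILE" || return 1
    if load_exceeds "$load1" "$THRESHOLD"; then
        sar --human -d 5 5 > "$LOG_DIR/sar-disk-$(date +%Y%m%d-%H%M%S).log" 2>&1
    fi
}

# Only act when explicitly asked, so the file can also be sourced safely.
if [ "${1:-}" = "--run" ]; then
    capture_if_high
fi
```

Run from cron once a minute with `--run`, this would leave one timestamped sar log per minute in which the load was over the threshold, which can then be lined up against the load graph.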

Good idea.

Neo

02-13-2020

Quote:

Originally Posted by vbe

How are the disks attached? A NAS, a SAN? What type?
But usually this sort of issue comes more from the OS side... Or the SAN is flushing and syncing its cache but is badly configured, not optimised for your usage (don't laugh, I have seen cases with the best equipment...)

The SSD drives are in the box.

Code:

ubuntu# dmesg | grep disk
[    1.651235] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.651311] sd 1:0:0:0: [sdb] Attached SCSI disk

Neo

02-13-2020

Quote:

Originally Posted by vbe

Code:

[5885167.576271] TCP: request_sock_TCP: Possible SYN flooding on port 443. Sending cookies.  Check SNMP counters.
[5942225.927974] r8169 0000:01:00.0 enp1s0: link down
[5942286.125907] r8169 0000:01:00.0 enp1s0: link up
[6100421.130628] TCP: request_sock_TCP: Possible SYN flooding on port 443. Sending cookies.  Check SNMP counters.   <-- Did you find anything here?
[6848807.673874] DCCP: Activated CCID 2 (TCP-like)
[6848807.681997] sctp: Hash tables configured (bind 1024/1024)
[8210127.728955] md: data-check of RAID array md0
[8210127.742698] md: delaying data-check of md1 until md0 has finished (they share one or more physical units)   <-- Due to your check?
[8210127.780876] md: delaying data-check of md2 until md1 has finished (they share one or more physical units)
[8210130.257361] md: md0: data-check done.
[8210130.260788] md: data-check of RAID array md1
[8210170.116940] md: md1: data-check done.
[8210170.121703] md: data-check of RAID array md2
[8212579.951548] md: md2: data-check done.

Based on prior experience, the traffic behind a SYN flood message in dmesg is a tiny fraction of the traffic the site gets, so I don't think that is the issue (it's noise, I think, not "signal").
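
For anyone who wants to follow the kernel's "Check SNMP counters" hint, here is a rough sketch that pulls a single TcpExt counter out of /proc/net/netstat. The function name `tcpext_counter` is made up for this example, and it assumes the usual layout of a `TcpExt:` header line followed by a `TcpExt:` line of values:

```shell
# Print one TcpExt counter from a /proc/net/netstat-style file.
# $1 = counter name (e.g. SyncookiesSent), $2 = file (default /proc/net/netstat)
tcpext_counter() {
    awk -v name="$1" '
        # First TcpExt line is the header: remember which column holds the counter.
        $1 == "TcpExt:" && !have_header {
            for (i = 2; i <= NF; i++) if ($i == name) col = i
            have_header = 1
            next
        }
        # Second TcpExt line holds the values: print the remembered column.
        $1 == "TcpExt:" && have_header && col { print $col; exit }
    ' "${2:-/proc/net/netstat}"
}
```

Calling `tcpext_counter SyncookiesSent` before and after a load spike and comparing the two numbers would show whether SYN cookies were being sent during the spike.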

There is no I/O spike (network I/O) as mentioned earlier (in case you missed it).

There is zero correlation between network I/O and the load spike.

I do not think it is network I/O related.

The site gets tons of traffic from wayward bots globally, and you would expect a correlation there, but there is also no correlation between bot traffic, network I/O, and the load spikes. None.

Neo

02-13-2020

FYI, there has been no spike in the past 8-9 hours (my time):

[Attached image: screen-shot-2020-02-13-34939-pmjpg]

Neo

02-13-2020

OK, so now that I have read your first and second posts again, it looks more related to the MySQL engine: do you have a MyISAM engine running too?
What most admins I work with don't understand is that the recommendations vendors often give for tuning their software are based on standard configurations and don't really apply outside of them.
What you see looks to me like the side effect of a big SGA, which could be more efficient at a reduced size since the SAN bays have a huge cache too... Either it is losing time parsing storage that is now nearly full, or the storage is too fragmented.

vbe
