Accepting mail from batch is slow


 
# 1  
Old 07-15-2016

I have an opportunity to improve the performance of batch processing that sends out thousands of e-mails per day to inform customers (yes, real people spending money!) that their order has been picked, packed, dispatched or whatever. The logic in the code that notifies them works on blocks of customer orders. It fires e-mails as it detects processing is moving on and waits to confirm that the email has been accepted by our internal mail-relay server. There are multiple queues of work being processed so there can be multiple email requests being sent to the mail relay server at the same time. The batch waiting is a business requirement and I'm not sure I can change that.

Both the application server and the mail-relay server run CentOS 6.

The problem I have seems to be on the mail relay side. According to the logs (and I'm not sure I'm reading them correctly) the records per email transaction are as follows:
Code:
Jun 19 13:17:31 client=unknown[application server IP]
Jun 19 13:17:54 message-id=<2082491272.62178179332185502.JavaMail.wasuser@server.with.our.internal.suffix>
Jun 19 13:17:54 warning:
Jun 19 13:17:54 from=<orders@my-company-email.domain>,
Jun 19 13:17:55 enabling
Jun 19 13:17:56 to=<customer-email@email.provider.domain>,
Jun 19 13:17:56 removed

These sanitised records are an extract from /var/log/maillog, showing fields 1, 2, 3 & 7, selected on field 6 being a unique reference number, in this case EA866605CA. I've grabbed only these fields as the rest are markers such as what wrote the message etc. I can provide the full (sanitised) output if that is needed but I'm hoping it's not.
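For anyone wanting to repeat the extraction, here is a minimal Python sketch of the field selection described above. It assumes the usual whitespace-separated syslog layout in which field 6 is the reference number followed by a colon. The host name mailrelay, the process names and IDs, the second reference number and the IP address in the sample lines are made up for illustration and are not from the real log.

```python
def extract(lines, queue_id):
    """Return fields 1, 2, 3 and 7 of every line whose field 6 is queue_id."""
    selected = []
    for line in lines:
        fields = line.split()
        # Field 6 (index 5) is assumed to look like "EA866605CA:".
        if len(fields) >= 7 and fields[5].rstrip(":") == queue_id:
            selected.append(" ".join([fields[0], fields[1], fields[2], fields[6]]))
    return selected


# Hypothetical sample lines, not taken from the real /var/log/maillog.
sample = [
    "Jun 19 13:17:31 mailrelay postfix/smtpd[1234]: EA866605CA: client=unknown[192.0.2.10]",
    "Jun 19 13:17:40 mailrelay postfix/smtpd[1235]: 1B2C3D4E5F: client=unknown[192.0.2.10]",
    "Jun 19 13:17:56 mailrelay postfix/qmgr[1300]: EA866605CA: removed",
]

if __name__ == "__main__":
    for row in extract(sample, "EA866605CA"):
        print(row)
```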

The problem I see is that only about 30% of the mail is being processed within 1 second. The above example has a delay of 23 seconds between the first two records. Plotted as a cumulative frequency graph, about 30% is processed within 1 second (or actually under 2 seconds, I guess), but to get to 31% we have to count everything under 18 seconds. Half is accepted in under 22 seconds and 98% in under 28 seconds. The sample size doesn't seem to matter, whether it is a few days' worth of logs or a specific hour; it's a 24-hour operation so there are no real quiet periods.
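The delay figures can be reproduced with something like the sketch below, which measures the gap between the first two records of each reference number and reports the share of transactions under a given limit. It is only an illustration: the function names are invented here, the year is an assumption (syslog timestamps carry none), and the second reference number in the sample is hypothetical.

```python
from datetime import datetime


def first_gap_seconds(records, year=2016):
    """Map each reference number to the seconds between its first two records.

    records: iterable of (reference, "Mon DD HH:MM:SS") tuples in log order.
    year: assumed year, needed only because syslog stamps omit it.
    """
    first_seen = {}
    gaps = {}
    for ref, stamp in records:
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if ref not in first_seen:
            first_seen[ref] = when
        elif ref not in gaps:
            gaps[ref] = (when - first_seen[ref]).total_seconds()
    return gaps


def share_within(gaps, limit):
    """Fraction of transactions whose first gap is under `limit` seconds."""
    values = list(gaps.values())
    if not values:
        return 0.0
    return sum(1 for gap in values if gap < limit) / len(values)


# First pair is the example from the post; the second pair is hypothetical.
sample_records = [
    ("EA866605CA", "Jun 19 13:17:31"),
    ("EA866605CA", "Jun 19 13:17:54"),
    ("1B2C3D4E5F", "Jun 19 13:18:00"),
    ("1B2C3D4E5F", "Jun 19 13:18:00"),
]

if __name__ == "__main__":
    gaps = first_gap_seconds(sample_records)
    print(gaps)
    print(share_within(gaps, 2))
```

Calling share_within with limits of 2, 18, 22 and 28 seconds gives the points of the cumulative frequency curve quoted above.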

The mail is sporadic and I can find no pattern between multiple requests and a slow-down at all. Some running by themselves are slow and some running in parallel fly through.

I have been digging to find what I would consider a normal performance bottleneck or contention (CPU is 88% idle at worst, swap is unused, eth0 is ca. 20Kbps for both in- and out-bound etc.) and I'm very confused about where to look next.

There are various delays on actually sending the email out of the company, but I think that the application will have carried on by that point, so I'm not too worried about that side. Sadly, the server has been built with the minimum of two filesystems ( / & /boot ), which isn't great, but it's not full (currently 46% of 7.5GB used).

Can anyone suggest where I should be looking next? I would very much like a simple guide because I've only worked with mail as either a client or defining which server is the relay before.


Many thanks, in advance,
Robin

Last edited by rbatte1; 07-15-2016 at 10:55 AM.. Reason: Spelling
# 2  
Old 07-15-2016
A DNS problem, mayhap? Timing out when target name not found? Is it always the same sender and/or recipient and/or application server?
# 3  
Old 07-15-2016
Thank you for your thoughts RudiC,

I can confirm that it is always the same sender because we force the sender name.

We have considered DNS, but the frequency seems too high to be down to mistakes that stop the mail delivering (besides, we just need to get it into the queue), and if it were a general problem I wouldn't expect any mail to be processed really quickly. The nearest I have on DNS is that the application generating the mail runs on an active-active cluster, and what appears in the log is the boot IP address sending mail in, not a DNS name. Perhaps it is trying a reverse lookup, but the delay is not consistent, which leads me away from that being the issue. The DNS query timeout has not yet been adjusted as people are not keen to fiddle with it, but it's still an option if we can be fairly certain.




Kind regards,
Robin
# 4  
Old 07-15-2016
What made me suspicious is the client=unknown in direct combination with the 23 sec delay. Is that a syslog excerpt? Could you post a good (full) example along with a bad one, best from the same originator?
# 5  
Old 07-15-2016
Searching the net for client=unknown, I found a page in German, another page, and the postfix.org site, which says:
Quote:
If Postfix logs the SMTP client as "unknown" then you have a name service problem: the name server is bad, or the resolv.conf file contains bad information, or some packet filter is blocking the DNS requests or replies.
BTW, are you running postfix?
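One way to test the name-service theory without touching any configuration is to time a reverse lookup of the application server's boot IP from the mail-relay box. The sketch below is only a suggestion: it uses Python's standard socket.gethostbyaddr, the function name time_reverse_lookup is made up here, and 192.0.2.10 is a placeholder for the real address.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    resolver is injectable so the timing logic can be exercised offline.
    """
    start = time.monotonic()
    try:
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        name = None
    return name, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder address: substitute the application server's boot IP.
    name, elapsed = time_reverse_lookup("192.0.2.10")
    print(f"name={name} elapsed={elapsed:.2f}s")
```

If the lookup returns no name and regularly takes many seconds, that would be consistent with the client=unknown entries and the long gap before the second log record; a fast answer would point away from DNS.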