HA mailserver: is active/active with a "constant" connection possible?
I have set up a mail server for testing.
My goal is an HA mail server with IMAPS: when a client connects to a virtual IP it is redirected to one of two real servers, and if a real server crashes the other real server "takes" the connection.
I have set up a cluster with two keepalived/haproxy load balancers and two real servers running Postfix and Dovecot. The two LBs are Debian, the mail servers are Fedora 31.
This is my configuration on the two LBs (load balancers):
Code:
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode tcp

#postfix
listen smtp
    bind mail.mydomain.priv:25
    balance roundrobin
    timeout client 30s
    timeout connect 10s
    timeout server 1m
    no option http-server-close
    mode tcp
    option smtpchk
    option tcplog
    server mail1 mail1.mydomain.priv:25 send-proxy
    server mail2 mail2.mydomain.priv:25 send-proxy

#dovecot
listen imap
    bind mail.mydomain.priv:993
    timeout client 30s
    timeout connect 10s
    timeout server 1m
    no option http-server-close
    balance leastconn
    stick store-request src
    stick-table type ip size 200k expire 30m
    mode tcp
    option tcplog
    server mail1 mail1.mydomain.priv:993 send-proxy
    server mail2 mail2.mydomain.priv:993 send-proxy
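Note that send-proxy makes haproxy prepend a PROXY protocol header to every backend connection, so Postfix and Dovecot must be told to expect it or they will reject connections coming through the LB. A minimal sketch of the backend side, assuming the two LBs are 10.2.0.2 and 10.2.0.3 (adjust to your actual addresses):
Code:
# Dovecot (sketch, e.g. in /etc/dovecot/dovecot.conf)
haproxy_trusted_networks = 10.2.0.2 10.2.0.3   # LB addresses are assumptions
service imap-login {
    inet_listener imaps {
        port = 993
        ssl = yes
        haproxy = yes    # accept the PROXY protocol header from haproxy
    }
}

# Postfix (sketch, in /etc/postfix/master.cf)
smtp  inet  n  -  n  -  -  smtpd
    -o smtpd_upstream_proxy_protocol=haproxy
Without this, the PROXY header looks like garbage to the daemons, and connections through the LB fail even though direct connections to the real servers work.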
As you can see, mail.mydomain.priv is the "virtual" server,
bound to the virtual IP 10.2.0.4 (created by keepalived); the real
servers are 10.2.0.5 and 10.2.0.6.
The virtual IP 10.2.0.4 is an alias on the lo interface. I created it
with this line on the LBs:
Code:
ip addr add 10.2.0.4/32 dev lo label lo:0
and with these lines on the real servers:
Code:
echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce
ip addr add 10.2.0.4/32 dev lo label lo:0
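Settings written under /proc are lost on reboot; a sketch of persisting the ARP settings on the real servers, assuming a systemd-based Fedora (the file name is an arbitrary choice):
Code:
# /etc/sysctl.d/90-arp-vip.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
Apply without rebooting via sysctl --system.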
I skip posting the dovecot/postfix configuration because it is
too long, but I have tested it and it works fine, both as a single
server and with the 10.2.0.4 virtual IP.
Of course the real servers share /var/vmail/mydomain
using GlusterFS (I know it is slow, but this is only for testing).
I have connected a client, and I can fetch emails with Dovecot
and send emails with Postfix, using IMAPS and SMTP with STARTTLS,
without any problem.
So, what is the problem?
I tested the cluster by shutting down one of the real servers
with a client (Thunderbird) open, and the client "freezes" as if the
cluster didn't exist; it cannot read emails.
If I kill the client (Thunderbird) and restart it, it reconnects without problems
to the 10.2.0.4 virtual IP (mail.mydomain.priv).
What is wrong?
Is it possible to create an active/active HA cluster using keepalived
and haproxy?
Your problem is with network timeout settings, either on the cluster or on the clients.
Manually shutting down one of the cluster nodes may not give you the same result as a true CPU/power/whatever failure, because the cluster software suite will probably see you do it. It would be better to simply pull the RJ45 network cable out of one of them, simulating a network connection failure.
Anyway, the point is that a cluster failover takes time. During this time the virtual IP address is switched from one node to the other; depending on the cluster suite this takes seconds to minutes. The fact that the client reconnects to the surviving cluster node after you restart it proves that, had it waited long enough, it would have been able to reconnect on its own.
So the solution is either (1) configure the cluster to fail over faster, or (2) increase the timeout that clients wait before giving up, so that a new connection to the virtual IP address can be made before the configured timeout period ends.
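On the haproxy side there is a concrete knob, since the posted config caps timeout client at 30s, which is too short for idle IMAP sessions (e.g. IDLE). A sketch with assumed values to be tuned for your clients:
Code:
listen imap
    bind mail.mydomain.priv:993
    mode tcp
    option clitcpka        # send TCP keepalives toward clients
    option srvtcpka        # send TCP keepalives toward the backends
    timeout connect 5s     # give up on a dead backend quickly
    timeout client 10m     # survive long idle IMAP sessions
    timeout server 10m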
Why an address on the lo interface?
Putting an address on that interface is only needed for DSR (direct server return) balancing, which haproxy does not do.
Haproxy works at L3 and above, while DSR works at L2.
Can you remove the lo:0 address entry from ALL servers (LBs and mail servers)?
In your case the VIP address should live only on the master haproxy node (one of the two), with a /24 mask on the regular interface (not on lo), and keepalived handles that.
Also, configure keepalived in the following manner, then retest:
Code:
vrrp_script check_haproxy {
    script "/usr/bin/killall -0 haproxy"   # check that the killall program is available, or configure some other check; killall is cheap
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface <your network interface for the VIP address>
    virtual_router_id 51
    priority 101
    # VRRP VIP
    virtual_ipaddress {
        10.2.0.4
    }
    authentication {
        auth_type PASS
        auth_pass <some password>
    }
    track_script {
        check_haproxy
    }
}
Haproxy keeps monitoring the accessibility of the (mail) backend servers, and keepalived keeps monitoring whether haproxy is up.
If that is what you need and I understood correctly.
Of course, you can add additional conditions to keepalived that trigger failover of the VIP address, after you confirm everything is working.
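Such extra conditions and notification hooks can be sketched like this (the nc check and the script paths are assumptions, not part of the original setup):
Code:
vrrp_script check_backend {
    script "/usr/bin/nc -z -w1 mail1.mydomain.priv 993"   # hypothetical extra check
    interval 5
    weight 2
}

vrrp_instance VI_1 {
    # ...rest of the VI_1 block from above...
    track_script {
        check_haproxy
        check_backend
    }
    notify_master "/usr/local/bin/vip-up.sh"     # hypothetical hook scripts,
    notify_backup "/usr/local/bin/vip-down.sh"   # e.g. for logging/alerting
}
keepalived runs the notify scripts on every state transition, which also makes it easy to see how long a failover actually takes.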
A small add-on for active/active, so that traffic flows through both haproxys:
You need two VIP addresses in keepalived; on one node the first VIP is MASTER, on the other node the second VIP is MASTER.
Both will end up on one node in case of a node failure.
Then you add a third entry to your DNS system (mymail.example.com) pointing to those two VIP addresses.
This is the record you 'attack' from outside with your clients.
Since both VIP addresses are always active, clients will always be able to connect to either one when DNS is queried.
A client attempts a connection to mymail.example.com (one VIP is returned in round-robin fashion from the pool of two) --> haproxy --> your mail server.
Set up sticky sessions in haproxy and make it listen on 0.0.0.0.
Be sure to allow VRRP traffic between the two LBs.
In case of a failure, everything hicks wrote still stands: clients connected to the failed VIP will notice a short failover and reconnect to the second node.
But only roughly 50% of them, since the other half went to the other VIP via the same DNS record.
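The two-VIP layout can be sketched as a keepalived fragment for node A; node B uses the same fragment with the MASTER/BACKUP states and priorities swapped (the interface name and the second VIP 10.2.0.7 are assumptions):
Code:
vrrp_script check_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {            # first VIP: MASTER on node A
    state MASTER
    interface eth0              # assumed interface name
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        10.2.0.4
    }
    track_script {
        check_haproxy
    }
}

vrrp_instance VI_2 {            # second VIP: BACKUP on node A, MASTER on node B
    state BACKUP
    interface eth0
    virtual_router_id 52
    priority 100
    virtual_ipaddress {
        10.2.0.7                # second VIP, an assumed address
    }
    track_script {
        check_haproxy
    }
}
In DNS, mymail.example.com then gets two A records, 10.2.0.4 and 10.2.0.7, so resolvers hand out both VIPs in round-robin fashion.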