Mastering DNS Zone Transfers: Resolving “Failed to Connect: Host Unreachable” Errors

At revWhiteShadow, we understand the critical importance of a robust and functional Domain Name System (DNS) infrastructure. A correctly configured DNS setup ensures that your network resources are resolvable and accessible, facilitating seamless communication and service delivery. When establishing a DNS server hierarchy, particularly with a master and slave configuration, the ability for the slave server to successfully synchronize zone data from the master is paramount. Encountering an error message such as “DNS slave says failed to connect: host unreachable” can be a perplexing obstacle, potentially halting vital data replication and impacting overall DNS service availability. We have meticulously analyzed this common predicament and are here to provide a comprehensive guide to diagnosing and resolving this issue, ensuring your DNS slave achieves successful zone transfers.

Our goal is to equip you with the knowledge and actionable steps necessary to overcome this “host unreachable” error, enabling a stable and reliable DNS environment. We will delve into the intricate details of DNS configuration, network connectivity, and potential underlying causes that can lead to this specific connection failure.

Understanding the Fundamentals of DNS Zone Transfers

Before troubleshooting, it helps to understand how a zone transfer works. A slave server periodically asks the master for the zone’s SOA record and compares serial numbers; when the master’s copy is newer, the slave requests a transfer. The usual mechanism is a full zone transfer (AXFR), which runs over TCP and gives the slave a complete copy of the zone.

The process involves the following key components:

  • Master DNS Server: Holds the authoritative copy of the zone file. It is responsible for responding to zone transfer requests from its designated slave servers.
  • Slave DNS Server: Receives a copy of the zone file from the master. It serves DNS queries based on this replicated data, acting as a secondary authoritative source for the zone.
  • Full Zone Transfer (AXFR): The standard DNS mechanism for copying an entire zone from one server to another. It runs over TCP.
  • NOTIFY Messages: When changes are made to a zone on the master server, it can optionally send NOTIFY messages to its configured slaves, signaling that an update is available and prompting them to initiate a zone transfer.
  • Serial Numbers: Each zone file has a serial number. Incremented with each change, this number helps the slave server determine if the zone data has been updated and a transfer is necessary.

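The error “failed to connect: host unreachable” means the slave’s attempt to open a TCP connection to the master on port 53 (the standard DNS port) failed with the operating system’s “host unreachable” error. On Linux this error is returned when there is no route or no ARP reply for the target, and also when a firewall rejects the packet with an ICMP “host unreachable” or “host prohibited” message, so a successful ping does not rule it out.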

Diagnosing the “Host Unreachable” Error: A Systematic Approach

The “failed to connect: host unreachable” error can stem from a variety of issues, ranging from simple misconfigurations to more complex network problems. We advocate for a methodical approach to identify the root cause.

Verifying Network Connectivity: The First Line of Defense

The most straightforward interpretation of “host unreachable” is a failure in network reachability. Even though you’ve indicated that both the master and slave servers can ping each other, which is a positive sign, we must ensure that DNS traffic itself can flow unimpeded.

1. Confirming IP Addresses and Hostnames

  • IP Address Accuracy: Double-check that the master’s IP address in the slave’s named.conf file is correct. In your setup, the masters statements for the abc.local and 102.168.192.IN-ADDR.ARPA zones point to 192.168.102.159. Similarly, ensure the slave’s IP address (192.168.102.132) is correctly listed in the master’s allow-transfer directive for the respective zones. A single wrong digit can lead to this problem.
  • Hostname Resolution: BIND’s masters and allow-transfer statements take IP addresses rather than hostnames, so name resolution is not a factor in your configuration. It matters only if scripts or tools around BIND refer to the servers by name, in which case those names must resolve correctly on both servers.

2. Essential Network Reachability Tests

  • ping Command: As you’ve confirmed, ping is a fundamental test. Ensure that ping 192.168.102.159 from the slave server (192.168.102.132) yields successful responses. This confirms basic IP-level connectivity.
  • traceroute (or tracert on Windows): This tool helps identify the path packets take to reach the destination. Running traceroute 192.168.102.159 from the slave can reveal if any network device along the path is dropping or blocking the packets. If the traceroute stops at a particular hop, that device might be the source of the “unreachable” state.
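Ping only shows that ICMP echo gets through. A minimal sketch of a DNS-level probe, run from the slave, could look like the following. It assumes dig is installed (it ships with the BIND utilities package); the function name probe_master is made up for this example, and the address and zone are the ones from your configuration.

```shell
#!/bin/sh
# Sketch: probe the master on port 53 the way the slave's refresh and transfer do.
# Assumes dig is installed; probe_master is a hypothetical helper name.
MASTER=192.168.102.159
ZONE=abc.local

probe_master() {
    # SOA query over UDP: the slave's periodic serial check uses this.
    # dig exits non-zero when no server replies.
    if dig +time=2 +tries=1 "@$MASTER" "$ZONE" SOA >/dev/null 2>&1; then
        echo "udp/53: reply received"
    else
        echo "udp/53: no reply"
    fi
    # Same query forced over TCP: the transport a zone transfer needs.
    if dig +tcp +time=2 +tries=1 "@$MASTER" "$ZONE" SOA >/dev/null 2>&1; then
        echo "tcp/53: reply received"
    else
        echo "tcp/53: no reply"
    fi
}
```

Call probe_master on the slave. If both lines report a reply, running dig @192.168.102.159 abc.local AXFR from the slave then tests whether the master actually permits the transfer.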

Examining Firewall Configurations: The Most Frequent Culprit

Firewalls are designed to protect networks by controlling incoming and outgoing traffic. When misconfigured, they can inadvertently block legitimate DNS traffic, leading to the “host unreachable” error.

1. Understanding Required Firewall Rules for Zone Transfers

DNS zone transfers require specific ports to be open. The primary protocol for zone transfers is TCP, and it uses port 53. UDP is also used for DNS queries and NOTIFY messages, also on port 53.

  • Master Server Firewall: The master server needs to accept incoming TCP connections on port 53 from its slave servers for zone transfers. It also needs to accept UDP traffic on port 53 for general DNS queries and NOTIFY messages.
  • Slave Server Firewall: The slave server needs to be able to initiate outgoing TCP connections to the master on port 53 for transfers and outgoing UDP connections for queries. It also needs to accept incoming UDP traffic on port 53 from the master for NOTIFY messages.

2. Analyzing Your Provided iptables Rules

Your provided iptables rules on the master are:

iptables -A INPUT -i ens33 -p tcp -m state --state NEW,ESTABLISHED -s 192.168.102.132 --sport 1024:65535 --dport 53 -j ACCEPT

iptables -A INPUT -i ens33 -p udp -m state --state NEW,ESTABLISHED -s 192.168.102.132 --sport 1024:65535 --dport 53 -j ACCEPT

Let’s break these down:

  • -A INPUT: Appends the rule to the INPUT chain, meaning it applies to packets destined for the server itself.
  • -i ens33: Specifies the incoming network interface, which is ens33. This is good if ens33 is the interface that the slave connects to.
  • -p tcp / -p udp: Matches TCP or UDP packets.
  • -m state --state NEW,ESTABLISHED: This is a crucial part. It allows new connections and established connections. For zone transfers (which are TCP-based and initiated by the slave), this should allow NEW connections from the slave to the master.
  • -s 192.168.102.132: Matches packets originating from the slave server’s IP address.
  • --sport 1024:65535: This specifies the source port range. For zone transfers, the slave initiates the connection from a high-numbered ephemeral port. This rule correctly matches those.
  • --dport 53: Matches packets destined for port 53, the standard DNS port.
  • -j ACCEPT: The action to take if the packet matches the rule – accept it.

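Taken on their own, these rules are correct for allowing the slave to initiate connections to the master on port 53. Their position matters, though: -A appends to the end of the chain, and iptables stops at the first matching rule, so if an earlier rule already rejects the packet (for example a catch-all REJECT --reject-with icmp-host-prohibited, as found in the stock RHEL and CentOS rule sets), these ACCEPT rules are never reached and the slave sees “host unreachable.” Other possibilities are that the slave cannot reach the master at all, or that the slave’s own firewall is blocking the connection attempt.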
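Rule order can be checked mechanically, because iptables evaluates a chain from top to bottom. The following is a rough sketch, not a full rule evaluator: it reads the output of iptables -S for one chain and flags any port 53 ACCEPT rule that sits below a REJECT or DROP rule. The function name find_shadowed_dns_rules is hypothetical, and the check is a heuristic that does not look at what the earlier rule matches.

```shell
#!/bin/sh
# Sketch: read `iptables -S <chain>` output on stdin and report port-53 ACCEPT
# rules that come after a REJECT or DROP rule. Heuristic only: it does not
# evaluate whether the earlier rule would really match the slave's packets.
find_shadowed_dns_rules() {
    awk '
        /^-A / { n++ }
        /^-A / && /-j (REJECT|DROP)/ && !blocker { blocker = n }
        /^-A / && /--dport 53( |$)/ && /-j ACCEPT/ && blocker {
            printf "rule %d (port 53 ACCEPT) follows rule %d (REJECT/DROP)\n", n, blocker
        }
    '
}
```

On the master, run iptables -S INPUT | find_shadowed_dns_rules. If it reports a match, delete the appended rules and re-add them with iptables -I INPUT, which inserts at the top of the chain instead of the bottom.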

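Crucial Consideration: The error is logged on the slave because the slave is the side opening the connection; that alone does not tell you which machine’s firewall is rejecting it. The slave’s own rule set is the quickest to check and is a common cause, so start there, then confirm on the master that no earlier rule rejects the traffic before the ACCEPT rules shown above are reached.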

3. Firewall on the Slave Server

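You mentioned, “no firewall entries are added, unsure if I have to.” Whether you have to depends on the slave’s existing rule set. If the OUTPUT chain has an ACCEPT policy and no rules, outbound DNS traffic is already allowed. If the chain has a DROP policy or a catch-all REJECT rule, the zone transfer attempt is blocked until you add explicit rules. A REJECT rule typically makes the connection fail immediately, whereas DROP usually shows up as a timeout.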

We highly recommend configuring the firewall on the slave server (192.168.102.132) to allow it to initiate DNS connections.

Example iptables rules for the SLAVE server (allowing outbound DNS):

# Allow outgoing DNS queries and zone transfer attempts (TCP and UDP to port 53)
# This rule allows the slave to initiate connections to the master.
iptables -A OUTPUT -d 192.168.102.159 -p tcp --dport 53 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -d 192.168.102.159 -p udp --dport 53 -m state --state NEW,ESTABLISHED -j ACCEPT

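# Allow replies from the master to connections and queries the slave started.
# The ESTABLISHED match means these rules never admit unsolicited packets.
iptables -A INPUT -s 192.168.102.159 -p tcp --sport 53 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -s 192.168.102.159 -p udp --sport 53 -m state --state ESTABLISHED -j ACCEPT

# NOTIFY messages are initiated by the master, so they need their own rule
iptables -A INPUT -s 192.168.102.159 -p udp --dport 53 -m state --state NEW -j ACCEPT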

# If your INPUT chain has a default DROP policy, you might need to allow loopback traffic
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# If your OUTPUT chain has a default DROP policy, you need to allow established connections
# and any traffic that should be permitted. The above rules are for specific DNS traffic.
# A common general rule for OUTPUT if a DROP policy is used:
# iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

Explanation of Slave Firewall Rules:

  • iptables -A OUTPUT -d 192.168.102.159 -p tcp --dport 53 -m state --state NEW,ESTABLISHED -j ACCEPT: This rule allows the slave to initiate new TCP connections to the master on port 53 and to maintain established connections. This is critical for the zone transfer itself.
  • iptables -A OUTPUT -d 192.168.102.159 -p udp --dport 53 -m state --state NEW,ESTABLISHED -j ACCEPT: Similarly, this allows the slave to send UDP traffic to the master on port 53, which carries the SOA queries the slave uses to check the master’s serial number.
  • iptables -A INPUT -s 192.168.102.159 -p tcp --sport 53 -m state --state ESTABLISHED -j ACCEPT: This rule allows the slave to receive TCP packets from the master on port 53, but only if they are part of an established connection. This is important because the slave initiates the connection.
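  • iptables -A INPUT -s 192.168.102.159 -p udp --sport 53 -m state --state ESTABLISHED -j ACCEPT: Allows UDP replies from the master to queries the slave sent. UDP has no connections as such; the firewall’s connection tracking treats a reply to a recent outgoing query as ESTABLISHED. This rule does not admit NOTIFY messages, which the master initiates and which arrive as new packets addressed to port 53.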

Important Note: Ensure your firewall default policies are set appropriately. If your default INPUT and OUTPUT policies are ACCEPT, then these specific rules might not be strictly necessary unless other rules are blocking traffic. However, it’s best practice to explicitly allow necessary traffic, especially in a production environment, and have a default DROP policy.
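Before adding rules, it is worth checking what each chain currently does. The sketch below condenses the output of iptables -S for one chain into its default policy and rule count; summarize_chain is a hypothetical helper name, not part of iptables.

```shell
#!/bin/sh
# Sketch: summarise `iptables -S <chain>` output read from stdin.
# summarize_chain is a hypothetical helper name.
summarize_chain() {
    awk '
        $1 == "-P" { policy = $3 }   # e.g. "-P OUTPUT ACCEPT"
        $1 == "-A" { rules++ }       # one line per rule in the chain
        END { printf "policy=%s rules=%d\n", policy, rules }
    '
}
```

On the slave, run iptables -S OUTPUT | summarize_chain and iptables -S INPUT | summarize_chain. A result of policy=ACCEPT rules=0 means that chain filters nothing, so the slave’s firewall is not what is blocking the transfer.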

4. Checking Other Firewall Devices

If you have a network firewall (e.g., a router with firewall capabilities, or a dedicated appliance) between your master and slave VMs, it could also be blocking port 53 traffic. Verify its ruleset. Given they are on the same subnet (192.168.102.0/24), it’s less likely to be an intermediate network firewall unless the subnet itself has restrictions.

Investigating BIND Configuration Details

Beyond network connectivity and firewalls, specific BIND (Berkeley Internet Name Domain) configurations on both servers play a crucial role.

1. Master Server (named.conf) Configuration Review

Your master named.conf includes:

allow-transfer { 192.168.102.132; };

And for the specific zones:

zone "abc.local" { type master; file "abc.db"; allow-transfer { 192.168.102.132; }; };
zone "102.168.192.IN-ADDR.ARPA" { type master; file "cba.db"; allow-transfer { 192.168.102.132; }; };

These directives correctly specify that the slave (192.168.102.132) is permitted to perform zone transfers. This part seems correct.
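The configuration can also be syntax-checked with the tools that ship with BIND. This is a minimal sketch: the paths /etc/named.conf and /var/named/abc.db are assumptions based on a RHEL-style layout, and check_master_config and the CHECKCONF and CHECKZONE variables are names invented for this example.

```shell
#!/bin/sh
# Sketch: syntax-check the master's configuration and one zone file.
# Paths are assumptions (RHEL-style layout); adjust them to your system.
# The command names can be overridden if the tools live outside PATH.
CHECKCONF=${CHECKCONF:-named-checkconf}
CHECKZONE=${CHECKZONE:-named-checkzone}

check_master_config() {
    conf=${1:-/etc/named.conf}
    zone=${2:-abc.local}
    zonefile=${3:-/var/named/abc.db}

    # named-checkconf parses named.conf and reports syntax errors.
    "$CHECKCONF" "$conf" || { echo "named.conf: errors found"; return 1; }
    # named-checkzone loads the zone file the way named would.
    "$CHECKZONE" "$zone" "$zonefile" >/dev/null || { echo "$zone: zone file errors"; return 1; }
    echo "configuration and zone file parse cleanly"
}
```

Run check_master_config on the master, and repeat it with 102.168.192.IN-ADDR.ARPA and the path to cba.db for the reverse zone.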

2. Slave Server (named.conf) Configuration Review

Your slave named.conf includes:

allow-notify { 192.168.102.159; };

And for the zones:

zone "abc.local" { type slave; masters { 192.168.102.159; }; file "slaves/abc.db"; allow-transfer { 192.168.102.159; }; };
zone "102.168.192.IN-ADDR.ARPA" { type slave; masters { 192.168.102.159; }; file "slaves/cba.db"; allow-transfer { 192.168.102.159; }; };

  • type slave;: Correctly identifies the zone as a slave.
  • masters { 192.168.102.159; };: Correctly specifies the IP address of the master server.
  • file "slaves/abc.db";: This is important. The slave will attempt to write the transferred zone file to this location. Ensure the BIND process has write permissions to the slaves directory and the abc.db file.
  • allow-transfer { 192.168.102.159; };: On a slave, this directive controls who may transfer the zone from the slave; it has no effect on the slave’s own transfers from the master. It is harmless here but not needed. Transfers into the slave are governed by the slave’s masters directive and the master’s allow-transfer.

3. File Permissions and Ownership

You mentioned adding chgrp and chown on the slave’s folder. This is a good step. The BIND process user (often named or bind) needs appropriate permissions to:

  • Read the zone files on the master, so that named can load the zones it serves and transfers.
  • Write the transferred zone files in the directory specified on the slave (e.g., /var/named/slaves/).

Check on the Slave: Ensure the directory /var/named/slaves/ and the file /var/named/slaves/abc.db (and the reverse lookup file) are owned by the user that the BIND process runs as.

Example command (if BIND runs as user named):

sudo chown named:named /var/named/slaves
sudo chown named:named /var/named/slaves/abc.db
sudo chown named:named /var/named/slaves/cba.db

You may need to check your system’s configuration for the exact user BIND runs as (e.g., by looking at the process owner with ps aux | grep named).

4. The rndc.key Issue in the Previous Setup

You mentioned that in an earlier setup, you encountered an rndc.key not found error, followed by the “host unreachable” error. The two are unrelated: rndc (Remote Name Daemon Control) is the command-line tool for controlling a running named, and it plays no part in zone transfers or NOTIFY messages.

  • rndc Configuration: rndc authenticates to named with a shared secret, usually stored in rndc.key and referenced from named.conf through controls and key statements. A missing key stops commands such as rndc reload from working, but it does not stop a slave from connecting to its master. If you want to authenticate the transfers themselves, that is done with TSIG keys, which are a separate piece of configuration.

5. The “journal file is out of date” Message

The log message 30-Dec-2018 20:33:24.030 managed-keys-zone: journal file is out of date: removing journal file is routine. It concerns BIND’s internal managed-keys zone, which tracks DNSSEC trust anchors, not your abc.local zones. BIND keeps journal files (.jnl) to record incremental changes to a zone, and it removes a journal that no longer matches the zone. This message is unrelated to the “host unreachable” error.

DNS NOTIFY and Zone Transfer Initiation

The “host unreachable” error occurs when the slave tries to initiate a connection. This initiation can happen in two primary ways:

  1. Scheduled Refresh: Slave servers periodically check the serial number of the zone on the master (based on the refresh interval in the SOA record). If the slave’s serial number is older, it will attempt a zone transfer.
  2. NOTIFY Message: The master server sends a NOTIFY message to its slaves whenever the zone’s serial number is updated. This prompts the slave to immediately request a zone transfer, rather than waiting for its scheduled refresh interval.

Your log shows:

30-Dec-2018 20:34:54.045 zone abc.local/IN: refresh: retry limit for master 192.168.102.159#53 exceeded (source 0.0.0.0#0)
30-Dec-2018 20:34:54.045 zone abc.local/IN: Transfer started.
30-Dec-2018 20:34:54.046 transfer of 'abc.local/IN' from 192.168.102.159#53: failed to connect: host unreachable

This shows the slave trying, and failing, to reach the master. The “retry limit exceeded” line refers to the SOA refresh queries, which went unanswered; the slave then attempted the transfer itself, and that TCP connection failed as well. The source 0.0.0.0#0 simply means that no transfer-source address is configured, so the operating system picks the outgoing address and port. It does not indicate a fault by itself.
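When the log is long, the failing transfers can be pulled out with a small filter. This sketch matches the message format shown in the log excerpt above; transfer_failures is a hypothetical helper name, and it reads log text on stdin because the log location depends on your logging configuration.

```shell
#!/bin/sh
# Sketch: extract zone, master and reason from named's
# "transfer of ... failed to connect" log lines read on stdin.
# transfer_failures is a hypothetical helper name.
transfer_failures() {
    sed -n "s/.*transfer of '\([^']*\)' from \([^:]*\): failed to connect: \(.*\)/zone=\1 master=\2 reason=\3/p"
}
```

For example, journalctl -u named | transfer_failures on the slave lists each failed attempt on one line, which makes it easy to see whether the reason changes after a firewall fix.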

Checking BIND Service Status and Reloading

Ensure that the BIND service (named) is running correctly on both servers.

  • On the Master:
    sudo systemctl status named
    sudo systemctl reload named
    
  • On the Slave:
    sudo systemctl status named
    sudo systemctl reload named
    

After changing named.conf or a zone file, reload or restart the BIND service so the change takes effect. Firewall changes made with iptables apply immediately and do not require a BIND reload. A reload is generally preferred because it re-reads the configuration without stopping the service, but a restart can sometimes clear up stubborn issues.
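Once the firewall or configuration has been fixed, there is no need to wait for the next scheduled refresh. A minimal sketch, assuming rndc is working on the slave (it needs a valid rndc key); force_transfer and the RNDC variable are names made up for this example.

```shell
#!/bin/sh
# Sketch: ask the slave's named to retry a zone transfer immediately.
# Assumes rndc is set up on the slave; force_transfer is a hypothetical helper.
RNDC=${RNDC:-rndc}

force_transfer() {
    zone=$1
    # "retransfer" fetches the zone from the master without comparing serials.
    "$RNDC" retransfer "$zone"
}
```

Run force_transfer abc.local on the slave and watch the log for the transfer result. If rndc is not configured, restarting named on the slave has a similar effect.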

Zone Serial Numbers

While not a direct cause of “host unreachable,” incorrect or non-incrementing serial numbers can prevent timely zone transfers.

  • Master Zone Files (abc.db, cba.db): The serial numbers are currently 0. For proper replication, the serial number on the master must be incremented whenever the zone file is changed. For example, change 0 to 1. Then, reload the BIND service on the master. The slave will eventually detect this change (either via NOTIFY or periodic refresh) and attempt to transfer the updated zone.

    Example abc.db with incremented serial:

    $TTL 3H
    $ORIGIN abc.local.
    @       IN SOA ns1.abc.local. ns2.abc.local. (
                                            1       ; serial  <-- Incremented
                                            1D      ; refresh
                                            1H      ; retry
                                            1W      ; expire
                                            3H )    ; minimum
            IN NS ns1.abc.local.
            IN NS ns2.abc.local.
    ns1     IN A 192.168.102.159
    ns2     IN A 192.168.102.132
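SOA serials are 32-bit values compared with serial number arithmetic (RFC 1982), so “newer” is not always the same as “numerically larger.” The following sketch illustrates the comparison; serial_is_newer is a hypothetical helper written for this example, not something BIND provides.

```shell
#!/bin/sh
# Sketch of RFC 1982 serial comparison for 32-bit SOA serials.
# serial_is_newer is a hypothetical helper: it succeeds when $2 is ahead of $1.
serial_is_newer() {
    cur=$1
    new=$2
    half=2147483648   # 2^31, half of the 32-bit serial space

    if [ "$cur" -lt "$new" ] && [ $((new - cur)) -lt "$half" ]; then
        return 0
    fi
    # Wraparound case: a numerically smaller serial can still be "ahead".
    if [ "$cur" -gt "$new" ] && [ $((cur - new)) -gt "$half" ]; then
        return 0
    fi
    return 1
}

# Example: serial 0 on the slave, serial 1 on the master after the edit.
if serial_is_newer 0 1; then
    echo "master is ahead: slave should transfer"
fi
```

To compare real values, the third field of dig +short @192.168.102.159 abc.local SOA is the master’s serial; the same query against 192.168.102.132 gives the slave’s once it has a copy of the zone.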
    

Alternative Connection Ports

While port 53 is standard, some configurations might use different ports. However, your named.conf files explicitly use port 53 (listen-on port 53), so this is unlikely to be the issue unless there’s an unusual network setup.

Advanced Troubleshooting and Potential Pitfalls

When the fundamental checks don’t yield an immediate solution, consider these more advanced aspects:

1. SELinux or AppArmor

If your Linux distribution uses security frameworks like SELinux or AppArmor, they could be restricting BIND’s access to network ports or files, even if file permissions appear correct.

  • SELinux: Check audit logs (/var/log/audit/audit.log) for AVC denial messages related to named. You might need to adjust SELinux contexts or policies. Temporarily setting SELinux to permissive mode (sudo setenforce 0) can help diagnose if it’s the cause. Remember to re-enable it (sudo setenforce 1) and fix the policies.
  • AppArmor: Check AppArmor status and logs for denials related to named.
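For SELinux, the audit log check can be narrowed to denials caused by named. This is a rough sketch that filters raw audit log text; named_avc_denials is a hypothetical helper name, and an empty result only suggests, rather than proves, that SELinux is not involved.

```shell
#!/bin/sh
# Sketch: filter SELinux AVC denials caused by named from audit log text on stdin.
# named_avc_denials is a hypothetical helper name.
named_avc_denials() {
    grep 'avc: *denied' | grep 'comm="named"'
}
```

For example, sudo cat /var/log/audit/audit.log | named_avc_denials on the slave prints any denial lines recorded for the named process.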

2. Network Configuration on the VMs

Ensure that the virtual network interfaces (ens33 in your case) on both VMs are correctly configured, up, and have the expected IP addresses assigned. The ip addr show command on both VMs will confirm this.
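The address check can be scripted so the result is easy to compare with named.conf. A small sketch, assuming the one-line output format of ip -o -4 addr show; addrs_on is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the IPv4 addresses on one interface from
# `ip -o -4 addr show` output read on stdin.
# addrs_on is a hypothetical helper name.
addrs_on() {
    awk -v dev="$1" '$2 == dev && $3 == "inet" { split($4, a, "/"); print a[1] }'
}
```

Running ip -o -4 addr show | addrs_on ens33 should print 192.168.102.132 on the slave and 192.168.102.159 on the master. No output means the interface is down or has no IPv4 address.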

3. NAT and Port Forwarding

If these VMs are behind Network Address Translation (NAT), ensure that port 53 (TCP and UDP) is correctly forwarded from the public IP address to the private IP address of the master server. However, since you’re using private IP addresses (192.168.102.x), this is less likely to be the cause unless the NAT device is specifically blocking traffic within that private subnet, which is unusual.

4. Load Balancers or Proxies

Are there any load balancers or proxies in front of your DNS servers? They could interfere with direct connections. Given the setup described, this is improbable.

Resolution Steps Summary

Based on our analysis, we recommend the following prioritized steps:

  1. Fix the Firewall Rules: A firewall rejecting DNS traffic is the most probable cause. On the slave, make sure outgoing TCP and UDP traffic to the master on port 53 is allowed, adding iptables rules if the OUTPUT chain filters traffic. On the master, confirm that the ACCEPT rules for the slave are not placed after a rule that rejects the traffic.
  2. Verify Slave File Permissions: Ensure the BIND user has read and write permissions for the zone transfer target directory and files on the slave.
  3. Reload/Restart BIND: After any configuration changes, reload or restart the named service on both master and slave servers.
  4. Check Master Logs: Examine the master server’s BIND logs for any errors related to zone transfers or connections from the slave.
  5. Increment Serial Number: Ensure the serial number on the master’s zone files is incremented and reload the master BIND.
  6. Examine rndc.key Status: A missing rndc.key does not cause “host unreachable,” because rndc is not involved in zone transfers. It only matters if you rely on rndc commands. In that case, generate the key (for example with rndc-confgen -a); otherwise you can remove or comment out unused rndc-related configuration in named.conf.

By systematically addressing these points, we are confident that you will be able to resolve the “DNS slave says failed to connect: host unreachable” error and achieve successful zone transfers between your master and slave DNS servers. At revWhiteShadow, we are dedicated to ensuring your DNS infrastructure operates at peak performance.