NFS (Network File System) is the most widely used solution for serving files over a network. With an NFS server we can share folders over the network, and allowed clients or systems can access those shared folders and use them in their applications. When it comes to a production environment, the NFS server should be configured in high availability to rule out any single point of failure.
In this article we will discuss how to configure NFS server high availability clustering (active-passive) with Pacemaker on CentOS 7 / RHEL 7.
Following are the lab details I have used for this article:
- NFS Server 1 (nfs1.example.com) – 192.168.1.40 – Minimal CentOS 7 / RHEL 7
- NFS Server 2 (nfs2.example.com) – 192.168.1.50 – Minimal CentOS 7 / RHEL 7
- NFS Server VIP – 192.168.1.51
- Firewall enabled
- SELinux enabled
Refer to the below steps to configure NFS server active-passive clustering on CentOS 7 / RHEL 7.
Step 1) Set the hostname on both nfs servers and update the /etc/hosts file
Login to both nfs servers and set the hostname as "nfs1.example.com" and "nfs2.example.com" respectively using the hostnamectl command. An example is shown below:
~]# hostnamectl set-hostname "nfs1.example.com"
~]# exec bash
Update the /etc/hosts file on both nfs servers,
192.168.1.40    nfs1.example.com
192.168.1.50    nfs2.example.com
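Optionally, verify that each node can now resolve and reach the other by name; for example, from nfs1:

~]# ping -c 2 nfs2.example.com    # run the equivalent check for nfs1.example.com from nfs2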
Step 2) Update both nfs servers and install pcs packages
Use the below 'yum update' command to apply all the updates on both nfs servers and then reboot once.
~]# yum update && reboot
Install the pcs and fence-agents packages on both nfs servers,
[root@nfs1 ~]# yum install -y pcs fence-agents-all
[root@nfs2 ~]# yum install -y pcs fence-agents-all
Once the pcs and fence agent packages are installed, allow the pcs-related ports in the OS firewall on both nfs servers,
~]# firewall-cmd --permanent --add-service=high-availability
~]# firewall-cmd --reload
Now start and enable the pcsd service on both nfs nodes using the commands below,
~]# systemctl enable pcsd
~]# systemctl start pcsd
Step 3) Authenticate nfs nodes and form a cluster
Set a password for the hacluster user; the pcsd service will use this user to get the cluster nodes authenticated. So let's first set the password for the hacluster user on both nodes,
[root@nfs1 ~]# echo "enter_password" | passwd --stdin hacluster
[root@nfs2 ~]# echo "enter_password" | passwd --stdin hacluster
Now authenticate the cluster nodes. In our case, nfs2.example.com will be authenticated on nfs1.example.com. Run the below pcs cluster command on "nfs1":
[root@nfs1 ~]# pcs cluster auth nfs1.example.com nfs2.example.com
Username: hacluster
Password:
nfs1.example.com: Authorized
nfs2.example.com: Authorized
[root@nfs1 ~]#
Now it's time to form a cluster with the name "nfs_cluster" and add both nfs nodes to it. Run the below "pcs cluster setup" command from either nfs node,
[root@nfs1 ~]# pcs cluster setup --start --name nfs_cluster nfs1.example.com \
 nfs2.example.com
Enable the pcs cluster service on both nodes so that the nodes will join the cluster automatically after a reboot. Execute the below command from either nfs node,
[root@nfs1 ~]# pcs cluster enable --all
nfs1.example.com: Cluster Enabled
nfs2.example.com: Cluster Enabled
[root@nfs1 ~]#
Step 4) Define Fencing device for each cluster node
Fencing is the most important part of a cluster; if any node goes faulty, the fencing device will remove that node from the cluster. In Pacemaker, fencing is defined using a STONITH (Shoot The Other Node In The Head) resource.
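Since the fence-agents-all package was installed in Step 2, you can optionally list the fence agents available on the nodes to confirm that the fence_scsi agent used below is present:

~]# pcs stonith list | grep -i scsi    # fence_scsi should appear in the list of available agents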
In this tutorial, we are using a shared disk of size 1 GB (/dev/sdc) as the fencing device. Let's first find out the ID of the /dev/sdc disk,
[root@nfs1 ~]# ls -l /dev/disk/by-id/
Note down the ID of the disk /dev/sdc as we will use it in the "pcs stonith" command.
Now run the below "pcs stonith" command from either node to create the fencing device (disk_fencing),
[root@nfs1 ~]# pcs stonith create disk_fencing fence_scsi \
 pcmk_host_list="nfs1.example.com nfs2.example.com" \
 pcmk_monitor_action="metadata" pcmk_reboot_action="off" \
 devices="/dev/disk/by-id/wwn-0x6001405e49919dad5824dc2af5fb3ca0" \
 meta provides="unfencing"
[root@nfs1 ~]#
Verify the status of stonith using the below command,
[root@nfs1 ~]# pcs stonith show
 disk_fencing   (stonith:fence_scsi):   Started nfs1.example.com
[root@nfs1 ~]#
Run the "pcs status" command to view the status of the cluster,
[root@nfs1 ~]# pcs status
Cluster name: nfs_cluster
Stack: corosync
Current DC: nfs2.example.com (version 1.1.16-12.el7_4.7-94ff4df) - partition with quorum
Last updated: Sun Mar  4 03:18:47 2018
Last change: Sun Mar  4 03:16:09 2018 by root via cibadmin on nfs1.example.com

2 nodes configured
1 resource configured

Online: [ nfs1.example.com nfs2.example.com ]

Full list of resources:

 disk_fencing   (stonith:fence_scsi):   Started nfs1.example.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@nfs1 ~]#
Note: If your cluster nodes are virtual machines hosted on VMware, then you can use the "fence_vmware_soap" fencing agent. To configure "fence_vmware_soap" as the fencing agent, refer to the below logical steps:
1) Verify whether your cluster nodes can reach the VMware hypervisor or vCenter
# fence_vmware_soap -a <vCenter_IP_address> -l <user_name> -p <password> \
 --ssl -z -v -o list | egrep "(nfs1.example.com|nfs2.example.com)"

or

# fence_vmware_soap -a <vCenter_IP_address> -l <user_name> -p <password> \
 --ssl -z -o list | egrep "(nfs1.example.com|nfs2.example.com)"
If you are able to see the VM names in the output then it is fine; otherwise you need to check why the cluster nodes are not able to make a connection to ESXi or vCenter.
2) Define the fencing device using the below command,
# pcs stonith create vmware_fence fence_vmware_soap \
 pcmk_host_map="node1:nfs1.example.com;node2:nfs2.example.com" \
 ipaddr=<vCenter_IP_address> ssl=1 login=<user_name> passwd=<password>
3) Check the stonith status using the below command,
# pcs stonith show
Step 5) Install nfs and format nfs shared disk
Install ‘nfs-utils’ package on both nfs servers
[root@nfs1 ~]# yum install nfs-utils -y
[root@nfs2 ~]# yum install nfs-utils -y
Stop and disable the local "nfs-lock" service on both nodes, as this service will be controlled by Pacemaker,
[root@nfs1 ~]# systemctl stop nfs-lock && systemctl disable nfs-lock
[root@nfs2 ~]# systemctl stop nfs-lock && systemctl disable nfs-lock
Let's assume we have a shared disk "/dev/sdb" of size 10 GB between the two cluster nodes. Create a partition on it and format it as an xfs file system.
[root@nfs1 ~]# fdisk /dev/sdb
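If you prefer a non-interactive alternative to the fdisk session, the same single-partition layout can also be created with parted (a sketch, assuming the whole disk is used for one primary partition):

~]# parted -s /dev/sdb mklabel msdos                   # create an MBR partition table
~]# parted -s /dev/sdb mkpart primary xfs 1MiB 100%    # one partition spanning the disk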
Run the partprobe command on both nodes and reboot once.
~]# partprobe
Now format "/dev/sdb1" as an xfs file system,
[root@nfs1 ~]# mkfs.xfs /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=655296 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2621184, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@nfs1 ~]#
Create a mount point for this file system on both nodes,
[root@nfs1 ~]# mkdir /nfsshare
[root@nfs2 ~]# mkdir /nfsshare
Step 6) Configure all required NFS resources on Cluster Nodes
Following are the required NFS resources:
- Filesystem resource
- nfsserver resource
- exportfs resource
- IPaddr2 floating IP address resource
For the Filesystem resource, we need shared storage among the cluster nodes. We have already created a partition on the shared disk (/dev/sdb1) in the steps above, so we will use that partition. Use the below "pcs resource create" command to define the Filesystem resource from either node,
[root@nfs1 ~]# pcs resource create nfsshare Filesystem device=/dev/sdb1 \
 directory=/nfsshare fstype=xfs --group nfsgrp
[root@nfs1 ~]#
In the above command we have defined the NFS file system resource as "nfsshare" under the group "nfsgrp". From now on, all NFS resources will be created under the group nfsgrp.
Create the nfsserver resource with the name 'nfsd' using the below command,
[root@nfs1 ~]# pcs resource create nfsd nfsserver \
 nfs_shared_infodir=/nfsshare/nfsinfo --group nfsgrp
[root@nfs1 ~]#
Create the exportfs resource with the name "nfsroot",
[root@nfs1 ~]# pcs resource create nfsroot exportfs clientspec="192.168.1.0/24" options=rw,sync,no_root_squash directory=/nfsshare fsid=0 --group nfsgrp
[root@nfs1 ~]#
In the above command, clientspec indicates the allowed clients that can access the nfsshare.
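Once the exportfs resource is started, you can optionally verify the export on whichever node is currently running the nfsgrp group (assumed here to be nfs1):

[root@nfs1 ~]# exportfs -v    # /nfsshare should be listed as exported to 192.168.1.0/24 with rw,sync,no_root_squash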
Create the NFS IPaddr2 resource using the below command,
[root@nfs1 ~]# pcs resource create nfsip IPaddr2 ip=192.168.1.51 \
 cidr_netmask=24 --group nfsgrp
[root@nfs1 ~]#
Now view and verify the cluster using pcs status
[root@nfs1 ~]# pcs status
Once you are done with the NFS resources, allow the NFS server ports in the OS firewall on both nfs servers,
~]# firewall-cmd --permanent --add-service=nfs
~]# firewall-cmd --permanent --add-service=mountd
~]# firewall-cmd --permanent --add-service=rpc-bind
~]# firewall-cmd --reload
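Before mounting, you can optionally check from a client machine that the export is visible through the VIP (this assumes the nfs-utils package, which provides showmount and the mount.nfs helper, is installed on the client):

~]# showmount -e 192.168.1.51    # the /nfsshare export should be listed for 192.168.1.0/24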
Step 7) Try Mounting NFS share on Clients
Now try mounting the NFS share using the mount command; an example is shown below,
[root@client ~]# mkdir /mnt/nfsshare
[root@client ~]# mount 192.168.1.51:/ /mnt/nfsshare/
[root@client ~]# df -Th /mnt/nfsshare
Filesystem      Type  Size  Used Avail Use% Mounted on
192.168.1.51:/  nfs4   10G   32M   10G   1% /mnt/nfsshare
[root@client ~]#
[root@client ~]# cd /mnt/nfsshare/
[root@client nfsshare]# ls
nfsinfo
[root@client nfsshare]#
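If the client should mount the share persistently across reboots, an /etc/fstab entry along the following lines can be used (a sketch; adjust the mount point and options to your environment):

# /etc/fstab entry on the client (uses the cluster VIP, not an individual node)
192.168.1.51:/    /mnt/nfsshare    nfs    defaults,_netdev    0 0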
For cluster testing, stop the cluster service on one of the nodes and see whether the nfsshare is still accessible. Let's assume I am going to stop the cluster service on "nfs1.example.com",
[root@nfs1 ~]# pcs cluster stop
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...
[root@nfs1 ~]#
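At this point, the cluster should fail the resources over to the surviving node. You can optionally confirm this on nfs2 before checking the client:

[root@nfs2 ~]# pcs status    # disk_fencing and the nfsgrp resources should now show "Started nfs2.example.com"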
Now go to the client machine and see whether the nfsshare is still accessible. In my case, I am still able to access it and able to create files on it.
[root@client nfsshare]# touch test
[root@client nfsshare]#
Now start the cluster service on "nfs1.example.com" again using the below command,
[root@nfs1 ~]# pcs cluster start
Starting Cluster...
[root@nfs1 ~]#
That's all from this article; it confirms that we have successfully configured NFS active-passive clustering using Pacemaker. Please do share your feedback and comments in the comments section below.
Read Also: Configure Two Node Squid Cluster using Pacemaker on CentOS 7 / RHEL 7
Amazing article! I haven’t had a chance to try it yet, but does it work with NFSv4?
Yes Tomas, it will work with NFSv4.
Thanks to this article, I’ve managed to get the whole NFSv4 pacemaker cluster deployed via Puppet.
One small thing, you did group the resources together, but you didn’t set any order. I had a problem where the nfsroot resource failed because the nfsshare was not yet available. I’ve configured ordering constraints to resolve it.
Hi Tomas
How exactly did you do that?
Hi,
Does this type of fence device work if I am using it in another scenario, such as an app cluster?
It does work regardless of the service that is clustered.
I just noticed that you use a shared disk on VirtualBox.
What happens when you actually try to fence a node manually? For example:
# pcs stonith fence nfs2
I cannot see anything here showing that you tested it, and that it worked. I don’t use VirtualBox therefore genuinely curious.
Hi Tomas,
The 'pcs stonith fence' command should fence the mentioned node.
Hello.
How can I apply this procedure to CIFS shares as well?
Greetings.
Great article! Exactly what I needed.
Just a question regarding resource configuration: how much memory and CPU would you dedicate to each cluster node, for almost 50 users using a shared disk of almost 500 GB?
Not able to mount, any suggestions?
I just added a disk as /dev/sdb, but when I issue the command ls -l /dev/disk/by-id I could not see a wwn entry (like wwn-0x6001405e49919dad5824dc2af5fb3ca0) related to sdb, so I could not configure it any further following your hints. Any hints you could provide?
I have a question: if I have one LUN that is an HA NFS across 3 cluster nodes, x, y and z, and we suppose the filesystem is mounted on server x, can we add an NFS mount point on servers y and z using the VIP?
So the LUN would be directly mounted on x and NFS-mounted on y and z.
If yes, then if server x goes down, will the LUN be mounted on y or z even though the NFS mount exists?
Using Pacemaker we usually configure an Active-Passive NFS cluster. All the services, including the VIP and the NFS LUN, will be available on the active node, let's say node x. If this node goes down for some reason, then all services (including the NFS LUN and VIP) will be migrated to either the y or z node.
I also ran into this with "pcs resource create nfsshare Filesystem device=/dev/sdb1 directory=/nfsshare fstype=xfs --group nfsgrp".
Because modern Linux can interchange device names, /dev/sdb1 and /dev/sdc1 got swapped on a reboot at some point, and I could not find whether pcs resource create will accept a UUID, so whether the NFS cluster can resume once a node is down is an unknown factor with this method.
Perfect article for FC SAN storage and a two physical node NFS cluster.
I've completed one of my projects with the help of this article.
Thank you 🙂
Could you please share any documentation for using Dell iDRAC for fencing?
Hi Pradeep,
I have got this error after applying step 4 (Define fencing device for each cluster node):
Error: Error: Agent ‘disk_fencing’ is not installed or does not provide valid metadata: Agent disk_fencing not found or does not support meta-data: Invalid argument (22)
Metadata query for stonith:disk_fencing failed: Input/output error
Hi,
I tried it, and everything went smoothly except the last step: mounting the NFS share on a client.
I could only mount it on nfs1.example.com, not on the other node.
"ip a" shows that the virtual floating IP is linked to the nfs1 node, not the nfs2 node, and it cannot be pinged from nfs2.
Any suggestions?
Hi,
I'm looking for RHEL HA with a KVM guest cluster.
Hi, thanks for the article. I'm not able to mount the NFS share using the VIP:
~]# mount 192.168.56.100:/ /mnt/nfsshare/
mount: wrong fs type, bad option, bad superblock on 192.168.56.100:/,
missing codepage or helper program, or other error
(for several filesystems (e.g. nfs, cifs) you might
need a /sbin/mount.<type> helper program)
In some cases useful info is found in syslog - try
dmesg | tail or so.
Here is the result of lsblk, sdb1 as fencing device and sdc1 shared storage:
~]# lsblk -f
NAME            FSTYPE      LABEL           UUID                                   MOUNTPOINT
sda
├─sda1          xfs                         a13287c6-e902-4162-8c96-fdbc42d487a0   /boot
└─sda2          LVM2_member                 M4KyTZ-1BPU-aWgE-8Um0-0gKf-lPDK-dDUdhS
  ├─centos-root xfs                         ddf9c038-feec-45e0-83fe-be5b62843499   /
  └─centos-swap swap                        15a46aa8-2cf1-4c94-ae6b-982b7a38c249   [SWAP]
sdb
└─sdb1          xfs                         800d010e-369c-4b33-a159-be1fab3064eb
sdc
└─sdc1          xfs                         1ea91760-f20a-4358-a51a-7fce3760b72e
sr0             iso9660     CentOS 7 x86_64 2019-09-11-18-50-31-00
Fixed with the installation of the nfs-utils package.
How are these shared disks created for the fencing and the nfs share? “In this tutorial we are using a shared disk of size 1 GB (/dev/sdc) as a fencing device.” How is this shared disk created / mounted?
Isn’t the shared disk still a single point of failure?
Hello.
Thanks for the article. It worked!! However, it did not work when I partitioned sdb1 as an LVM volume.
Best regards