In my last post, I showed that DRBD can be used diskless, which effectively does the same as exposing a disk over iSCSI. However, DRBD can do more than act as an iSCSI-like target; its best-known feature is replicating disks over a network.
This post looks into mounting a DRBD device diskless and testing its reliability when one of the two backing nodes fails, among other scenarios.
I started by mounting the DRBD disk on node 3, the diskless node. If you run drbdadm status, it should show the following:
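The exact output isn't reproduced here, but on a diskless primary, drbdadm status looks roughly like this (the resource name test-disk and the peer node names are assumptions):

```
test-disk role:Primary
  disk:Diskless
  node1 role:Secondary
    peer-disk:UpToDate
  node2 role:Secondary
    peer-disk:UpToDate
```

The key line is disk:Diskless — node 3 holds no local copy of the data, so every read and write goes over the network to the two backing nodes.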
After mounting it, I created a small test file and installed pv. I started writing the test file slowly to the disk; for now, we don't want to overload the disk or fill it up too quickly while performing the reliability tests.
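A sketch of that slow write, assuming the DRBD device is mounted at /mnt/test (the file paths and the 5 MB/s rate are illustrative):

```shell
# Create a ~1 GB test file of random data.
dd if=/dev/urandom of=/root/testfile bs=1M count=1024

# Stream it to the mounted DRBD device at a limited rate;
# pv's -L flag caps throughput (here 5 MB/s) and shows progress.
pv -L 5m /root/testfile > /mnt/test/testfile
```

Rate-limiting with pv keeps the write running long enough to reboot nodes mid-transfer and observe what happens.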
I gave node 1 a shutdown command to test reliability under normal circumstances. After it came back, I gave node 2 a shutdown command.
During the reboot of DRBD node 1, the writes on DRBD node 3 paused briefly but resumed soon after.
When applying more pressure using a 3 GB test file at unlimited speed, the disk of the rebooted server became inconsistent and needed a resync.
So DRBD seems to be very stable when the servers are rebooted gracefully. But what happens if we reboot both at the same time?
It doesn't like that. Still, the data seems to be intact up to the point where writes stopped. Of course, you don't want this to happen, but at least the disks remain mountable and readable.
What if the network starts flapping?
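One way to simulate a flapping link (an assumption about how this could be tested; the interface name is also assumed) is to repeatedly toggle the replication interface on one of the backing nodes:

```shell
# On node 1: bring the replication link down and up a few times
# while node 3 keeps writing.
for i in 1 2 3 4 5; do
    ip link set eth0 down
    sleep 5
    ip link set eth0 up
    sleep 10
done
```

While the link is down, node 3 still has the other backing node to write to, which is why the flapping goes largely unnoticed.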
Writing to the disk was just as fast as when both nodes were available.
What if we have a degraded network connection that only allows 10 Mbps?
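A sketch of how such a limit might be imposed with tc's token bucket filter (the interface name is an assumption; run this on the node being throttled):

```shell
# Throttle outgoing traffic on the replication interface to 10 Mbit/s.
tc qdisc add dev eth0 root tbf rate 10mbit burst 32kbit latency 400ms

# Remove the limit again when done.
tc qdisc del dev eth0 root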
However, the limit only seems to apply to outgoing traffic, not incoming traffic. While reading from the disk, both nodes are limited to 10 Mbps if one of them is.
When taking the DRBD resource "test-disk" down on node 1, the read speed of node 2 became unlimited again.
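Taking a resource down and bringing it back is a standard drbdadm operation (the resource name test-disk comes from this setup):

```shell
# On node 1: stop serving the resource, so node 3 reads only from node 2.
drbdadm down test-disk

# Bring it back; DRBD will resync any blocks written in the meantime.
drbdadm up test-disk
```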
It is interesting to see that DRBD balances reads across both nodes.
What if a node panics during writes?
Let's reset DRBD node 2 while writing at full speed.
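One way to do this is the kernel's magic SysRq trigger, which reboots the machine immediately without syncing or unmounting anything, much like a crash:

```shell
# On node 2: hard-reset the machine, skipping any clean shutdown.
echo 1 > /proc/sys/kernel/sysrq   # make sure SysRq is enabled
echo b > /proc/sysrq-trigger      # immediate reboot, no sync, no unmount
```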
Aside from a short hiccup, we don't notice anything once DRBD declares node 2 unavailable.
DRBD has shown itself to be very stable. Rebooting or resetting DRBD nodes results in a short hiccup, after which everything continues to work just fine. I haven't yet figured out why limiting one node's network bandwidth limits the read speed of both nodes; I'd like to see reads balanced based on the congestion of the network instead. In the next DRBD post, I hope to look at LINSTOR.