Discussion:
[DRBD-user] Configuring a two-node cluster with redundant nics on each node?
Bryan K. Walton
2018-10-17 17:07:36 UTC
Hi,

I'm trying to configure a two-node cluster, where each node has
dedicated redundant nics:

storage node 1 has two private IPs:
10.40.1.3
10.40.2.2

storage node 2 has two private IPs:
10.40.1.2
10.40.2.3

I'd like to configure the resource so that the nodes have two possible
paths to the other node. I've tried this:

resource r0 {
  on storage1 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage1-storage;
    address   10.40.2.2:7789;
    address   10.40.1.3:7789;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage2-storage;
    address   10.40.1.2:7789;
    address   10.40.2.3:7789;
    meta-disk internal;
  }
}

But this doesn't work. When I try to create the device metadata, I get
the following error:

drbd.d/r0.res:6: conflicting use of address statement
'r0:storage1:address' ...
drbd.d/r0.res:5: address statement 'r0:storage1:address' first used
here.

Clearly, my configuration won't work as written. Is there a way to
accomplish what I'm after?

Thanks,
Bryan Walton
Adi Pircalabu
2018-10-18 05:47:53 UTC
Post by Bryan K. Walton
Hi,
I'm trying to configure a two-node cluster, where each node has
10.40.1.3
10.40.2.2
10.40.1.2
10.40.2.3
I'd like to configure the resource so that the nodes have two possible
resource r0 {
  on storage1 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage1-storage;
    address 10.40.2.2:7789;
    address 10.40.1.3:7789;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage2-storage;
    address 10.40.1.2:7789;
    address 10.40.2.3:7789;
    meta-disk internal;
  }
}
But this doesn't work. When I try to create the device metadata, I get
drbd.d/r0.res:6: conflicting use of address statement
'r0:storage1:address' ...
drbd.d/r0.res:5: address statement 'r0:storage1:address' first used
here.
Clearly, my configuration won't work as written. Is there a way to
accomplish what I'm after?
Why aren't you using Ethernet bonding?
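
With bonding, each node exposes a single IP on the bond, and DRBD then only
needs the one address statement per host that its parser insists on in the
error above. A minimal sketch, assuming the bonds end up carrying the
10.40.1.x addresses you already listed (use whatever IPs the bonds actually
get):

resource r0 {
  on storage1 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage1-storage;
    address   10.40.1.3:7789;   # single bond IP, one address statement per host
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage2-storage;
    address   10.40.1.2:7789;   # single bond IP
    meta-disk internal;
  }
}

The path redundancy then lives entirely in the bond rather than in DRBD.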
--
Adi Pircalabu
Bryan K. Walton
2018-10-19 15:16:02 UTC
Post by Adi Pircalabu
Why aren't you using Ethernet bonding?
Thanks Adi,

We are rethinking our network configuration. We may do our replication
through a directly cabled and bonded connection, and bypass our
switches. This would simplify our drbd configuration.


Thanks!
Bryan
digimer
2018-10-19 16:56:53 UTC
Post by Bryan K. Walton
Post by Adi Pircalabu
Why aren't you using Ethernet bonding?
Thanks Adi,
We are rethinking our network configuration. We may do our replication
through a directly cabled and bonded connection, and bypass our
switches. This would simplify our drbd configuration.
We've used mode=1 (active-passive) bonding under DRBD for years across
numerous installs and countless test and prod failures without issue. We
still run through switches (link1 to switch 1, link2 to switch 2,
hitless failover config on the stack). We can and have failed NICs,
cables and switches without interruption.

We've documented this setup here:

https://www.alteeve.com/w/Build_an_m2_Anvil!#Logical_Map.3B_Hardware_And_Plumbing
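
For reference, a minimal sketch of what such a mode=1 bond can look like with
CentOS network-scripts (eth0/eth1 are placeholders and the address is just one
of the IPs from Bryan's example; adjust to your hardware and subnet):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.40.1.3
PREFIX=24
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 (and the same for eth1)
DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

With link1 on switch 1 and link2 on switch 2, losing a NIC, cable or switch
just shifts traffic to the remaining slave.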

digimer
Adi Pircalabu
2018-10-21 23:43:03 UTC
Post by Bryan K. Walton
Post by Adi Pircalabu
Why aren't you using Ethernet bonding?
Thanks Adi,
We are rethinking our network configuration. We may do our replication
through a directly cabled and bonded connection, and bypass our
switches. This would simplify our drbd configuration.
Bryan,
You can use active/passive bonding as already suggested, or even LACP if
your switches support it. For the replication link, though, if you've
only two nodes, taking the switch(es) out of the mix and going back to
back with LACP is a sensible option, even if only to remove the switch
as a point of failure.
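
If you do go the LACP route, only the bonding options really change compared
to active/passive; something along these lines (assuming the switch ports, or
the directly cabled peer, are configured for 802.3ad):

BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"

The single IP on the bond and the single address statement per host in the
DRBD resource stay exactly the same.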
--
Adi Pircalabu
Lars Ellenberg
2018-10-24 18:17:04 UTC
Post by Adi Pircalabu
Post by Bryan K. Walton
Post by Adi Pircalabu
Why aren't you using Ethernet bonding?
Thanks Adi,
We are rethinking our network configuration. We may do our replication
through a directly cabled and bonded connection, and bypass our
switches. This would simplify our drbd configuration.
Bryan,
You can use active/passive bonding as already suggested, or even LACP if your
switches support it. For the replication link, though, if you've only two
nodes, taking the switch(es) out of the mix and going back to back with LACP
is a sensible option, even if only to remove the switch as a point of
failure.
I know you know, but for the record:
if this is not only about redundancy, but also about increasing
bandwidth while all links are operational,
LACP does not increase bandwidth for a single TCP flow.
"bonding round robin" is the only mode that does.
Just saying.
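
So if extra single-flow throughput is actually part of the goal, a dedicated
back-to-back replication link with round-robin bonding is the variant to look
at; roughly the same ifcfg layout as sketched earlier, with only the options
changed:

BONDING_OPTS="mode=balance-rr miimon=100"

Through a switch this mode generally needs static link aggregation configured
on the ports, and it can reorder packets, which is one more reason to keep it
on a direct cable if you use it at all.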
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed