Discussion:
[DRBD-user] drbd resyncing entire device after each reboot
Hanspeter Kunz
2018-10-05 20:02:27 UTC
Hi there,

I see a strange behavior on a freshly set up pair of machines (debian
stretch, drbd 8.4.7):

After each reboot, the whole drbd device is resynced from scratch, even
though both drbd devices report UpToDate before the reboot. I have never
seen this on my other drbd installations.

I just rebooted the secondary machine. After starting drbd, syslog shows
the following on that machine:

Oct 5 21:36:43 claire drbd[3578]: Starting DRBD resources:[
Oct 5 21:36:43 claire drbd[3578]: create res: nfs
Oct 5 21:36:43 claire drbd[3578]: prepare disk: nfs
Oct 5 21:36:43 claire kernel: [ 379.663592] drbd nfs: Starting worker thread (from drbdsetup-84 [3596])
Oct 5 21:36:43 claire kernel: [ 379.664004] block drbd0: disk( Diskless -> Attaching )
Oct 5 21:36:43 claire kernel: [ 379.664629] drbd nfs: Method to ensure write ordering: flush
Oct 5 21:36:43 claire kernel: [ 379.664634] block drbd0: max BIO size = 1048576
Oct 5 21:36:43 claire kernel: [ 379.664642] block drbd0: drbd_bm_resize called with capacity == 53685452728
Oct 5 21:36:43 claire kernel: [ 379.875816] block drbd0: resync bitmap: bits=6710681591 words=104854400 pages=204794
Oct 5 21:36:43 claire kernel: [ 379.875819] block drbd0: size = 25 TB (26842726364 KB)
Oct 5 21:36:44 claire drbd[3578]: adjust disk: nfs
Oct 5 21:36:44 claire kernel: [ 381.510770] block drbd0: recounting of set bits took additional 32 jiffies
Oct 5 21:36:44 claire kernel: [ 381.510772] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Oct 5 21:36:44 claire kernel: [ 381.510778] block drbd0: disk( Attaching -> UpToDate )
Oct 5 21:36:44 claire kernel: [ 381.510789] block drbd0: attached to UUIDs 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
Oct 5 21:36:44 claire drbd[3578]: adjust net: nfs
Oct 5 21:36:44 claire drbd[3578]: ]
Oct 5 21:36:44 claire kernel: [ 381.516705] drbd nfs: conn( StandAlone -> Unconnected )
Oct 5 21:36:44 claire kernel: [ 381.516756] drbd nfs: Starting receiver thread (from drbd_w_nfs [3598])
Oct 5 21:36:44 claire kernel: [ 381.516823] drbd nfs: receiver (re)started
Oct 5 21:36:44 claire kernel: [ 381.516883] drbd nfs: conn( Unconnected -> WFConnection )
Oct 5 21:36:45 claire kernel: [ 382.250879] drbd nfs: Handshake successful: Agreed network protocol version 101
Oct 5 21:36:45 claire kernel: [ 382.250884] drbd nfs: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
Oct 5 21:36:45 claire kernel: [ 382.251202] drbd nfs: Peer authenticated using 20 bytes HMAC
Oct 5 21:36:45 claire kernel: [ 382.251307] drbd nfs: conn( WFConnection -> WFReportParams )
Oct 5 21:36:45 claire kernel: [ 382.251366] drbd nfs: Starting ack_recv thread (from drbd_r_nfs [3607])
Oct 5 21:36:45 claire kernel: [ 382.310672] block drbd0: drbd_sync_handshake:
Oct 5 21:36:45 claire kernel: [ 382.310680] block drbd0: self 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7 bits:0 flags:0
Oct 5 21:36:45 claire kernel: [ 382.310687] block drbd0: peer 06D17ADE18B89143:0000000000000005:B6D88D552E97D8B7:B6D78D552E97D8B7 bits:0 flags:0
Oct 5 21:36:45 claire kernel: [ 382.310691] block drbd0: uuid_compare()=-2 by rule 20
Oct 5 21:36:45 claire kernel: [ 382.310696] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Oct 5 21:36:47 claire kernel: [ 383.728620] block drbd0: bitmap WRITE of 204794 pages took 1228 ms
Oct 5 21:36:47 claire kernel: [ 383.728626] block drbd0: 25 TB (6710681591 bits) marked out-of-sync by on disk bit-map.
Oct 5 21:36:47 claire kernel: [ 383.728693] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Oct 5 21:36:47 claire drbd[3578]: WARN: stdin/stdout is not a TTY; using /dev/console.
Oct 5 21:36:47 claire systemd[1]: Started LSB: Control DRBD resources..
Oct 5 21:36:47 claire kernel: [ 384.049775] block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
Oct 5 21:36:47 claire kernel: [ 384.145044] block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
Oct 5 21:36:47 claire kernel: [ 384.145049] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Oct 5 21:36:47 claire kernel: [ 384.275789] block drbd0: updated sync uuid 0001000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
Oct 5 21:36:47 claire kernel: [ 384.275945] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Oct 5 21:36:47 claire kernel: [ 384.279872] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Oct 5 21:36:47 claire kernel: [ 384.279905] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Oct 5 21:36:47 claire kernel: [ 384.279949] block drbd0: Began resync as SyncTarget (will sync 26842726364 KB [6710681591 bits set]).

Probably the explanation is simple, I just do not see it.

If you need the configuration (although it should be identical to
similar drbd configs which are working without problems) I am happy to
provide it.

Best and many thanks if anybody could shed some light on this,
Hp
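
The bitmap figures in the log above can be cross-checked with a little arithmetic. This is only a sketch: the 512-byte sector, 4 KiB per bitmap bit, 64-bit word and 4 KiB page sizes are assumptions chosen because they reproduce the logged numbers, and the function names are made up for illustration.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def bitmap_geometry(capacity_sectors, sector_bytes=512, kib_per_bit=4,
                    word_bits=64, page_bytes=4096):
    """Derive (size_kib, bits, words, pages) from a capacity in sectors.

    All defaults are assumptions picked to match the log, not values
    read from DRBD itself.
    """
    size_kib = capacity_sectors * sector_bytes // 1024
    bits = ceil_div(size_kib, kib_per_bit)        # one bit per 4 KiB block
    words = ceil_div(bits, word_bits)             # bitmap stored in 64-bit words
    pages = ceil_div(words * (word_bits // 8), page_bytes)
    return size_kib, bits, words, pages


# Capacity taken from "drbd_bm_resize called with capacity == 53685452728".
size_kib, bits, words, pages = bitmap_geometry(53685452728)
print(size_kib, bits, words, pages)
# prints: 26842726364 6710681591 104854400 204794
```

These match the "size", "bits", "words" and "pages" values in the log, so "6710681591 bits set" at the start of the resync really does mean every block of the 25 TB device was marked out-of-sync.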
Digimer
2018-10-06 04:02:58 UTC
Can you share your config? Are you using thin LVM?

Also, 8.4.7 is _ancient_. Nearly countless bug fixes since then, which
may or may not relate. In any case, updating is _strongly_ recommended.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Hanspeter Kunz
2018-10-08 07:52:13 UTC
Post by Digimer
Can you share your config? Are you using thin LVM?
this is my config as reported by "drbdsetup show":

resource nfs {
    options {
    }
    net {
        max-buffers 131072;
        cram-hmac-alg "sha1";
        shared-secret "REMOVED";
        verify-alg "sha1";
    }
    _remote_host {
        address ipv4 192.168.3.182:7788;
    }
    _this_host {
        address ipv4 192.168.3.181:7788;
        volume 0 {
            device minor 0;
            disk "/dev/storage/nfs";
            meta-disk internal;
            disk {
                resync-rate 122880k; # bytes/second
                al-extents 3389;
                c-fill-target 40960s; # bytes
                c-max-rate 4096000k; # bytes/second
                c-min-rate 81920k; # bytes/second
            }
        }
    }
}

this is the volume information for /dev/storage/nfs

lvdisplay /dev/storage/nfs
  --- Logical volume ---
  LV Path                /dev/storage/nfs
  LV Name                nfs
  VG Name                storage
  LV UUID                TcncF5-uhtd-d9ea-C1fO-cu4U-eo06-2Y0UCq
  LV Write Access        read/write
  LV Creation host, time claris, 2018-09-27 14:28:14 +0200
  LV Status              available
  # open                 2
  LV Size                25.00 TiB
  Current LE             6553600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
Post by Digimer
Also, 8.4.7 is _ancient_. Nearly countless bug fixes since then, which
may or may not relate. In any case, updating is _strongly_
recommended.
OK, I might give this a try (right now I use what is shipped with
Debian stable). Remember, I have more or less exactly the same setup
running on quite a few other machines (for many years) without
problems, so I do not think that updating will solve the above problem.

Many thanks,
Hp
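
For what it's worth, the lvdisplay output and the DRBD size from the kernel log can be compared directly. A rough sketch, assuming the LVM default extent size of 4 MiB (the extent size is not shown in the output above, so this is an assumption) and treating DRBD's "KB" as KiB; the function name is hypothetical.

```python
def lv_size_kib(extents, extent_mib=4):
    """Size of a logical volume in KiB; extent_mib=4 is an assumed default."""
    return extents * extent_mib * 1024


lv_kib = lv_size_kib(6553600)     # "Current LE" from lvdisplay
drbd_kib = 26842726364            # "size = 25 TB (26842726364 KB)" from the log

print(lv_kib / 1024**3)           # 25.0 (TiB, matches "LV Size 25.00 TiB")
print(lv_kib - drbd_kib)          # 819236 (KiB not exposed by the drbd device)
```

The difference of roughly 800 MiB is close to the 204794 bitmap pages (819176 KiB) reported in the log, which would be consistent with "meta-disk internal" reserving space at the end of the backing device.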
Hanspeter Kunz
2018-10-08 08:45:36 UTC
I just switched the drbd primaries and rebooted the new secondary. That
worked. Now drbd starts up as expected (without resyncing the whole
device) after rebooting either machine, even after switching the primary
back to the first machine or with both nodes secondary.

Although it seems to work as expected now, I would still be interested
in knowing what might have caused this (and why switching primaries
apparently repaired it), if anybody has an idea.

Best and many thanks,
Hp
--
Hanspeter Kunz University of Zurich
Systems Administrator Department of Informatics
Email: ***@ifi.uzh.ch Binzmühlestrasse 14
Tel: +41.(0)44.63-56714 Office 2.E.07
http://www.ifi.uzh.ch CH-8050 Zurich, Switzerland

Spamtraps: ***@ailab.ch ***@ifi.uzh.ch
---
A word to the wise is enough.
-- Miguel de Cervantes
Robert Altnoeder
2018-10-08 09:07:55 UTC
Post by Hanspeter Kunz
Oct 5 21:36:44 claire kernel: [ 381.510789] block drbd0: attached to UUIDs 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
Oct 5 21:36:45 claire kernel: [ 382.310680] block drbd0: self 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7 bits:0 flags:0
A current UUID of 4 indicates freshly created DRBD meta data (in other
words, someone ran 'drbdadm create-md' on it)
Post by Hanspeter Kunz
Oct 5 21:36:47 claire kernel: [ 384.275789] block drbd0: updated sync uuid 0001000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
...because that's just 1 bit away from a value of 4 again, which seems
unlikely if this number was generated randomly.
On the other hand, this may be an intentionally chosen value that I
don't know about. Will have to check.

br,
Robert
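
The observations above can be reproduced from the logged UUID sets. A minimal sketch: the strings are copied from the log lines in this thread, the constant name `JUST_CREATED` and the helper names are illustrative, and treating the first field as the current UUID follows the wording of the post above.

```python
JUST_CREATED = 4  # value described above as "freshly created meta data"; name is illustrative


def parse_uuids(field):
    """Turn a colon-separated UUID set from the log into a list of ints."""
    return [int(part, 16) for part in field.split(":")]


def bit_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


self_uuids = parse_uuids(
    "0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7")
peer_uuids = parse_uuids(
    "06D17ADE18B89143:0000000000000005:B6D88D552E97D8B7:B6D78D552E97D8B7")
sync_uuids = parse_uuids(
    "0001000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7")

print(self_uuids[0] == JUST_CREATED)               # True: rebooted node reports 4
print(peer_uuids[0] == JUST_CREATED)               # False: peer has a regular UUID
print(bit_distance(sync_uuids[0], JUST_CREATED))   # 1: sync UUID is one bit away from 4
```

So the rebooted node presented a current UUID of 4 while the peer did not, and the log shows "uuid_compare()=-2 by rule 20" followed by a full sync right after that comparison.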
