System Administrators' Forum
IT => Virtualization => Topic started by: Gekko on January 13, 2021, 15:15:39
-
XCP-ng 8.1
I can't figure out what happened. Sometime Monday night into Tuesday the XCP-ng host went down, and after that crash it booted back up, but one VM was gone entirely and several other VMs had lost their virtual disks...
crit.log has these entries from around the time of the crash:
Jan 11 21:04:39 Xen3 kernel: [ 28.535558] EXT4-fs error (device sda1): ext4_mb_generate_buddy:747: group 54, block bitmap and bg descriptor inconsistent: 29506 vs 29507 free clusters
Jan 11 21:04:40 Xen3 kernel: [ 30.006183] EXT4-fs error (device sda1): ext4_mb_generate_buddy:747: group 0, block bitmap and bg descriptor inconsistent: 20178 vs 20418 free clusters
Jan 11 21:19:12 Xen3 kernel: [ 901.356525] EXT4-fs error (device sda1): ext4_lookup:1578: inode #196672: comm systemd-tmpfile: deleted inode referenced: 205163
Jan 11 21:19:12 Xen3 kernel: [ 901.373527] EXT4-fs error (device sda1): ext4_lookup:1578: inode #196672: comm systemd-tmpfile: deleted inode referenced: 205161
And in kernel.log, these lines show up among the boot messages when the server comes back up:
Jan 11 21:04:37 Xen3 kernel: [ 24.557735] FAT-fs (sda4): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
Jan 11 21:04:37 Xen3 kernel: [ 25.182511] EXT4-fs (sda5): mounting ext3 file system using the ext4 subsystem
Jan 11 21:04:37 Xen3 kernel: [ 25.195015] EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)
...
Jan 11 21:04:39 Xen3 kernel: [ 28.535558] EXT4-fs error (device sda1): ext4_mb_generate_buddy:747: group 54, block bitmap and bg descriptor inconsistent: 29506 vs 29507 free clusters
Jan 11 21:04:39 Xen3 kernel: [ 28.672187] JBD2: Spotted dirty metadata buffer (dev = sda1, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Jan 11 21:04:40 Xen3 kernel: [ 30.006183] EXT4-fs error (device sda1): ext4_mb_generate_buddy:747: group 0, block bitmap and bg descriptor inconsistent: 20178 vs 20418 free clusters
Jan 11 21:04:40 Xen3 kernel: [ 30.298620] JBD2: Spotted dirty metadata buffer (dev = sda1, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
...
Jan 11 21:06:58 Xen3 kernel: [ 167.910051] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)
...
Jan 11 21:07:05 Xen3 kernel: [ 174.878426] CIFS VFS: Error connecting to socket. Aborting operation.
Jan 11 21:07:05 Xen3 kernel: [ 174.878455] CIFS VFS: cifs_mount failed w/return code = -113
Jan 11 21:07:11 Xen3 kernel: [ 181.022372] CIFS VFS: Error connecting to socket. Aborting operation.
Jan 11 21:07:11 Xen3 kernel: [ 181.022401] CIFS VFS: cifs_mount failed w/return code = -113
...
Jan 11 21:09:37 Xen3 kernel: [ 326.877130] EXT4-fs (sda1): error count since last fsck: 343
Jan 11 21:09:37 Xen3 kernel: [ 326.877168] EXT4-fs (sda1): initial error at time 1600423846: ext4_lookup:1578: inode 196609
Jan 11 21:09:37 Xen3 kernel: [ 326.877190] EXT4-fs (sda1): last error at time 1610388280: ext4_mb_generate_buddy:747
...
Jan 11 21:19:12 Xen3 kernel: [ 901.356525] EXT4-fs error (device sda1): ext4_lookup:1578: inode #196672: comm systemd-tmpfile: deleted inode referenced: 205163
Jan 11 21:19:12 Xen3 kernel: [ 901.373527] EXT4-fs error (device sda1): ext4_lookup:1578: inode #196672: comm systemd-tmpfile: deleted inode referenced: 205161
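As an aside, the epoch timestamps in the `error count since last fsck` lines above can be decoded with GNU `date` — and they show sda1 had been accumulating EXT4 errors since September 2020, months before this crash:

```shell
# Decode the epoch timestamps reported by the kernel for sda1
# (values taken verbatim from the "initial error"/"last error" lines above).
date -u -d @1600423846 +'%Y-%m-%d %H:%M:%S UTC'   # initial error -> 2020-09-18
date -u -d @1610388280 +'%Y-%m-%d %H:%M:%S UTC'   # last error    -> 2021-01-11
```

So the 343 errors did not all come from this one crash; the filesystem had been flagging inconsistencies for months without an fsck pass.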
-
And here's what my two RAID arrays look like now, after the crash:
[11:46 xen3 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 1.8T 0 disk
├─sdb4 8:20 0 110G 0 part
├─sdb2 8:18 0 200G 0 part
├─sdb5 8:21 0 293G 0 part
├─sdb3 8:19 0 100G 0 part
├─sdb1 8:17 0 100G 0 part
├─sdb6 8:22 0 1T 0 part
└─3600508b1001cdf306e8126a684c7ff89 253:0 0 1.8T 0 mpath
├─3600508b1001cdf306e8126a684c7ff89p1 253:1 0 100G 0 part
├─3600508b1001cdf306e8126a684c7ff89p6 253:6 0 1T 0 part
├─3600508b1001cdf306e8126a684c7ff89p4 253:4 0 110G 0 part
├─3600508b1001cdf306e8126a684c7ff89p2 253:2 0 200G 0 part
├─3600508b1001cdf306e8126a684c7ff89p5 253:5 0 293G 0 part
└─3600508b1001cdf306e8126a684c7ff89p3 253:3 0 100G 0 part
tdc 254:2 0 8G 0 disk
tda 254:0 0 1.8T 0 disk
sda 8:0 0 1.8T 0 disk
├─sda4 8:4 0 512M 0 part /boot/efi
├─sda2 8:2 0 18G 0 part
├─sda5 8:5 0 4G 0 part /var/log
├─sda3 8:3 0 1.8T 0 part
│ └─XSLocalEXT--b2850589--ece5--cc2b--5815--7ce4f571afd8-b2850589--ece5--cc2b--5815--7ce4f571afd8
253:7 0 1.8T 0 lvm /run/sr-mount/b2850589-ece5-cc2b-5815-7ce4f571afd8
├─sda1 8:1 0 18G 0 part /
└─sda6 8:6 0 1G 0 part [SWAP]
tdd 254:3 0 30G 0 disk
tdb 254:1 0 20G 0 disk
Although the array itself seems to be alive:
[09:16 xen3 ~]# hpssacli
HPE Smart Storage Administrator CLI 2.40.13.0
Detecting Controllers...Done.
Type "help" for a list of supported commands.
Type "exit" to close the console.
=> ctrl all show config
Smart HBA H240 in Slot 3 (RAID Mode) (sn: PDPACTBRH711C1)
Port Name: 2I
Port Name: 1I
Internal Drive Cage at Port 1I, Box 2, OK
Internal Drive Cage at Port 2I, Box 3, OK
array A (SATA, Unused Space: 0 MB)
logicaldrive 1 (1.8 TB, RAID 1, OK)
physicaldrive 1I:2:1 (port 1I:box 2:bay 1, SATA, 2 TB, OK)
physicaldrive 1I:2:2 (port 1I:box 2:bay 2, SATA, 2 TB, OK)
array B (SATA, Unused Space: 0 MB)
logicaldrive 2 (1.8 TB, RAID 1, OK)
physicaldrive 2I:3:1 (port 2I:box 3:bay 1, SATA, 2 TB, OK)
physicaldrive 2I:3:2 (port 2I:box 3:bay 2, SATA, 2 TB, OK)
=>
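Given the `error count since last fsck: 343` counter and the kernel's explicit `Please run fsck` message, the root filesystem on sda1 has accumulated metadata damage and needs an offline check. Since sda1 is the live root FS, that means booting the XCP-ng installer ISO in rescue mode first. A minimal sketch of the e2fsck invocation, demonstrated here on a throwaway loopback image so it can run safely (the image path is illustrative; on the real host the same flags would be applied to /dev/sda1 while it is unmounted):

```shell
# Create a scratch ext4 image to demonstrate the check on; on the real host
# the e2fsck line below would target the unmounted /dev/sda1 instead.
dd if=/dev/zero of=/tmp/demo-root.img bs=1M count=16 status=none
mkfs.ext4 -q -F /tmp/demo-root.img    # -F: allow a regular file as the target
e2fsck -f -y /tmp/demo-root.img       # -f: force check, -y: auto-answer yes to fixes
echo "e2fsck exit code: $?"           # 0 = clean, 1 = errors were corrected
```

After a clean e2fsck pass it's also worth re-scanning the local SR from the XCP-ng side to see whether the missing VDIs reappear.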