QNAP Thin Pool Repair

Stop all services and unmount all volumes

Step 1: Stop all services

/etc/init.d/services.sh stop

/etc/init.d/qsnapman.sh stop

/sbin/daemon_mgr lvmetad stop "/sbin/lvmetad"

rm /var/run/lvm/lvmetad.socket
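As a quick sanity check (a hedged sketch, not part of the official QNAP procedure), you can confirm that lvmetad is really gone before continuing; both commands should produce no output:

ps | grep lvmetad | grep -v grep # no output means lvmetad is not running

ls /var/run/lvm/lvmetad.socket 2>/dev/null # no output means the socket was removed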

Step 2: Confirm which volume group or pool needs repair

pvs # list all physical volumes and their volume groups

lvs -a # list all volumes, pools, and their volume groups

lvs -o+time # list the creation date/time of volumes and pools

lvs -o+thin_id # list the thin device IDs of volumes
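For reference, the same information can be requested in a single call (a sketch using standard LVM report fields; replace vg1 with the volume group on your system):

# Sketch: one report with name, attributes, size, pool, origin, usage, creation time and thin id
lvs -a -o lv_name,vg_name,lv_attr,lv_size,pool_lv,origin,data_percent,time,thin_id vg1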

Check the volume groups using the "pvs" command:

PV          VG   Fmt   Attr  PSize  PFree
/dev/drbd1  vg1  lvm2  a--   7.24t      0  # /dev/drbd1 indicates md1 (RAID group 1)
/dev/drbd2  vg2  lvm2  a--   7.26t      0  # /dev/drbd2 indicates md2 (RAID group 2)
...

# One volume group may include two or more RAID groups
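Before repairing anything, it can also be worth confirming that the underlying RAID groups themselves are healthy (a hedged sketch; /proc/mdstat is the standard Linux software-RAID status file):

cat /proc/mdstat # md1, md2, ... should show all members present, with no failed or rebuilding devices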

Check the devices in the volume group/pool that needs repair using the "lvs -a" command:

LV         VG   Attr        LSize    Pool  Origin  Data%   Meta%  Move Log Cpy%Sync Convert
lv1        vg1  Vwi-aot---    6.00t  tp1           100.00                                    # Data volume 1
lv1312     vg1  -wi-ao----  756.00m                                                          # Snapshots pool
lv2        vg1  Vwi-aot---  200.00g  tp1           100.00                                    # Data volume 2
lv544      vg1  -wi-------   74.14g                                                          # Reserved for temporary repair use
snap10001  vg1  Vwi-aot---    6.00t  tp1   lv1     100.00                                    # Snapshot volume
snap10002  vg1  Vwi-aot---    6.00t  tp1   lv1     100.00                                    # Snapshot volume
...

Step 3: Assuming vg1 is the volume group that needs repair, unmount the volumes that belong to vg1 (lv1 is mounted at /share/CACHEDEV1_DATA, and so on).

If any mounted volumes or snapshots belong to the volume group that needs repair, unmount them:

# unmount all data volumes
umount /share/CACHEDEV1_DATA

umount /share/CACHEDEV2_DATA

.

.

umount /share/CACHEDEV${#}_DATA

# unmount the snapshots
umount /dev/mapper/vg1-snap*

If a volume cannot be unmounted, run "lsof /share/CACHEDEV${#}_DATA" to find the processes holding it open, terminate them with "kill -9", and then try to unmount the volume again.
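As an illustration only (a hedged sketch assuming a POSIX shell and an lsof build that supports -t for printing bare PIDs), the unmount-and-kill loop could look like this:

# Sketch: try to unmount every data volume; if one is busy, kill the processes holding it open and retry
for mnt in /share/CACHEDEV*_DATA; do
    grep -q " $mnt " /proc/mounts || continue # skip paths that are not mounted
    umount "$mnt" && continue                 # plain unmount succeeded
    pids=$(lsof -t "$mnt")                    # PIDs still holding the mount open
    [ -n "$pids" ] && kill -9 $pids
    umount "$mnt"                             # retry the unmount
done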

Remove all cachedev devices and deactivate the volume group

Step 1: List the cache devices on the pool:

ls -l /dev/mapper/ # lists all device-mapper devices, including those on the pool

Step 2: The output of the above command looks like the listing below (assuming vg1 is the volume group that needs repair):

brw------- 1 admin administrators 253,   9 2020-02-15 10:16 cachedev1
brw------- 1 admin administrators 253,  50 2020-02-15 10:16 cachedev2
crw------- 1 admin administrators  10, 236 2020-02-15 18:14 control
brw------- 1 admin administrators 253,   7 2020-02-15 10:16 vg1-lv1
brw------- 1 admin administrators 253,   8 2020-02-15 10:16 vg1-lv1312
brw------- 1 admin administrators 253,  10 2020-02-15 10:16 vg1-lv2
brw------- 1 admin administrators 253,  11 2020-02-18 01:00 vg1-snap10001
brw------- 1 admin administrators 253,  13 2020-02-19 01:00 vg1-snap10002
brw------- 1 admin administrators 253,  15 2020-02-20 01:00 vg1-snap10003
brw------- 1 admin administrators 253,  17 2020-02-21 01:00 vg1-snap10004
brw------- 1 admin administrators 253,  19 2020-02-22 01:00 vg1-snap10005
...

Step 3: Find the cachedev${#} devices and remove them using dmsetup:

dmsetup remove cachedev1

dmsetup remove cachedev2

.

.

.

dmsetup remove cachedev${#}
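If there are many cache devices, a loop can remove them all in one pass (a hedged sketch; it assumes every device-mapper entry whose name starts with cachedev should be removed):

# Sketch: remove every device-mapper entry named cachedev*
for dev in $(dmsetup ls | awk '$1 ~ /^cachedev/ {print $1}'); do
    dmsetup remove "$dev"
done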

Step 4: Deactivate the volume group that needs repair:

lvchange -an vg1 # assuming vg1 is the volume group that needs repair

Step 5: Check again with "ls -l /dev/mapper" to confirm that no block devices of vg1 remain. The result should be the following:

crw------- 1 admin administrators 10, 236 2020-02-15 18:14 control
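An equivalent check (a sketch, assuming vg1 is the volume group being repaired) is to ask device-mapper directly; the command should print nothing once the group is fully deactivated:

dmsetup ls | grep '^vg1-' # any output here means a vg1 device is still active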

Collect thin pool metadata logs and back up the thin pool metadata

Step 1: Download the collection tool:

wget http://download.qnap.com/Storage/tsd/utility/tp_collect.sh

Step 2: Run the collection tool to gather the metadata logs and back up the pool metadata:

sh tp_collect.sh

When tp_collect.sh starts running, enter the "pool id" when prompted. The pool id is the id of the pool we want to repair: for vg1/tp1 enter 1, for vg2/tp2 enter 2, and so on.

If tp_collect.sh fails, have the customer plug a USB external drive (about 100 GB or more) into the NAS and back up the metadata manually:

# assuming vg1/tp1 is the pool to collect or repair
lvchange -ay vg1/tp1_tmeta

# change directory to the USB external device
cd /share/external/DEVXXXXXX

# backup metadata using dd command

dd if=/dev/mapper/vg1-tp1_tmeta of=tmeta.bin bs=1M
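To confirm the copy is complete, a checksum comparison can be run (a hedged sketch; because dd copies the whole metadata device, the two checksums should be identical):

md5sum /dev/mapper/vg1-tp1_tmeta tmeta.bin # the two checksums must match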

If tp_collect.sh completes successfully, you can skip the manual backup above; just be sure to keep a copy of collect.tgz.

Step 3: Confirm that the metadata backup is correct

# A. thin_check the original metadata (use pdata_tools_8192 or pdata_tools_4096, as applicable)
pdata_tools_8192 thin_check /dev/mapper/vg1-tp1_tmeta

# B. thin_check the backup metadata
pdata_tools_8192 thin_check /mnt/collect/tmeta.bin

# If the metadata was backed up to the USB external drive, check that copy instead:
pdata_tools_8192 thin_check /share/external/DEVXXXXXX/tmeta.bin

The results of A and B must be identical. If they are, be sure to back up tmeta.bin to another storage device or USB external drive! If the repair of vg{#}/tp{#} fails, vg{#}-tp{#}_tmeta can be restored from tmeta.bin.
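One way to compare results A and B (a sketch, assuming the vg1/tp1 names and paths used above) is to capture both outputs and diff them:

# Sketch: capture both thin_check results and compare them; no diff output means they match
pdata_tools_8192 thin_check /dev/mapper/vg1-tp1_tmeta > /tmp/check_orig.txt 2>&1
pdata_tools_8192 thin_check /mnt/collect/tmeta.bin > /tmp/check_backup.txt 2>&1
diff /tmp/check_orig.txt /tmp/check_backup.txt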

 
