How to expand the BlueFS DB device of a Ceph BlueStore OSD
In this example we move and expand the BlueStore block.db and block.wal devices of an OSD.
Only for Luminous 12.2.11 and newer! The ceph-bluestore-tool shipped with earlier releases corrupts OSDs on expand.
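If in doubt about the installed release, check it before touching anything; ceph-osd --version prints the version of the local OSD binary (a quick sanity check, not part of the original procedure):
[k0ste@ceph-osd5]$ ceph-osd --version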
1. Get the partition numbers of your NVMe device via ceph-disk and look at the BlueStore metadata.
[k0ste@ceph-osd5]$ sudo ceph-disk list /dev/sdl
/dev/sdl :
/dev/sdl1 ceph data, active, cluster ceph, osd.73, block /dev/sdl2, block.db /dev/nvme2n1p13, block.wal /dev/nvme2n1p14
/dev/sdl2 ceph block, for /dev/sdl1
[k0ste@ceph-osd5]$ sudo ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-73
infering bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-73/block": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 6001069199360,
        "btime": "2018-06-28 14:44:38.392910",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "5532c4fd-60db-43ff-af9a-c4eb8523382b",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "73"
    },
    "/var/lib/ceph/osd/ceph-73/block.wal": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 1073741824,
        "btime": "2018-06-28 14:44:38.432863",
        "description": "bluefs wal"
    },
    "/var/lib/ceph/osd/ceph-73/block.db": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 30064771072,
        "btime": "2018-06-28 14:44:38.432187",
        "description": "bluefs db"
    }
}
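The label sizes are reported in bytes. A quick conversion with numfmt (from GNU coreutils, assumed available; not part of the original procedure) confirms the current block.db is 28 GiB and block.wal is 1 GiB:
[k0ste@ceph-osd5]$ echo 30064771072 | numfmt --to=iec
28G
[k0ste@ceph-osd5]$ echo 1073741824 | numfmt --to=iec
1.0G
Note that the ceph-disk listing above already points block.db and block.wal at /dev/nvme2n1p13 and /dev/nvme2n1p14; it was evidently captured after the move. Before the operation they lived on /dev/nvme2n1p11 and /dev/nvme2n1p12, the partitions inspected in the next step and copied from in step 4.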
2. Inspect the partition table with sgdisk to determine the free space on the NVMe device.
[k0ste@ceph-osd5]$ sudo sgdisk -p /dev/nvme2n1
Disk /dev/nvme2n1: 781422768 sectors, 372.6 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): F0598F35-1CF1-45B7-AE94-598B2269ED84
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 781422734
Partitions will be aligned on 2048-sector boundaries
Total free space is 399741037 sectors (190.6 GiB)
Number  Start (sector)    End (sector)  Size        Code  Name
   1            2048         62916607   30.0 GiB    FFFF  ceph block.db
   2        62916608         65013759   1024.0 MiB  FFFF  ceph block.wal
   3        65013760        127928319   30.0 GiB    FFFF  ceph block.db
   4       127928320        130025471   1024.0 MiB  FFFF  ceph block.wal
   5       130025472        192940031   30.0 GiB    FFFF  ceph block.db
   6       192940032        195037183   1024.0 MiB  FFFF  ceph block.wal
   7       195037184        257951743   30.0 GiB    FFFF  ceph block.db
   8       257951744        260048895   1024.0 MiB  FFFF  ceph block.wal
   9       260048896        318769151   28.0 GiB    FFFF  ceph block.db
  10       318769152        320866303   1024.0 MiB  FFFF  ceph block.wal
  11       320866304        379586559   28.0 GiB    FFFF  ceph block.db
  12       379586560        381683711   1024.0 MiB  FFFF  ceph block.wal
# RocksDB db partition
[k0ste@ceph-osd5]$ sudo sgdisk -i 11 /dev/nvme2n1
Partition GUID code: 30CD0809-C2B2-499C-8879-2D6B78529876 (Unknown)
Partition unique GUID: 107B5B24-3A48-47A9-8686-595738D6235E
First sector: 320866304 (at 153.0 GiB)
Last sector: 379586559 (at 181.0 GiB)
Partition size: 58720256 sectors (28.0 GiB)
Attribute flags: 0000000000000000
Partition name: 'ceph block.db'
# RocksDB wal partition
[k0ste@ceph-osd5]$ sudo sgdisk -i 12 /dev/nvme2n1
Partition GUID code: 5CE17FCE-4087-4169-B7FF-056CC58473F9 (Unknown)
Partition unique GUID: E8E4EC9D-10B8-45E9-804B-017940554918
First sector: 379586560 (at 181.0 GiB)
Last sector: 381683711 (at 182.0 GiB)
Partition size: 2097152 sectors (1024.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'ceph block.wal'
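If you do not want to derive the free region from the -p output by hand, sgdisk can print the first and last sectors of the largest free block (its -F and -E options). The start sectors used in the next step, 715083000 and 778000000, simply fall inside this free region; sgdisk then moves them to the 2048-sector alignment, as its messages below show:
[k0ste@ceph-osd5]$ sudo sgdisk -F /dev/nvme2n1
[k0ste@ceph-osd5]$ sudo sgdisk -E /dev/nvme2n1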
3. Create two new partitions, one for the DB and one for the WAL.
# create db partition - 30G
[k0ste@ceph-osd5]$ sudo sgdisk --new=13:715083000:+30GiB --change-name="13:ceph block.db" --typecode="13:30cd0809-c2b2-499c-8879-2d6b78529876" --mbrtogpt /dev/nvme2n1
Information: Moved requested sector from 715083000 to 715081728 in
order to align on 2048-sector boundaries.
Setting name!
partNum is 12
REALLY setting name!
# create wal partition - 1G
[k0ste@ceph-osd5]$ sudo sgdisk --new=14:778000000:+1024MiB --change-name="14:ceph block.wal" --typecode="14:5ce17fce-4087-4169-b7ff-056cc58473f9" --mbrtogpt /dev/nvme2n1
Information: Moved requested sector from 778000000 to 777998336 in
order to align on 2048-sector boundaries.
Setting name!
partNum is 13
REALLY setting name!
# run partprobe so the kernel re-reads the partition table
[k0ste@ceph-osd5]$ sudo partprobe
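An optional sanity check that the kernel now sees the new partitions:
[k0ste@ceph-osd5]$ ls -l /dev/nvme2n1p13 /dev/nvme2n1p14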
4. Stop the OSD and copy the data from the old partitions to the new ones.
[k0ste@ceph-osd5]$ sudo systemctl stop ceph-osd@73
[k0ste@ceph-osd5]$ sudo dd status=progress if=/dev/nvme2n1p11 of=/dev/nvme2n1p13
[k0ste@ceph-osd5]$ sudo dd status=progress if=/dev/nvme2n1p12 of=/dev/nvme2n1p14
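dd with its default 512-byte block size is slow; adding something like bs=1M speeds the copy up considerably. Optionally, verify the copies with cmp, limiting the comparison to the size of the smaller old partition (28 GiB for the db, 1024 MiB for the wal, per the sgdisk output in step 2); cmp prints nothing and exits 0 on a match. A suggested check, not part of the original procedure:
[k0ste@ceph-osd5]$ sudo cmp -n $((28*1024*1024*1024)) /dev/nvme2n1p11 /dev/nvme2n1p13
[k0ste@ceph-osd5]$ sudo cmp -n $((1024*1024*1024)) /dev/nvme2n1p12 /dev/nvme2n1p14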
5. Delete the old partitions and assign the new partitions the unique GUIDs of the old ones. The OSD's block.db and block.wal symlinks reference /dev/disk/by-partuuid/, so transplanting the GUIDs makes them resolve to the new partitions without touching the OSD directory itself.
[k0ste@ceph-osd5]$ sudo sgdisk --delete=11 --delete=12 --partition-guid="13:107B5B24-3A48-47A9-8686-595738D6235E" --partition-guid="14:E8E4EC9D-10B8-45E9-804B-017940554918" /dev/nvme2n1
# run partprobe so the kernel re-reads the partition table
[k0ste@ceph-osd5]$ sudo partprobe
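To confirm the transplant, inspect the new partitions: the "Partition unique GUID" lines should now show the values recorded in step 2, and the OSD's block.db/block.wal symlinks should resolve to the new partitions (an optional check, not part of the original procedure):
[k0ste@ceph-osd5]$ sudo sgdisk -i 13 /dev/nvme2n1
[k0ste@ceph-osd5]$ sudo sgdisk -i 14 /dev/nvme2n1
[k0ste@ceph-osd5]$ sudo readlink -f /var/lib/ceph/osd/ceph-73/block.db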
6. The final part: expand the BlueFS devices (the OSD must still be stopped).
[k0ste@ceph-osd5]$ sudo systemctl stop ceph-osd@73
[k0ste@ceph-osd5]$ sudo ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-73
infering bluefs devices from bluestore path
slot 0 /var/lib/ceph/osd/ceph-73/block.wal
slot 1 /var/lib/ceph/osd/ceph-73/block.db
slot 2 /var/lib/ceph/osd/ceph-73/block
0 : size 0x400d0000 : own 0x[1000~3ffff000]
1 : size 0x78009f000 : own 0x[2000~6ffffe000]
2 : size 0x5753b991000 : own 0x[29eabf00000~37e3b00000]
Expanding...
0 : expanding from 0x40000000 to 0x400d0000
0 : size label updated to 1074593792
1 : expanding from 0x700000000 to 0x78009f000
1 : size label updated to 32212905984
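The sizes in this output are hexadecimal byte counts: the WAL grows from 0x40000000 (1073741824 B, i.e. 1 GiB) to 0x400d0000 (1074593792 B), and the DB from 0x700000000 (30064771072 B, i.e. 28 GiB) to 0x78009f000 (32212905984 B, just over 30 GiB). The small surplus beyond the requested sizes comes from sgdisk moving the start sectors down for 2048-sector alignment in step 3 while keeping the originally computed end sectors.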
7. The result: show-label now reports the expanded block.db and block.wal sizes.
[k0ste@ceph-osd5]$ sudo ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-73
infering bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-73/block": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 6001069199360,
        "btime": "2018-06-28 14:44:38.392910",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "5532c4fd-60db-43ff-af9a-c4eb8523382b",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "73"
    },
    "/var/lib/ceph/osd/ceph-73/block.wal": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 1074593792,
        "btime": "2018-06-28 14:44:38.432863",
        "description": "bluefs wal"
    },
    "/var/lib/ceph/osd/ceph-73/block.db": {
        "osd_uuid": "6387c7be-2090-4140-ad08-d34b023b47e7",
        "size": 32212905984,
        "btime": "2018-06-28 14:44:38.432187",
        "description": "bluefs db"
    }
}
Now you can start your OSD:
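[k0ste@ceph-osd5]$ sudo systemctl start ceph-osd@73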