Good fault tolerant and distributed fs

g2user2024 · n00b Joined: 19 Nov 2024 Posts: 4

Looking for an open-source, Gentoo-packaged cluster filesystem. The last forum post about this is from around 2015.

The goal is for a small development cluster to be able to replicate data and to handle a system/disk being unavailable without clients noticing/pausing. To ease vm management and relocation, the vm images would be on the cluster file system. The filesystem must also support NFS-like semantics (almost posix, except for file locking).

The hardware (s1-s6 are the server names):

szatox · Advocate Joined: 27 Aug 2013 Posts: 3493

pingtoo · Posted: Wed Nov 20, 2024 4:56 pm Post subject:

An interesting topic. I am not involved in Gentoo at all, just a Gentoo user. I used to be an infrastructure architect prior to retirement.

Your description seems to be mixing block devices and file systems together. IMHO they should be discussed separately. Maybe you can describe a bit more about the access pattern (as in which is the client and which is the server).

I gather your infrastructure is pure Ethernet; is there no other layer-two technology (FC, for example) that can be used in this project?

Have you tried jumbo frames and/or RDMA?

Do you (the project) have full control of the network, or is there a separate network management team involved?

g2user2024 · n00b Joined: 19 Nov 2024 Posts: 4

Thanks for the replies.

Some additional info: This is a development environment separate from the production data center. There is no dedicated infra admin and things have evolved over 15 years. Think of a wire rack in the corner of a storage area. There are space, power and budget constraints for hardware changes and this project winds down Dec 2025.

Without a dedicated admin, we have been putting services into vms to simplify service management. The vms run development versions of web apps (go-based), automated tests against the apps, proxies to the internet, code building, nfs for video files and general data, web-mail clients, binpkg builders, binhosts, backups, etc. The vms have their own addresses on the lan (and san where needed).

We have custom filesystem-based tools for creating vm images, chroot setup using a vm image file, cloning from a directory or a running vm, and performing package updates from binhosts. So RBD storage of the vm images doesn't work with the tools we have. The vms run as openrc services. This allows setting dependencies on vm startup.

We have been running the general-usage nfs server in a vm to simplify ip management as the ip just moves with the vm. So no need to also manage floating ips that are associated with physical hosts.

The qemu+kvm block and net devices as seen on the vms (via lspci):

SCSI storage controller: Red Hat, Inc. Virtio block device
Ethernet controller: Red Hat, Inc. Virtio network device

@szatox

I agree that CEPH has the features we want. It looks like CEPH is more for larger sites with a dedicated admin and more hardware resources than we have. Moving the network to 10G isn't workable for us since the existing hardware can't be upgraded to that.

The CEPH test had 1 active mon and 2 standbys, 3 active osds, 2 mds. All using 7200 rpm sata disks. So looks like a setup bound to perform poorly.
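As a rough illustration of why 7200 rpm disks limit small random I/O, here is a minimal back-of-the-envelope sketch. The rotational latency follows directly from the spindle speed; the 8.5 ms average seek time is an assumed, typical-looking figure and was not measured on this hardware, so treat the resulting IOPS number as an order of magnitude only.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def random_iops(rpm: float, seek_ms: float) -> float:
    """Rough random IOPS for one spindle: one seek plus half a revolution per I/O."""
    return 1000.0 / (seek_ms + rotational_latency_ms(rpm))


# Assumption for illustration only, not a measured value for these disks.
ASSUMED_SEEK_MS = 8.5

print(f"avg rotational latency: {rotational_latency_ms(7200):.2f} ms")
print(f"rough random IOPS per disk: {random_iops(7200, ASSUMED_SEEK_MS):.0f}")
```

Under that assumption a single disk manages roughly 80 random I/Os per second, so three such OSDs leave little headroom once replication is added on top.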

@pingtoo

We have control over the local assets/network, but there is no budget for serious upgrades. The six servers (s1-s6) host ~40 vms, some of which are infrastructure. The clients do sw development, office apps, web email, graphics & some video editing. There are at least a few >10GB file copies a day.

There are three isolated 2.5g switches for the lan, san and dmz networks. The lan nics for servers s1-s6 are all on the same lan switch, the san nics for servers s1-s6 are all on the same san switch. The clients (laptops, desktops) connect to the lan via a secondary 2.5g switch.
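To put the >10GB copies and the 2.5g links in perspective, here is a small sketch of the best-case transfer time at line rate. It assumes decimal gigabytes (10^9 bytes) and ignores disk speed and replication traffic; the `efficiency` parameter is a hypothetical knob for protocol overhead, not a measured value.

```python
def transfer_seconds(size_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Best-case time to move size_bytes over a link of link_gbps Gbit/s.

    efficiency is the assumed usable fraction of the line rate (1.0 = no overhead).
    """
    usable_bits_per_s = link_gbps * 1e9 * efficiency
    return size_bytes * 8 / usable_bits_per_s


ten_gb = 10e9  # 10 GB, decimal

print(f"line rate:      {transfer_seconds(ten_gb, 2.5):.1f} s")
print(f"90% efficiency: {transfer_seconds(ten_gb, 2.5, efficiency=0.9):.1f} s")
```

So even with nothing else in the way, a 10 GB copy occupies a 2.5g link for about half a minute. Any replication traffic sharing the same link would add to that.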

On the lan, server-server ping times average ~0.55ms with spikes up to 2.5ms. Server to vm ping times average ~0.675ms. Client ping times are usually 0.030ms higher. The san ping times are about the same as the lan. The host kvm net devices use vhost-net, the vms virtio.

The nics/switches in theory support jumbo frames. Used ip link set mtu 9000 dev netX on all nics connected to the san network. No noticeable difference in throughput.
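Setting the MTU on the nics does not by itself show that 9000-byte frames make it through the switch end to end. A minimal sketch of one way to check, assuming Linux with iputils ping: send a ping with the don't-fragment flag set and a payload sized to fill the MTU exactly (MTU minus the 20-byte IPv4 header and the 8-byte ICMP header). The peer address 10.0.0.2 and the helper names are hypothetical.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in a single IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def jumbo_ping_cmd(peer: str, mtu: int = 9000, count: int = 3) -> list[str]:
    """Build an iputils ping command that forbids fragmentation (-M do)."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), peer]


def path_passes_mtu(peer: str, mtu: int = 9000) -> bool:
    """Return True if the full-size, unfragmented ping got replies."""
    result = subprocess.run(jumbo_ping_cmd(peer, mtu),
                            capture_output=True, text=True)
    return result.returncode == 0


# Show the command for a hypothetical san peer without running it.
print(" ".join(jumbo_ping_cmd("10.0.0.2")))
```

Calling path_passes_mtu("10.0.0.2") from one san host towards another would then tell whether full-size frames pass unfragmented. If that fails while a default-size ping works, something on the path is still limited to a smaller MTU.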

pingtoo · Posted: Fri Nov 22, 2024 12:57 am Post subject:

szatox · Advocate Joined: 27 Aug 2013 Posts: 3493