Gentoo Forums
Making a pseudo-cluster; sort of an AFS question
SilverDirk
n00b


Joined: 06 Aug 2004
Posts: 32

PostPosted: Wed Jan 11, 2006 5:54 am    Post subject: Making a pseudo-cluster; sort of an AFS question

Suppose that a university computer club has a large number of Sparc Ultra 1 systems, which can't be effectively used for desktops. Suppose this group has a slower-than-desired webserver, and would like to do something useful with the Sparcs, like make a cluster to replace the webserver, and also serve various other things, like shell accounts and mail and whatever else comes to mind. Suppose that said computer club has zero funding, but lots of aggregate free time.

OpenMosix would be ideal, but it only works with x86. We are supposing that emulating x86 would result in pathetically slow performance. So what we were considering is setting up a distributed filesystem and doing the load-balancing manually, to get a pseudo-cluster. (By "manually", I mean having an overly-clever firewall that forwards connections to random nodes which are running a server for whatever service is needed, i.e. with ssh you would end up on a random node, but still have all your files via the distributed filesystem.)
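
To give an idea of what that firewall might look like: something like the following, assuming your iptables has the statistic/nth match (patch-o-matic if your kernel doesn't). Completely untested, and the node addresses are made up.

Code:
# round-robin incoming ssh connections across three nodes
# (only the first packet of a connection hits the nat table, so this
#  effectively balances per connection; 10.0.0.11-13 are placeholders)
iptables -t nat -A PREROUTING -p tcp --dport 22 \
    -m statistic --mode nth --every 3 --packet 0 -j DNAT --to-destination 10.0.0.11
iptables -t nat -A PREROUTING -p tcp --dport 22 \
    -m statistic --mode nth --every 2 --packet 0 -j DNAT --to-destination 10.0.0.12
iptables -t nat -A PREROUTING -p tcp --dport 22 \
    -j DNAT --to-destination 10.0.0.13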

I'm looking for suggestions. The best idea so far is to use OpenAFS, and try to get things like /usr/portage and /usr/share and basically everything other than /etc into the shared pool of files.
  • Catch number one: the HDDs are small, and already formatted with reiser, so we'd probably want to put the server/client partitions in files via the loopback interface (see the sketch after this list).
  • Catch number two: each machine would be both a client and a server, and so could end up with its own files cached on its client partition. This works, but is wasteful.
  • Catch number three: it sounds like getting services (or just suid programs) to work in Kerberos/AFS land is painful, and will require a lot of maintenance to keep it working. We're imagining qmail, apache (php sessions, posts to forums), stuff relating to administration with sudo, etc. All in all, it's looking unpleasant.
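
For catch number one, we were thinking something like this for the client cache partition-in-a-file (sizes and paths are just guesses, adjust to taste):

Code:
# create a 512MB file-backed "partition" for the OpenAFS client cache
dd if=/dev/zero of=/var/cache-afs.img bs=1M count=512
mke2fs -F /var/cache-afs.img
mkdir -p /var/cache/openafs
mount -o loop /var/cache-afs.img /var/cache/openafs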

So what are some other options for clusters, and distributed filesystems? Parameters: possibility of secure firewalled subnet, 2-4GB storage per machine, slow processors, in the ballpark of 10 machines.

Thanks
SnakeByte
Apprentice


Joined: 04 Oct 2002
Posts: 177
Location: Europe - Germany

PostPosted: Sun Jan 15, 2006 10:18 pm

Hi SilverDirk,

Some suggestions here, but also some additional questions:

Do you have access to the hardware and configuration of the machines?

Putting more hard disk space into two master machines
and booting/running the others via NFS would be an option to consider.


Code:
                     # world #

        # Master Box 00 #   # Master Box 01 #

  # node 01 #  # node 02 #  # node 03 #  # node 04 #
  # node 05 #  # node 06 #  # node 07 #  # node 08 #

Use software RAID for the application and boot partitions, ideally with a dedicated second switch
between the masters for a RAID-over-NBD setup as a failsafe.
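
Roughly like this (untested, and the device names and port are only examples):

Code:
# on master01: export the application partition over NBD
nbd-server 2000 /dev/sda3

# on master00: import it and mirror the local partition onto it
nbd-client master01 2000 /dev/nbd0
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda3 /dev/nbd0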

As you are short on space and processing power, try to optimize the software for size.
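
On Gentoo that mostly means something like this in /etc/make.conf (suggested values only, check what the Ultra 1s actually like):

Code:
# /etc/make.conf -- optimize for size on the Ultra 1s
CHOST="sparc-unknown-linux-gnu"
CFLAGS="-Os -mcpu=ultrasparc -pipe"
CXXFLAGS="${CFLAGS}"
# trim USE flags you won't need on headless nodes
USE="-X -gtk -gnome -kde"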

Build a complete system with all services installed in a chroot and export it via NFS.
Create node-dependent startup scripts for the services that do not like parallel execution.

Create a separate data partition and export this too.
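
The exports on the masters could be as simple as this (paths and subnet are made up):

Code:
# /etc/exports on the masters: read-only system image, read-write data
/exports/nodes   10.0.0.0/24(ro,sync,no_root_squash)
/exports/data    10.0.0.0/24(rw,sync,no_root_squash)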

Configure the packet filter on the masters to round-robin forward the requests from the outside,
and leave the web serving, mail hosting, SpamAssassin and virus scanning to the nodes.

Maybe use squid on the masters for the static parts of your web data.
Use dedicated cache partitions and sibling (neighborhood) talking for squid.
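
Something along these lines in squid.conf on each master (ports are the squid defaults, the rest is an example):

Code:
# squid.conf fragment on master00 (master01 mirrors it with the names swapped)
cache_dir ufs /var/cache/squid 1024 16 256
# talk to the other master as a sibling via ICP so cached objects are shared
cache_peer master01 sibling 3128 3130 proxy-only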

Use NFS root for the nodes (these boxes should be capable of netbooting),
mount /tmp on tmpfs,
and use a system logger that can send its logs over the network and store them on the masters.
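
For example, on a node (hostnames, sizes and the syslog-ng source name are examples):

Code:
# /etc/fstab on a node: NFS root plus /tmp on tmpfs
master00:/exports/nodes   /      nfs     ro,nolock            0 0
tmpfs                     /tmp   tmpfs   size=64m,mode=1777   0 0

# syslog-ng.conf fragment: send everything to master00
destination d_master { udp("master00" port(514)); };
log { source(src); destination(d_master); };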


kind regards
SilverDirk
n00b


Joined: 06 Aug 2004
Posts: 32

PostPosted: Mon Jan 23, 2006 1:37 am

Thanks for the ideas. I like the idea of putting all the storage in the master box, but the desktop cases we have can only hold 3 drives (and only have that many SCSI connectors). The drives we have are all between 2G and 4G, so we're looking at a master box with maybe only 12G of storage. In order to use more drives, we would need to NFS-export them from the nodes (or set up more master boxes each time we wanted to add storage).

This is why we liked the idea of AFS. It looks like it can let us redistribute the storage of files between the nodes without interrupting the filesystem, and without having to reconfigure each node. So if we get a new node, we just connect it up and specify which files it should store, and it becomes the server for them.
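
At least that's my (possibly naive) reading of the vos/fs tools; something like this, where the cell, server, partition and volume names are all invented:

Code:
# put a new volume on the new node and splice it into the AFS tree
vos create node09.example.edu /vicepa web.docs
fs mkmount /afs/example.edu/www/docs web.docs
# later, shuffle an existing volume onto the new node's storage
vos move user.joe node01.example.edu /vicepa node09.example.edu /vicepa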

I was sort of wondering if anyone has experience trying to run servers (httpd, qmail, etc.) from nodes on AFS, and whether it was more trouble than it was worth. If it isn't practical, we can just fall back to NFS.
stefaan
Retired Dev


Joined: 31 Aug 2005
Posts: 35

PostPosted: Wed Jan 25, 2006 10:33 am

If I'm not mistaken, there are whole universities running entirely on AFS, so there is surely merit in it for a cluster of more than a couple of machines.

One big difference you need to keep in mind is this: NFS has no user authentication, so running a webserver on a single server or on a server with NFS space is no different. Being root on an AFS client machine doesn't give you any access to the files unless you authenticate for it. This means that you will need accounts for the mail and web server, and you have to specifically give the webserver account access to read your web-shared files, and the mail server access to write to your mailbox.
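
For example, roughly like this (the cell path and account names are invented, and the exact rights depend on your mail setup):

Code:
# create AFS identities for the daemons and grant them only what they need
pts createuser apache
pts createuser qmail
fs setacl -dir /afs/your.cell/www/htdocs -acl apache read
fs setacl -dir /afs/your.cell/user/joe/Maildir -acl qmail write
# the daemons then need to obtain a token for that identity at startup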

About catch number one: reiserfs can NOT be used for the client-side cache. You should, however, be able to put any server partition on it.
I have no real idea about catch number two, but if you don't have a real abundance of disk space, I would not make the cache too large anyway; random access patterns don't benefit much from any kind of cache.
Catch number three: see above. I don't know why suid programs would be a problem (you just keep the PAG / ticket). Sudo is a bit more intrusive, but still, you may want to look at "pagsh" (it sort of binds the ticket to your process and everything spawned from it). In my opinion it gives you extra security, and tools are provided to do any necessary tasks. Of course there's a learning curve, as with all things.
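
For example (assuming Kerberos 5 plus aklog, and an invented keytab/principal; on a kaserver cell you'd use klog instead of kinit/aklog):

Code:
# start a service inside its own PAG with its own token
pagsh -c 'kinit -k -t /etc/apache.keytab apache && aklog && /usr/sbin/apachectl start'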

Let me know how it turns out, I'm really interested. (I'm actually planning on experimenting with OpenAFS at a local shell/web/mail-hosting university club.)

Regards,
Stefaan