Caelinux cluster

13 years 7 months ago #4667 by Filippo Monari
Caelinux cluster was created by Filippo Monari
Hi, I'm going to replace my PC (Pentium quad-core), and I have heard that it is possible to build a computational cluster by connecting two or more PCs over a network. Can anyone tell me whether CAELinux supports this, and suggest a guide on how to set it up?
Thank you in advance.

13 years 7 months ago #4679 by CAVT
Replied by CAVT on topic Re:Caelinux cluster
I don't think the question is whether CAELinux can do it (it should be able to, since a cluster is above all a network), but rather whether you install and set up the libraries required for parallel computing.
If you check, for example, the official C_S site, you will see many libraries that they recommend (or even require) for clustering, one of them being MPI (OpenMPI is one implementation usually available in the repositories).
I'm afraid I cannot be of much help in setting up this sort of configuration, but instead of going for a two-PC cluster, have you considered a single PC with two CPUs, or maybe an 8-core PC? It may be cheaper and less complicated to set up.

13 years 7 months ago #4680 by JMB
Replied by JMB on topic Re:Caelinux cluster
CAVT wrote:

I don't think the question is whether CAELinux can do it (it should be able to, since a cluster is above all a network), but rather whether you install and set up the libraries required for parallel computing.
If you check, for example, the official C_S (C_A?) site, you will see many libraries that they recommend (or even require) for clustering, one of them being MPI (OpenMPI is one implementation usually available in the repositories).


I believe OpenMPI is already installed on CAELinux 2010 (try 'locate openmpi'), and its version of CA is compiled to run on multiple cores as well. I am reasonably certain that getting a multi-PC cluster working with CAELinux as the starting point is only a matter of setting up the configuration correctly. Perhaps CA will have to be recompiled...
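
A quick check along these lines could look like the sketch below. It assumes that an Open MPI installation puts mpirun, mpicc and ompi_info on the PATH; check_mpi is only an illustrative helper name, not something shipped with CAELinux.

```shell
#!/bin/sh
# Report whether the usual Open MPI commands can be found on the PATH.
# The list of command names is an assumption about a typical Open MPI install.
check_mpi() {
    for cmd in mpirun mpicc ompi_info; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: $(command -v "$cmd")"
        else
            echo "$cmd: not found"
        fi
    done
}

check_mpi
```

If mpirun is found, running it with --version should report which MPI implementation and version is installed.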

Although I have yet to get it working, I am trying to get there, so I cannot offer any more specifics.

Regards,
JMB

13 years 7 months ago #4683 by Joël Cugnoni
Replied by Joël Cugnoni on topic Re:Caelinux cluster
Hi,
actually, CAELinux already contains several codes compiled with OpenMPI, such as Code Aster 10.1, Code Saturne 2rc1, OpenFOAM 1.7, Elmer FEM 5.5, the Gerris flow solver and Impact.

These codes should all be able to run on multiple machines, but each has a different procedure for submitting a job, so you will need to read each manual carefully.

However, CAELinux is ready for building a cluster; you just need to configure it properly. By default, ssh is already configured for passwordless local connections. To build your cluster, you mostly need to extend this configuration so that passwordless ssh connections work between all nodes.

So, to build a basic cluster with CAELinux, you mostly need a few PCs, a gigabit Ethernet switch, and the following procedure:
1. Install CAELinux on each node and give each node a different hostname (node0, node1, ...), but the same login/password!

2. Set up static IP addresses for your nodes and list the IP/hostname pairs in /etc/hosts so that each node can resolve the hostnames of the other nodes.

3. Set up passwordless ssh connections between all nodes:
a. Generate the list of known host keys on the 1st node (repeat for all nodes):
[code:1]
ssh-keyscan -t dsa,rsa node1 >> ~/.ssh/known_hosts
ssh-keyscan -t dsa,rsa node2 >> ~/.ssh/known_hosts[/code:1]

b. Copy the folder /home/youruser/.ssh from the 1st node to all other nodes (run on node0):
[code:1]scp -r ~/.ssh node1:~
scp -r ~/.ssh node2:~[/code:1]

c. Test: ssh node1 should now connect without asking for a password or confirmation.

4. You may want to mount a shared folder within your cluster. A simple approach is to mount on each node a common directory stored on node0 using sshfs. On node0:
[code:1]mkdir ~/shared
ssh node1 "mkdir ~/shared; sshfs node0:~/shared ~/shared"
ssh node2 "mkdir ~/shared; sshfs node0:~/shared ~/shared"
... [/code:1]
(The sshfs mount is not restored after a reboot, so you will need to repeat this step or modify /etc/fstab.)
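
Steps 2 and 3c can be sketched roughly as follows. This assumes three nodes named node0 to node2 on a 192.168.1.x subnet with node0 at .10; the addresses and the helper names hosts_entries and check_ssh are made up for illustration and need adapting to your own network.

```shell
#!/bin/sh
# Hypothetical node names and addresses; adjust to your network.
NODES="node0 node1 node2"
BASE_IP="192.168.1"   # assumed private subnet
FIRST_HOST=10         # assumed: node0 gets .10, node1 gets .11, ...

# Step 2: print the lines to append to /etc/hosts on every node.
hosts_entries() {
    i=$FIRST_HOST
    for n in $NODES; do
        printf '%s.%s\t%s\n' "$BASE_IP" "$i" "$n"
        i=$((i + 1))
    done
}

# Step 3c: check that every node accepts a login without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh() {
    for n in $NODES; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$n" true 2>/dev/null; then
            echo "$n: ok"
        else
            echo "$n: FAILED"
        fi
    done
}

hosts_entries
```

Once the keys have been copied around, calling check_ssh on each node should print "ok" for every hostname; any "FAILED" line points at a node whose /etc/hosts entry or ~/.ssh folder still needs attention.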

Then read the manuals of the codes you want to run on the cluster to see which config files must be updated and how to launch MPI jobs. For Aster, I know that you need to edit /opt/aster101/etc/codeaster/aster-mpihosts and maybe /opt/aster101/etc/codeaster/asrun.
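
Independently of any particular solver, a generic Open MPI hostfile can serve as a first smoke test of the cluster. The sketch below assumes three nodes with 4 cores each and writes the file to ./mpihosts (both are illustrative choices); whether Aster's aster-mpihosts file uses the same syntax is not confirmed here, so check the Aster documentation before copying it over.

```shell
#!/bin/sh
# Write a generic Open MPI hostfile: one line per node, "slots" being the
# number of processes that node may run. Node names and slot counts are assumptions.
HOSTFILE="${HOSTFILE:-./mpihosts}"
: > "$HOSTFILE"
for n in node0 node1 node2; do
    echo "$n slots=4" >> "$HOSTFILE"
done
cat "$HOSTFILE"

# Smoke test: every MPI process prints the name of the node it runs on.
# Uncomment once passwordless ssh works between all nodes.
# mpirun --hostfile "$HOSTFILE" -np 12 hostname
```

If the mpirun line prints each hostname four times, MPI processes are being started on all nodes and the remaining work is solver-specific configuration.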

I have not tested all these possibilities yet, but I have built several small clusters like this.

Let us know about your experience, as I am pretty sure a lot of people are interested.

Joël Cugnoni - a.k.a admin
www.caelinux.com

13 years 7 months ago #4684 by JMB
Replied by JMB on topic Re:Caelinux cluster
Administrator wrote:

4. You may want to mount a shared folder within your cluster. A simple approach is to mount on each node a common directory stored on node0 using sshfs. On node0:
(The sshfs mount is not restored after a reboot, so you will need to repeat this step or modify /etc/fstab.)


Hello Admin,

Thanks for the details! I am 3/4 of the way there. One question:

1. Why do you recommend sshfs? Would NFS not suffice so long as it is within a local subnet? I am currently utilizing NFS mounts within my network and wondered if I needed to specifically set up a different sub-dir under sshfs?

Regards,
JMB

Post edited by: JMB, at: 2010/09/07 14:49

13 years 7 months ago #4685 by Joël Cugnoni
Replied by Joël Cugnoni on topic Re:Caelinux cluster
Hi JMB

Actually, NFS is a better choice for system-wide network sharing.
SSHFS is different in that it can be mounted by ordinary users, so it depends on the use case.

If you need permanent shared folders, NFS is the best choice. If you just want a "quick and dirty" network share for a single parallel run, sshfs is more flexible and does not require admin rights.

NFS performance is probably also much higher than that of sshfs.
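
For the permanent NFS variant, the two configuration lines involved can be sketched as below. The share path, the 192.168.1.0/24 subnet and the helper names exports_line and fstab_line are assumptions for illustration; the script only prints the lines and does not modify any system file.

```shell
#!/bin/sh
# Assumed values: adjust the user, path and subnet to your cluster.
SHARE="/home/youruser/shared"
SUBNET="192.168.1.0/24"
SERVER="node0"

# Line for /etc/exports on the server (node0):
# export the share read-write to every machine in the subnet.
exports_line() {
    printf '%s %s(rw,sync,no_subtree_check)\n' "$SHARE" "$SUBNET"
}

# Line for /etc/fstab on every other node:
# mount the share at the same path at boot time.
fstab_line() {
    printf '%s:%s %s nfs defaults 0 0\n' "$SERVER" "$SHARE" "$SHARE"
}

exports_line
fstab_line
```

After adding the first line to /etc/exports on node0, running exportfs -ra as root reloads the export table; on the other nodes, mount -a as root mounts everything listed in /etc/fstab. Both steps need admin rights and an installed NFS server on node0, which is the trade-off against sshfs mentioned above.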

Keep us posted about your progress.

Joël Cugnoni - a.k.a admin
www.caelinux.com
