allegro cluster
allegro is a hybrid x86 and GPU cluster consisting of roughly 1000 cores and about 3.5 TB of RAM, distributed over different node types. The nodes run Debian Bullseye, are managed with Saltstack, and use Slurm for job scheduling. They are interconnected via InfiniBand for parallel computation jobs (MPI) and share roughly 250 TB of cluster-wide hard disk storage.
Job Management
Calculations are scheduled and automatically distributed via the Slurm Workload Manager.
Getting started
For the impatient, you will find a quick run-through here.
Getting an account
You need an account at Freie Universität Berlin which has been enabled at the Department of Mathematics and Computer Science.
Logging In
The cluster is only reachable from within the Department of Mathematics and Computer Science.
If you want to access the cluster from outside the department, please log in on one of our SSH remote login nodes and then jump to allegro, or use an SSH tunnel.
To log in to the cluster, you need an SSH client. If you are using a Linux or Unix based system, one is most likely already available in your shell, and you can get to your account very quickly. For Microsoft Windows, we recommend PuTTY.
Direct Login
$ ssh <username>@allegro.imp.fu-berlin.de
Login via SSH-Tunnel
$ ssh -f -L 9999:allegro.imp.fu-berlin.de:22 -N <username>@andorra.imp.fu-berlin.de
$ ssh <username>@localhost -p 9999
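If your local OpenSSH is version 7.3 or newer, the -J (ProxyJump) option achieves the same in a single command; the host names are the ones used above.
$ ssh -J <username>@andorra.imp.fu-berlin.de <username>@allegro.imp.fu-berlin.de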
Storage
The cluster is equipped with a small /home partition for scripts and profile data and about 250 TB of cluster-wide hard disk storage for computational data.
Cluster nodes cannot access the NFS filesystems mounted on typical workstations (e.g. /home, /storage, /group) in the Department of Mathematics and Computer Science.
Please find more details concerning storage here.
Submitting jobs
Save the following as job_script.sh and replace the <USER> and <EMAIL ADDRESS> placeholders.
Please note: there is no routing queue, so you have to specify which partition you want to use.
#!/bin/bash
#SBATCH -J testjob
#SBATCH -D /data/scratch/<USER>
#SBATCH -o testjob.%j.out
#SBATCH --partition=micro
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10M
#SBATCH --time=00:30:00
#SBATCH --mail-type=end
#SBATCH --mail-user=<EMAIL ADDRESS>
hostname
date
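Since the nodes are InfiniBand-connected for MPI, a multi-node job follows the same pattern. The following is only a sketch: the partition name main, the task counts, and the binary ./my_mpi_program are placeholders (pick a real partition from the list below), and you may first have to load an MPI module that is actually installed on allegro.
#!/bin/bash
#SBATCH -J mpi_testjob
#SBATCH -D /data/scratch/<USER>
#SBATCH -o mpi_testjob.%j.out
#SBATCH --partition=main
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --mem-per-cpu=100M
#SBATCH --time=00:30:00
# srun starts the 8 MPI ranks on the two allocated nodes
srun ./my_mpi_program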
A list of usable partitions and their restrictions is available via scontrol show partitions
$ scontrol show partitions
or here.
You can get a list of the available nodes and their partitions via sinfo -Nel
$ sinfo -Nel
Then, it's time to start the job via sbatch
$ sbatch job_script.sh
You can see your currently running jobs with squeue
$ squeue
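Without arguments, squeue lists the jobs of all users; to see only your own jobs:
$ squeue -u <username>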
You can cancel your currently running job with scancel
$ scancel <job-id>
Cook Book
- Selecting Node Classes - selecting node classes, required for consistent running-time results.
- Selecting Queues - selecting queues for short / long running jobs.
- Job arrays - allow you to submit a sequence of similar job scripts that only differ by one environment variable (${SLURM_ARRAY_TASK_ID}; under the old torque scheduler this was ${PBS_ARRAYID}) (advanced). A minimal sketch follows this list.
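A minimal Slurm job array sketch, reusing the settings of job_script.sh above; the range 1-10 and the program ./process_input are placeholders. %A and %a in the output file name stand for the array job ID and the task index, and each task sees its own ${SLURM_ARRAY_TASK_ID}.
#!/bin/bash
#SBATCH -J array_testjob
#SBATCH -D /data/scratch/<USER>
#SBATCH -o array_testjob.%A_%a.out
#SBATCH --partition=micro
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10M
#SBATCH --time=00:30:00
#SBATCH --array=1-10
# each of the 10 tasks processes a different input file
./process_input input_${SLURM_ARRAY_TASK_ID}.dat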
Notes
- The squeue -u command's output is not in real time.
- Use scontrol show job to get detailed info about a job.
- Useful: the --array flag for array execution (the old torque equivalent was -t). Use SLURM_ARRAY_TASK_ID=1 bash job_script.sh to simulate a single array task locally.
- If you specify --nodes=1 you will NOT get the node exclusively. Request the whole node with --exclusive (the old torque equivalent was -l nodes=1:ppn=24); see the snippet below this list.
- If you call a script from your job script, it is not cached. Do not modify included scripts while jobs are running, or be aware of the side effects!
- Since September 12 you can run X11 programs (ssh -X allegro ...) and see the window on your workstation.
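As mentioned in the note on exclusive nodes above, a sketch of the Slurm directives that reserve a whole node for one job:
#SBATCH --nodes=1
#SBATCH --exclusive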
File System Paths
The following file system locations are interesting:
Service | Comment | Quota (soft/hard) | Backup
/home/$username | Extra user's home on allegro, fast InfiniBand-connected hard drives. | 20G/25G | Daily Zedat
/nfs/$normal-path | Things that are normally available through the network, for example /nfs/group, /nfs/home/..., ... | see ServicesFile | see ServicesFile
/data/scratch | Scratch directory for temporary data. It is a good idea to set your TMPDIR environment variable to /data/scratch/$USER/ after creating this directory. | no quota | NO backup
/data/scratch.local | Local scratch directory for temporary data in case you need the speed of a local disk; beware the limited space. | no quota | NO backup
Data can be copied from the /nfs/... paths to the home directory on allegro. Results and datasets you want to keep should be moved to /nfs/group/.
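A short sketch of that workflow; myresults and the group directory are placeholders, and the export is best placed in your shell profile or at the top of your job scripts:
$ mkdir -p /data/scratch/$USER
$ export TMPDIR=/data/scratch/$USER/
$ cp -r /data/scratch/$USER/myresults /nfs/group/<your-group>/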
Cluster Queue Commands
The cluster queue is managed by the Slurm Workload Manager.
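For quick reference, the everyday commands (all used in the examples above) are:
$ sbatch job_script.sh          # submit a job
$ squeue -u <username>          # list your jobs
$ scancel <job-id>              # cancel a job
$ sinfo -Nel                    # list nodes and partitions
$ scontrol show job <job-id>    # detailed information about one job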
Cluster Resource Policy
- The time limit that you give to your jobs is 'hard': jobs that are still running when their time limit is reached are killed.
- The memory limit that you give to your jobs is 'hard', too. If you allow the job 1024 MB but it uses 1034 MB, it will be killed immediately. The message you will receive for such an event reads: 'job violates resource utilization policies'.
- The core limit you give to your job is 'hard' as well: if you request 2 cores/threads but start 4 threads, all 4 threads will run, but each gets only 50% of the CPU time. (The sacct example below this list shows how to check what a finished job actually used.)
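To check after the fact how close a finished (or killed) job came to these limits, Slurm's accounting tool can report the elapsed time, peak memory, and final state, assuming accounting is enabled on allegro:
$ sacct -j <job-id> --format=JobID,Elapsed,MaxRSS,ReqMem,State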
Transition from torque
Before May 2015, allegro was running another scheduler (torque).