HPC cluster access

You can request access to our HPC cluster systems by filing a request with the BioinForMa service. It takes a few easy steps:
  • come to our offices (third floor, near the canteen, server room corridor)
  • we will provide you with a form to fill in and sign; read it carefully!
  • you will receive your access credentials. If you have an @szn.it mailbox you will also get access to our ticketing platform. The ticketing platform is the main channel for support requests, software installation, bug reports and troubleshooting
At the moment we have two HPC clusters:
  • Falkor is the stable (production stage) cluster (8 nodes, running on Debian GNU/Linux)
  • Kraken is the old-stable (production stage) cluster (2 nodes, no hyperthreading). Kraken is now obsolete and its use is deprecated. Please use Falkor for your tasks.

All accounts are synced between the two HPC clusters, and your home directory is mounted on both of them.

Falkor access

  • You can access Falkor via SSH at host 192.168.16.35
    ssh <your_username>@192.168.16.35
    
  • Access is enabled only from the internal SZN LAN
  • You can transfer files to and from the HPC cluster via scp (see the example after this list)
  • We are not enforcing a user quota on your home directory, but we do monitor usage. From time to time we may ask you to clean up to make room on the storage. For now we rely on a fair-usage policy
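
For example, copying files to and from Falkor with scp looks like this (the file and directory names below are placeholders; replace them with your own data):

    # upload a file from your workstation to your home directory on Falkor
    scp ./reads.fastq.gz <your_username>@192.168.16.35:~/

    # download a results directory from Falkor back to your workstation
    scp -r <your_username>@192.168.16.35:~/results ./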

Kraken access

  • You can access Kraken via SSH at host 193.205.231.59
    ssh <your_username>@193.205.231.59
    
  • Access is enabled both from the internal SZN LAN and from external networks
  • You can transfer files to and from the HPC cluster via scp
  • We are not enforcing a user quota on your home directory, but we do monitor usage. From time to time we may ask you to clean up to make room on the storage. For now we rely on a fair-usage policy

Main rules

  • All your jobs must be submitted to the scheduler queue. Our clusters use the SLURM workload manager for job queuing. Please have a look at the documentation below and at the quick example after this list
  • As a consequence, please do __not__ start jobs on the HPC cluster frontend
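
As a minimal sketch of this rule, instead of running a program directly on the frontend you hand it to SLURM (the blastp command line is only a placeholder for your own program):

    # do not run the analysis directly on the frontend like this:
    #   blastp -query proteins.fasta -db nr -out results.txt

    # submit it to the SLURM queue instead
    sbatch --job-name=myblast --cpus-per-task=4 --wrap="blastp -query proteins.fasta -db nr -out results.txt"

    # check the state of your jobs
    squeue -u <your_username>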

HPC user environment

  • Our HPC clusters use the
    module
    command to set up software environments as needed. Type
    module avail
    to get a list of available environments
  • To load a module, use the "module load" syntax. E.g., to load the bowtie 2.3.1 environment:
    module load bowtie/2.3.1
    If you want to unload a module:
    module unload bowtie/2.3.1
  • You are very welcome to suggest and propose new modulefiles, or even prepare your own. You can find a quick guide at this URL. A short example session is shown after this list
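
A typical interactive module session on the cluster might look like the following (only bowtie/2.3.1 is taken from the example above; other module names will differ, so check module avail on Falkor):

    # list the environments available on the cluster
    module avail

    # load the bowtie 2.3.1 environment
    module load bowtie/2.3.1

    # show which modules are currently loaded
    module list

    # unload it when you are done, or drop all loaded modules at once
    module unload bowtie/2.3.1
    module purge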

Software

  • There is a multitude of software installed at the moment. Have a look at this page for more details
  • You can request the installation of new software. Depending on the programming language and/or the software dependencies, we will find the best way to install it. This process is usually cooperative: please expect to be asked for feedback

Workload manager: SLURM

  • You are asked to launch your jobs using the default workload manager, SLURM. SLURM is very flexible and customizable, with a moderate learning curve. Please have a look at the SLURM documentation; for a quick guide, see this page as well
  • To get information about the current status of the nodes and the available queues, type the following command in a bash shell on the cluster:
    sinfo
  • We provide some sample SLURM batch files on this page; a minimal sketch is also shown after this list
  • SLURM and MPICH2 can talk to each other in different ways. An early knowledge base is here
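
As a starting point, here is a minimal sketch of a SLURM batch file. The partition name, resource values and the bowtie2 command line are assumptions for illustration; adapt them to the queues reported by sinfo and to your own analysis:

    #!/bin/bash
    #SBATCH --job-name=bowtie_test          # job name shown by squeue
    #SBATCH --partition=<partition_name>    # pick a partition listed by sinfo
    #SBATCH --ntasks=1                      # a single task
    #SBATCH --cpus-per-task=8               # CPU cores for that task
    #SBATCH --mem=16G                       # total memory for the job
    #SBATCH --time=02:00:00                 # wall-time limit (hh:mm:ss)
    #SBATCH --output=bowtie_%j.log          # stdout/stderr file (%j = job id)

    # load the software environment inside the job
    module load bowtie/2.3.1

    # placeholder command: replace with your own program
    bowtie2 -p "$SLURM_CPUS_PER_TASK" -x genome_index -U reads.fastq.gz -S aligned.sam

Save it, for example, as myjob.sbatch and submit it with:

    sbatch myjob.sbatch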

Data backup

  • The HPC cluster is now running quite stably after having suffered serious faults in the past. At the moment we have no backup facilities. Please keep a copy of your data (see the example after this list).
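
One simple way to keep an off-cluster copy is to pull your data to your workstation periodically, for example with rsync (assuming rsync is installed on your workstation; the directory names are placeholders):

    # run this on your workstation, not on the cluster
    rsync -avz --progress <your_username>@192.168.16.35:~/my_project/ ~/falkor_backup/my_project/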

Programming languages knowledge base

This section will collect documentation about language-specific tasks, parallelization and running tasks under SLURM.

Quick links

Some quick links you can reach from the internal SZN network segment (at the moment):