HPC: Boston University

Logging In

After setting up your account password (see your email), log in with ssh, replacing username with your own account name:

ssh username@scc1.bu.edu

Once logged in, you can interact with the terminal as usual (try ls, mkdir, cd).

mkdir BIO331
cd BIO331

The “Cluster”

[Figure: BU Cluster organization]

A single CPU is not powerful enough to handle large workloads from many users at once. Instead, HPC systems are built by linking many compute nodes into a network of computers that can be used together.

This means that we have to think about computing on these resources in terms of “jobs” rather than interactive typing at the terminal or console. To get there, the code we run has to be packaged into scripts that are fully reproducible. That way, code developed on your laptop (or the course RStudio server) can be moved to the HPC compute environment.
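As a minimal sketch of what packaging code into a script can look like, the example below writes a small shell script and runs it without any interactive typing. The file name run_analysis.sh and the temporary working directory are hypothetical, and the closing comment assumes the scheduler offers a submission command such as qsub, which is typical of PBS-style systems but is not confirmed here.

```shell
# Hypothetical working directory for this sketch (a temporary folder)
workdir=$(mktemp -d)

# Write a small, self-contained script; every step it needs is in the file
cat > "$workdir/run_analysis.sh" <<'EOF'
#!/bin/bash
# Everything the job needs is written down here, so it can run unattended
echo "Job started on $(uname -n)"
echo "Working directory: $(pwd)"
echo "Job finished"
EOF

# Make the script executable and run it non-interactively
chmod +x "$workdir/run_analysis.sh"
job_output=$(bash "$workdir/run_analysis.sh")
echo "$job_output"

# On a cluster, a script like this would typically be handed to the scheduler
# (for example with a command such as qsub) rather than run directly.
```

Because the script contains every step, the same file can be run on a laptop or on the cluster and should behave the same way, provided the software it calls is available in both places.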

The dispersed nature of HPC work also means that users have to work with queueing software that schedules jobs for us.

The “Queue”

The BU cluster uses PBS-style queueing software, a common kind of job management system.

View running and queued jobs (checking on jobs or compute load):

qstat
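On a busy cluster the full qstat listing can be long. The sketch below narrows it to your own jobs with the -u flag and first checks that qstat exists, so it fails gracefully when run somewhere other than the cluster. The variable qstat_found is a hypothetical name used only for this illustration.

```shell
# Work out the current user name, falling back to id if USER is unset
me="${USER:-$(id -un)}"

# Only call qstat if the scheduler tools are installed on this machine
if command -v qstat >/dev/null 2>&1; then
    # -u limits the listing to jobs owned by the given user
    qstat -u "$me"
    qstat_found="yes"
else
    echo "qstat not found: run this on the cluster login node"
    qstat_found="no"
fi
```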

Storage

Space in your home directory is limited. Work instead in our class project folder, where you have up to 50GB of storage.

cd /project/ct-shbioinf

mkdir username
cd username
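Since storage is limited, it can help to check how much space a folder is using. The sketch below relies only on the standard du and df tools. The variable target_dir is a hypothetical name; on the cluster you might point it at your own folder in the class project space. Note that df reports free space on the whole filesystem, which may not match any per-project limit the system enforces separately.

```shell
# Directory to inspect (here: the current directory)
target_dir="."

# Total size of everything under the directory, in human-readable units
usage=$(du -sh "$target_dir" | cut -f1)
echo "Space used by $target_dir: $usage"

# Free space on the filesystem that holds the directory
df -h "$target_dir"
```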

You should mostly move scripts and small data files to the HPC with git. The simplest approach is to put the code you plan to run in a git repository and then copy it to the HPC with the command ‘git clone’, e.g.,

git clone https://github.com/rsh249/bio331_geospatial
cd bio331_geospatial

#just check where you are
pwd

When you make updates to this code, run the following command from inside the repository folder to pull the updates from the git repository.

git pull
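The clone-then-pull routine can be wrapped in one small helper, sketched below. The function name sync_repo is hypothetical (it is not a git command): it clones the repository the first time and pulls updates on later calls. The example call is commented out because it needs network access.

```shell
# sync_repo URL [DIR]: clone on first use, pull updates afterwards.
# sync_repo is a hypothetical helper name, not part of git.
sync_repo() {
    repo_url="$1"
    # Default folder name: last part of the URL, without a trailing ".git"
    repo_dir="${2:-$(basename "$repo_url" .git)}"
    if [ -d "$repo_dir/.git" ]; then
        # Already cloned: bring the existing copy up to date
        git -C "$repo_dir" pull
    else
        # First time: make a fresh copy of the repository
        git clone "$repo_url" "$repo_dir"
    fi
}

# Example call (commented out because it needs network access):
# sync_repo https://github.com/rsh249/bio331_geospatial
```

Running the same call again later would update the existing folder instead of failing because the folder already exists.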

Software

Because there are many users on the HPC system and each might need different software (programs, versions, etc.), very few programs are installed by default. Instead, to use a program you will either need to install it locally or load one of the system-installed software modules (using the modules is the recommended route).

#look at available modules
module avail

#load a current version of R
module load R

#Run R
R
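The same steps can be placed in a script so that R is loaded without interactive typing. This is a minimal sketch: it assumes the module command is available in the shell that runs the script, which may not hold for every non-interactive shell, and it uses Rscript, the script runner that ships with R, only to print the version. The variable modules_found is a hypothetical name for this illustration.

```shell
# The module command only exists on systems with environment modules set up,
# so check for it before using it
if command -v module >/dev/null 2>&1; then
    module load R        # load the default R version
    module list          # show which modules are now loaded
    Rscript --version    # confirm that R is now on the PATH
    modules_found="yes"
else
    echo "module command not found: run this on the cluster"
    modules_found="no"
fi
```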

Next steps

We will develop scripts that can be run on the HPC and explore more of the tools needed to work on the server.
