HPC (vhpc) New User Guide
During its development, the latest version of HPC at KSU was coined vhpc, and the name has persisted through its launch in 2025. The articles below are introductory-level user guides and information for the vhpc and its role as a KSU research computing core facility.
-
Getting Started
The new HPC (vhpc) is a Research Computing Core Facility. These core services carry charges, and requests for them must be associated with a billable account.
To get started with a service, follow these steps.
1. Set up a billable account (from your Award or KSU SpeedChart #) using the Research Core Billable Account Form.
Note: this form can also be used to request a voucher for free credits.
2. Once you have a billable account, use the Research Core Service Request Form to add the vHPC service to your account.
3. Once your account is tied to the service, use the Research Core Service Management Form to add users who can charge to your account.
More about our core facility service forms.
-
Signing in to the vhpc
- Download GlobalProtect from vpn.kennesaw.edu and use it to connect to the vpn-groups portal.
- Use your terminal app to start an ssh session.
# ssh NetID@vhpc
On Windows, you can find PowerShell by searching within the Start menu. If you want additional features for working with the vHPC, consider downloading MobaXterm.
On Mac, you can find Terminal.app in the Applications -> Utilities folder. If you need X11 support, consider downloading the XQuartz app.
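If you connect often, an SSH client config entry can save typing. The sketch below is an assumption about a typical setup, not official vhpc guidance (the alias is arbitrary and YourNetID is a placeholder). Add it to ~/.ssh/config on your computer:
Host vhpc
    HostName vhpc
    User YourNetID
    ForwardX11 yes
After that, connecting (with X11 forwarding, if you installed XQuartz or MobaXterm) is just:
# ssh vhpc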
-
Transferring your files to the vHPC
The easiest way to upload and download files to the vHPC is to use an SFTP application.
Cyberduck is a free application for Windows and Mac that can help you upload, download, duplicate, and rename files, create and remove directories, and even edit vhpc files with your local apps.
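If you prefer the command line, the standard sftp client that ships with Mac, Linux, and recent Windows also works; a minimal sketch (the file names are placeholders):
# sftp NetID@vhpc
sftp> put mydata.csv
sftp> get results.out
sftp> exit
-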
How to test your code safely
After a successful login, user sessions start on the login node. Running computation on the login node is prohibited, because it can prevent other users from logging in.
To request a "sandbox" space on a compute node, use the interact command.
Example 1: To start a new session on a node with 2 cores for 1 hour.
# interact -A yourAccount
Example 2: To start a new session on 1 node, with 4 cores, access to 1 GPU, for 2 hours and with support for X11.
# interact -N 1 -n4 -G -X -t 2:00:00 -A yourAccount
To learn more: type interact --help
-
What is Slurm?
Slurm is an open-source job scheduler designed for Linux and Unix-like systems, primarily used in high-performance computing environments. It manages resource allocation, job execution, and monitoring for computing clusters, making it a key tool for running parallel jobs efficiently.
QuickStart User Guide
https://slurm.schedmd.com/quickstart.html
-
HPC Glossary
A Node is a computer in the cluster meant for processing jobs.
A CPU is the part of the computer that performs computation.
A core is the part of a CPU that executes instructions.
A task, in Slurm terms, is a process within a job; each task is assigned one or more cores (threads).
A GPU is an accelerator attached to the computer, with its own processors and memory, that specializes in linear algebra operations.
A job is a user's task(s) that requests and utilizes an allocation of resources to complete.
Walltime is the maximum run time a user requests for their job as part of the allocation.
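As a hedged illustration of how these terms show up in a Slurm batch request (the directive names are standard Slurm; the numbers are placeholders):
#SBATCH --nodes=1          # one node (one computer in the cluster)
#SBATCH --ntasks=2         # two tasks (processes)
#SBATCH --cpus-per-task=4  # four cores for each task
#SBATCH --gres=gpu:1       # one GPU
#SBATCH --time=02:00:00    # two hours of walltime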
-
Slurm Commands
While the Slurm system is composed of many components, the following commands are how most users will interact with it:
sacct - report on accounting information for active or completed jobs
salloc - a real-time request for resources that spawns a shell
sbatch - to submit a job script for later execution
scancel - to cancel a pending or running job
squeue - report on the state of jobs
srun - used to submit a job for execution
sstat - report on resources utilized by a running job
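As a quick sketch of these commands in use (the script name and job ID are placeholders):
# sbatch myjob.sbatch    # submit a job script
# squeue -u $USER        # list your pending and running jobs
# sstat -j 12345         # resources being used while job 12345 runs
# scancel 12345          # cancel job 12345
# sacct -j 12345         # accounting record once the job has finished
-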
PBS and Slurm
These translations are not an exact science, but they might help you early on; see the sketch after the lists below.
Equating PBS & Slurm
- qsub & sbatch
- showq & squeue
- qstat & sstat
- interact is still interact
- See Slurm's own rosetta stone
Old HPC & vhpc
- (batch, gpuq, himem) & defq
- scratch & work
- interact didn't change
- ksu-jupyter-notebook didn't change
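As a hedged sketch of what the queue change looks like in a job script (standard PBS and Slurm directive syntax; queue and partition names come from the lists above):
#PBS -q gpuq                # old HPC: pick a queue such as batch, gpuq or himem
#SBATCH --partition=defq    # vhpc: all jobs go to the defq partition
#SBATCH --gres=gpu:1        # on the vhpc, request GPUs explicitly when needed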
-
Slurm batch file
#!/bin/bash
#SBATCH --job-name=your_job_name
#SBATCH --output=%j.out # name the output with jobid.out
#SBATCH --nodes=1 # nodes requested
#SBATCH --cpus-per-task=4 # number of cores to use
#SBATCH --partition="defq" # default queue
#SBATCH --time=0-01:00:00 # wall time requested days-hh:mm:ss
#SBATCH --mail-type="BEGIN,END,FAIL,TIME_LIMIT_90"
#SBATCH --mail-user="NetID@kennesaw.edu"
#SBATCH --account="your_accountID"
#SBATCH --gres=gpu:1 # requested no. of gpus
#SBATCH --mem=48G # requested memory
module load whateveryouneed
your_app your_data
exit 0
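The module load whateveryouneed line is a placeholder; assuming the vhpc provides an environment-modules/Lmod style software stack, you can browse and test modules interactively before adding the matching load line to your script:
# module avail             # list the software modules available on the cluster
# module load python       # load one (this module name is only an example)
# module list              # confirm what is currently loaded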
Other SLURM examples
-
vhpc limits
- walltime: 720 hours
- memory: 503 GB
- cpu: 144 cores
-
vhpc costs
- vhpc CPU: $0.03 per core hour
- vhpc GPU: $0.12 per GPU hour
- vhpc RAM: $0.004 per GB per hour
- NOTE: In determining cost, the memory charge is not added to the CPU/GPU charge; instead, the larger of the two is used.
Formula to calculate the cost of a job: max(GPUs * GPU rate + CPUs * CPU rate, Memory * Memory rate) * (End Time - Start Time)
Examples:
- 1 hour job (2 cores, 1 GPU, 24 GB RAM) = $0.18 due to CPU/GPU
- 1 hour job (24 cores, 0 GPU, 240 GB RAM) = $0.96 due to memory utilization
- 1 hour job (48 cores, 4 GPUs, 503 GB RAM) = $2.01 due to memory; the CPU/GPU cost would only be $1.92
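If you want to estimate a cost before submitting, the formula scripts easily; a minimal sketch in awk using the rates above (the numbers reproduce the second example and are placeholders):
# awk -v cores=24 -v gpus=0 -v mem=240 -v hours=1 'BEGIN { cpu = cores*0.03 + gpus*0.12; ram = mem*0.004; printf "$%.2f\n", (cpu > ram ? cpu : ram) * hours }'
$0.96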
-
vhpc storage
Home (/gpfs/home/e001/your_NetID)
- Description: User home directories reside on stable storage that is backed up periodically. Home is suitable for small files (code & scripts) and software environments.
- Size: 7.3 TB
- Soft/hard limits: 25 GB / 30 GB
Work (/gpfs/data/e001/your_NetID)
- Description: Space optimized for speed during run time. Files should not be left in work; consider using work only during the run time of your jobs. Data can be moved there from Stage or CCStar (Ceph storage).
- Size: 172 TB
- Soft/hard limits: 30 days / 44 days
Stage (/gpfs/stage/e001/your_NetID)
- Description: Space for large files that won't fit in home, but which will be needed for jobs. Consider using Stage to upload your latest data from your desktop or CCStar (Ceph) before you move a copy to the work directory for job run time.
- Size: 122 TB
- Soft/hard limits: 90 days / 104 days
CCStar (Ceph) (/ccstar/your_projectname)
- Description: Space available for purchase at a rate of $28 per TB per year. Consider using CCStar (Ceph) storage to house data for longer terms than work or stage allow.
- Size: 3 PB
- Soft/hard limits: 1 TB / 1 TB
Daily automatic cleaning scripts will send out warning emails when soft limits are reached or if the file dates reach their time limits. Reaching the hard limits will result in deletion of files without further notification.
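A hedged sketch of moving data from stage into work for a job and cleaning up afterward (the dataset and script names are placeholders):
# cp -r /gpfs/stage/e001/your_NetID/my_dataset /gpfs/data/e001/your_NetID/   # copy from stage into work before the job
# sbatch myjob.sbatch                                                        # run the job against the copy in work
# rm -r /gpfs/data/e001/your_NetID/my_dataset                                # remove the copy from work when done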
Learn more on the vHPC wiki.
-
Transferring from HPC to the vHPC cluster
Using command-line:
- To move one file to the vHPC:
- Navigate to the file on the HPC
- scp filename yourNetID@vhpc:/gpfs/home/e001/yourNetID
- To move a folder and its contents:
- Navigate to the folder on the HPC
- scp -r foldername yourNetID@vhpc:/gpfs/home/e001/yourNetID/
Note: user home directories are limited to 25 GB of space. Plan to use the stage volume accordingly when transferring your data.
- To copy all of your home contents (if home < 25GB total) from HPC to your vHPC home:
- rsync -av ~ yourNetID@vhpc:/gpfs/home/e001/yourNetID
- To copy all of your home contents (if home > 25GB total) from HPC to a vHPC folder called "from_hpc" on the stage volume:
- rsync -av ~ yourNetID@vhpc:stage/from_hpc
Using an SFTP application (e.g., Cyberduck):
- Open a separate SFTP window to both servers.
- Drag and drop folders and files from HPC to vHPC.
