# Storage

## Storage Overview
There are four complementary filesystems where you can store your data.
| Name | Path | Alias | Backup | Purged | Usage | Disks |
|---|---|---|---|---|---|---|
| home | `/home/$USER` | `~` | No | No | 15 GB, 100k files (limit) | Redundant, SSD |
| data | `/data/$USER` | `~/data` | No | No | 200 GB (limit) | Redundant, SSD |
| scratch | `/scratch/$USER` | `~/scratch` | No | 30 days | 20 TB (limit) | Redundant, HDD |
| scalable storage | `/shares/<PROJECT>` | see below | No | No | quota per cost contribution; to increase the quota, contact us | Redundant, HDD |
## Filesystems

**Tip:** To see an overview of your storage usage, run the command `quota` from a login node.

**Warning:** If you exceed the quota, you will no longer be able to write to that filesystem. Check the FAQs below for guidance on cleaning up your file storage when you are over quota.
### Home

Each user has a home directory for configuration files, source code, and other small but important files. The directory has a limit of 100,000 files and 15 GB of used space. This quota makes it impractical for large data storage or software installations.
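You can check how close you are to the 100,000-entry limit by counting everything under your home directory:

```shell
# Count all files and directories under your home directory;
# compare the total against the 100,000-entry limit.
find "$HOME" | wc -l
```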
### Data

For persistent storage of larger files, you can use the data filesystem (`~/data` or `/data/$USER`). It has a limit of 200 GB and, like the other filesystems, it is not backed up. This filesystem is also appropriate for software installations.
### Scratch

The scratch filesystem (`~/scratch` or `/scratch/$USER`) is for the temporary storage of large input data files used during your calculations. Each user has a quota of 20 TB, and the maximum file size is 10 TB. Please note that this filesystem is meant for temporary storage only: according to the service agreement, any files older than 30 days are subject to deletion.
### Scalable Storage

Scalable group storage requires a cost contribution based on actual usage. The default permissions are set so that each member of the project has access to the shared folder, which can be found at `/shares/<PROJECT>` (replace `<PROJECT>` with your actual project name).
You can create a symlink called `shares` in your home directory that points to this shared group folder:

```shell
ln -s /shares/<PROJECT> ~/shares
```
## Further Storage Options
UZH Central IT (ZI) offers further storage options. For more details, please check General Topics → Data Storage.
- Data Archiving
- Data Publishing
- Network Storage (SMB- and NFS-based) with high availability and backup
  - For instructions on connecting to an SMB share, please check the article How to connect to a UZH NAS from the ScienceCluster.
  - Connection instructions for NFS shares will follow.
## FAQs

### I am over-quota. How can I clean up my file storage?
Consider storing large files in your scalable storage folder, which is in your project space; its path and usage are shown in the output of the `quota` command.
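For example, a large file can be moved out of your personal data directory into the shared project folder (a sketch: `large_dataset.tar` is a hypothetical file name, and `<PROJECT>` is your project name as above):

```shell
# Move a hypothetical large file from personal data storage into
# the shared project folder to free up the 200 GB data quota.
mv ~/data/large_dataset.tar /shares/<PROJECT>/
```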
Folders that typically grow in size with cache or temporary files are `.local` and `.cache`. To find the storage used by each subfolder of your `/home/$USER` and `/data/$USER` directories, run:

```shell
du -h --max-depth=1 /home/$USER /data/$USER
```
In addition, you may want to check the number of files in your `/home/$USER` directory with:

```shell
find /home/$USER | wc -l
```

The total number of files and directories must not exceed 100,000.
If you can no longer log in to the cluster, you can still connect using a plain terminal session:

```shell
ssh -t <shortname>@cluster.s3it.uzh.ch bash
```
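Once connected, you can inspect and then empty the usual cache folders (a sketch; review the sizes first, as deletion is permanent):

```shell
# Show how much the common cache folders occupy, then empty the cache.
du -sh ~/.cache ~/.local
rm -rf ~/.cache/*
```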
### Anaconda / Mamba

To clean up cached installation packages from Anaconda, run the following commands:

```shell
module load anaconda3
conda clean -a
pip cache purge
```
Or with Mamba:

```shell
module load mamba
mamba clean -a
pip cache purge
```
### Singularity

Check the Singularity cache setup and cleanup instructions in this article.
### Framework folders

Certain software frameworks (e.g., HuggingFace) cache files programmatically, which can be cleaned with their own commands. For example, with HuggingFace consider using:

```shell
huggingface-cli delete-cache
```
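Before deleting anything, you can check how much the cache occupies (assuming the default cache location `~/.cache/huggingface`; your setup may place it elsewhere, e.g. via the `HF_HOME` environment variable):

```shell
# Report the total size of the HuggingFace cache directory.
du -sh ~/.cache/huggingface
```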