GLADE file spaces

Overview

The Globally Accessible Data Environment – a centralized file service known as GLADE – uses high-performance Spectrum Scale and Lustre shared file system technology to give users a common view of their data across the HPC, analysis, and visualization resources that CISL manages.

Data can remain in each of these spaces in accordance with the policies summarized in the following table and detailed below. The policies are subject to change; all changes will be announced in advance. Note that "Not purged" does not mean that your files can or will reside in these spaces indefinitely, only that retention depends on your particular situation. Refer to the detailed retention practices and policies for each file space.

As disk space is a limited resource shared by the entire user community, long-term storage of infrequently accessed data should be avoided. These spaces should be used as efficiently as possible so that other projects can take advantage of the resource. The Quasar tape storage system is more appropriate for long-term storage of infrequently accessed data.

File space         Path                                User quota (size / count)   Backup   Purge policy
Home               /glade/u/home/<username>            50 GB                       Yes      Not purged
Scratch            /glade/derecho/scratch/<username>   30 TB / 10 million files    No       Purged after 180 days without access
Work               /glade/work/<username>              2 TB                        No       Not purged
Campaign Storage   /glade/campaign                     N/A                         No       Not purged

Descriptions/notices

  • Home: User home directory. Ideal for small scripts, source code, and configuration files that benefit from backup.
  • Scratch: Temporary space. Derecho's scratch file system also limits each user's total number of files to 10 million.
  • Work: User work space. Ideal for compiled code, conda environments, and similar large holdings that do not require backup.
  • Campaign Storage: Project space allocations (via allocation request).
GLADE system status report

Best Practices

Check your space usage regularly with gladequota.

Check your space usage regularly with gladequota as described below, and remove data that you no longer need.

Conserve space with large files.

You can conserve GLADE space by storing a few large files, such as tar files, rather than numerous small, individual files. The system allocates a minimum amount of space to every file (currently configured as 16 KB), no matter how small the file is. This minimum applies even to empty files, directories, and symlinks.
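The effect of the minimum allocation can be estimated with a few lines of Python. This is a rough sketch, not a description of how the file system accounts for space: it assumes the only overhead is the 16 KB minimum stated above, and the function name and the example file sizes are made up for illustration.

```python
MIN_ALLOC_BYTES = 16 * 1024  # minimum per-file allocation stated above (16 KB)


def allocated_bytes(file_sizes, min_alloc=MIN_ALLOC_BYTES):
    """Lower-bound estimate of the space consumed by a set of files.

    Assumption: each file occupies at least min_alloc bytes and otherwise
    its own size. Real accounting may round up further.
    """
    return sum(max(size, min_alloc) for size in file_sizes)


# Hypothetical example: 100,000 files of 1 KB each versus one archive
# holding the same data.
small_files = [1024] * 100_000
one_archive = [sum(small_files)]

print(allocated_bytes(small_files))  # 1638400000 bytes (about 1.5 GiB)
print(allocated_bytes(one_archive))  # 102400000 bytes (about 98 MiB)
```

Under these assumptions the many small files occupy 16 times the space of a single archive containing the same data.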

Home space

Each user has a /glade/u/home/<username> space with a quota of 50 GB* for managing scripts, source code, and small data sets. The space is backed up on the schedule described under Backup policy below. CISL also creates snapshots of the space to enable users to recover deleted files quickly and easily.

* As discussed below, the system stores dual copies of users' data for increased data integrity and safety. Because of this replication, some space reporting tools will show 100 GB.

Backup policy

  • Your /glade/u/home directory is backed up several times a week while your account is open. Each backup is kept for several weeks. The frequency and scheduling of backups and the length of time they are kept may change with no prior notice.
  • If you are unable to find files that you would like to restore in the snapshots of your home directory, contact the NSF NCAR Research Computing help desk to request restoration of the files if they are in a backup.
  • Core dump files are not backed up. Core dump file names typically follow this format: core.xxxxx (where the extension can include from one to five digits).
  • CISL does not purge or scrub your home directory, and deletes files only as stated in the following data retention policy.

Data retention policy

  • User accounts are closed 90 days after your last active project expires or 90 days after your account is removed from all active projects.
  • When your account is closed, files will remain in your home directory for 30 days, after which CISL backs up the final contents and removes them from the file system. This backup is retained for six months after account closure.
  • Note that your project and group memberships are also terminated when your account is closed, which can limit other users' ability to access your files based on shared group permissions.
  • As long as your account remains active, files are retained in your home directory.
  • Again, core dump files are not backed up.

Work space

Your /glade/work/<username> space is best suited for actively working with datasets over time periods greater than what is permitted in the scratch space. The default quota for these spaces is 2 TB. This space is not purged or scrubbed. CISL deletes files only as stated in the following data retention policy.

Data retention policy

  • User accounts are closed 90 days after your last active project expires or 90 days after your account is removed from all active projects.
  • When your user account is closed, files in your work space are retained for an additional 30 days before being deleted.
  • Files are not recoverable from backups, as there are none.
  • As long as your account remains open, your work directory files are retained.

Scratch file space

Each user has a /glade/derecho/scratch/<username> space by default, with an individual quota of 30 TB. The scratch file space is intended to support output from large-scale capability runs as well as computing and analysis workflows across CISL-managed resources. It is a temporary space for data to be analyzed and removed (or automatically purged).

If you will need to occupy more than your quota of scratch space at some point, contact the NSF NCAR Research Computing help desk to request a temporary increase. Include a paragraph justifying your need for additional space when making your request.

Purge policy

Individual files are removed from the scratch space automatically if they have not been accessed (for example, modified, read, or copied) in more than 180 days. For I/O efficiency, a file's access time (atime) is updated at most once per day. To check a file's atime, run ls -ul filename.
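To review many files at once, the same atime check can be scripted. The following is a minimal sketch, assuming the 180-day window stated above; the function names and the 150-day warning threshold are illustrative choices, not part of any CISL tool. It only reads file metadata, so it does not change any timestamps.

```python
import os
import time

PURGE_DAYS = 180          # purge window stated in the policy above
SECONDS_PER_DAY = 86400


def days_since_access(path, now=None):
    """Return the number of days since the file's atime."""
    now = time.time() if now is None else now
    return (now - os.stat(path).st_atime) / SECONDS_PER_DAY


def files_nearing_purge(root, warn_days=150, now=None):
    """Yield (path, idle_days) for files under root idle for at least warn_days.

    warn_days is a hypothetical threshold chosen to leave time to act
    before PURGE_DAYS is reached.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                idle = days_since_access(path, now)
            except OSError:
                continue  # skip broken symlinks and files removed mid-walk
            if idle >= warn_days:
                yield path, idle


if __name__ == "__main__":
    # Placeholder path: substitute your own scratch directory.
    for path, idle in files_nearing_purge("/glade/derecho/scratch/<username>"):
        print(f"{idle:6.1f} days  {path}")
```

Walking a directory tree with millions of files generates substantial metadata traffic, so a script like this is best pointed at specific subdirectories rather than an entire scratch space.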

Warning

Users may not run touch or similar commands for the purpose of altering their files' timestamps to circumvent this purge policy.

CISL staff will reduce the scratch quotas of users who violate this policy; running jobs may be killed as a result.

CISL routinely monitors scratch space usage to ensure that it remains below the 90% mark and to determine if a reduction in the retention period is necessary. We will announce in advance any changes to the retention period.

To help us avoid the need to shorten the retention period, please use this space conscientiously. Delete files that you no longer need as soon as you're done with them rather than leave large amounts of data sitting untouched for the full 180 days. If you need to retain data on disk for more than 180 days, consider using your /glade/work space or Campaign Storage space.

Data retention policy

  • Whether your account is open or closed, files are purged from your scratch space automatically when they have not been accessed in 180 days.
  • Files are not recoverable from backups, as there are none.

Campaign Storage space

Dedicated spaces on the /glade/campaign file system are available to project teams through our allocations process to support longer-term disk needs that are not easily accommodated by the scratch or work spaces. The allocations are based on project needs and resource availability. Requests are reviewed according to the various allocation schedules.

If you have a user account associated with an allocated project but lack the directory permissions you need for that project’s Campaign Storage space, contact the NSF NCAR Research Computing help desk to request changes. Identify the directories and the permissions you are requesting.

See here for additional technical details on the Campaign Storage file system.

Access reports

CISL generates weekly usage reports throughout /glade to help users manage their data. Each report summarizes when files were last accessed and how much space is used, and gives details for the top 25 users. The files are named access_report.txt and can be found at the top level of shared file spaces, for example /glade/campaign/<group_name>/access_report.txt (or similar, depending on the project).

Campaign Storage allocations

The duration and use of a project’s Campaign Storage space is controlled by NSF NCAR allocation policies. Specifically, the following policies apply to Campaign Storage allocations:

  • Campaign Storage is not intended to provide long-term or permanent preservation of a project’s data. For university users, long-term preservation of project data remains the responsibility of your home institution or the lead institution on the funded NSF award. Allocation requests should describe the PI’s plan for dealing with Campaign Storage data after the NSF award expires.
  • Campaign Storage quotas will not be expanded to allow data from an earlier, expired project to be retained under a new NSF award. That is, researchers may not use a new NSF award to preserve data from an earlier award. Campaign Storage space requests must be justified in terms of the work associated with the active NSF award.
  • Campaign Storage allocations (as with computing allocations) can remain active for the duration of the project’s funded NSF award, including any no-cost extensions.
  • With suitable justification, CISL may be able to extend a Campaign Storage allocation for an additional 12 months beyond the end of the NSF award. Suitable justification may include papers or other works proceeding through the publication review process. CISL reserves the right to decline these extra extensions or reduce Campaign Storage quotas after the end of the NSF award in order to ensure that disk space remains available to the community of researchers with active NSF awards.
  • Once a project has exhausted all policy options and ultimately expires, files are removed and the Campaign Storage space is reclaimed according to the following data retention policy.

Data retention policy

  • 90 days after project expiration, project directories are emptied and scheduled for deletion.
  • Individual users' files are not deleted from project space after their accounts are closed when the project remains active.
  • Files are not recoverable from backups, as there are none.

Checking space usage

Knowing your quotas and usage is important.

All files that you own in the home, work, or scratch spaces are counted against your quota for that space, regardless of the directory in which they are stored. That is, if you write files to another user's home or scratch space, for example, they still count against your own quota for that space.
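Because ownership, not location, determines which quota a file counts against, it can be useful to total the files you own inside a directory that belongs to someone else. The sketch below is one way to do that; the function name is hypothetical, and it sums apparent file sizes, which may differ from what the file system charges against a quota.

```python
import os


def owned_usage(root, uid=None):
    """Return (file_count, total_bytes) for files under root owned by uid.

    uid defaults to the current user. Sizes are apparent sizes (st_size),
    an approximation of quota usage rather than an exact figure.
    """
    uid = os.getuid() if uid is None else uid
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_uid == uid:
                count += 1
                total += info.st_size
    return count, total
```

For example, owned_usage("/glade/derecho/scratch/<other_user>/shared") would report how much of that directory is made up of files you own, where the path shown is a placeholder.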

If you reach your disk quota for a GLADE file space (see gladequota below), you may encounter problems until you remove files to make more space available. For example, you may not be able to log in, the system may appear hung, you may not be able to access some of your directories or files, your batch jobs may fail, and commands may not work as expected.

If you cannot log in or execute commands, contact the NSF NCAR Research Computing help desk. You can check your space usage as shown below.

gladequota command

This command will generate a report showing your quota and usage information:

$ gladequota

Current GLADE space usage: csgteam

  Space                                       Used        Quota      % Full    # Files
----------------------------------------- ------------ ------------ --------- -----------
/glade/derecho/scratch/csgteam              111.03 TiB   400.00 TiB   27.76 %    10226778
/glade/work/csgteam                           1.40 TiB     4.00 TiB   35.02 %     9153458
/glade/u/home/csgteam                        68.27 GiB   100.00 GiB   68.27 %      237251
----------------------------------------- ------------ ------------ --------- -----------
/glade/collections/cmip                       1.26 PiB     2.93 PiB   42.95 %     3326866
/glade/p/cisl/CSG                             1.69 TiB     5.00 TiB   33.79 %     8168998
/glade/u/apps                                 2.02 TiB    10.00 TiB   20.19 %    20038473
----------------------------------------- ------------ ------------ --------- -----------
Campaign: csgteam (user total)                1.41 TiB          n/a       n/a     9891104
/glade/campaign/cisl/csg                      7.94 TiB    23.00 TiB   34.54 %        2859
/glade/campaign/cisl/sssg0001               217.15 TiB   250.00 TiB   86.86 %    50606617
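
A report in this layout is straightforward to scan programmatically. The following sketch flags spaces at or above a chosen fill percentage. It assumes the column layout shown in the sample above (the real output could differ), and the 80% threshold, regular expression, and function name are illustrative.

```python
import re

# A few lines copied from the sample report above.
SAMPLE = """\
/glade/derecho/scratch/csgteam              111.03 TiB   400.00 TiB   27.76 %    10226778
/glade/u/home/csgteam                        68.27 GiB   100.00 GiB   68.27 %      237251
Campaign: csgteam (user total)                1.41 TiB          n/a       n/a     9891104
/glade/campaign/cisl/sssg0001               217.15 TiB   250.00 TiB   86.86 %    50606617
"""

# One data row: space name, used, quota, percent full, file count.
ROW = re.compile(
    r"^(?P<space>\S.*?)\s{2,}"   # space name, ended by a run of spaces
    r"[\d.]+ \w+\s+"             # used, e.g. "111.03 TiB"
    r"[\d.]+ \w+\s+"             # quota, e.g. "400.00 TiB"
    r"(?P<pct>[\d.]+) %\s+"      # percent full
    r"\d+\s*$"                   # number of files
)


def spaces_over(report_text, threshold_pct=80.0):
    """Return [(space, percent_full)] for rows at or above threshold_pct.

    Header, separator, and "n/a" rows do not match ROW and are ignored.
    """
    hits = []
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match and float(match.group("pct")) >= threshold_pct:
            hits.append((match.group("space"), float(match.group("pct"))))
    return hits


print(spaces_over(SAMPLE))  # [('/glade/campaign/cisl/sssg0001', 86.86)]
```

To run this against live data, the text could be captured from the gladequota command, for example with Python's subprocess module, instead of using the embedded sample.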

gladequota and home directories

Output from the gladequota command will show the home space quota as 100 GB instead of 50 GB. This is because the system stores dual copies of users' data for increased data integrity and safety. In some circumstances, queries of storage utilization from du and ls will also report a duplicated data footprint in your home directory for the same reason.


General data retention policies and procedures

User data

User accounts are closed when they are no longer associated with an active project. When your account is closed, several steps in the process affect the accessibility of your data:

  • 30 days after your account is closed, a final home directory backup is performed and the home directory is deleted. The home backup is retained for six months, after which it is deleted.
  • 30 days after your account is closed, your work space is deleted. No backup is performed.
  • Finally, your scratch files are deleted as they reach the end of the current retention period.

CISL has no ability to restore user data after the purge or removal policies stated above have taken effect.
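
The day counts in these policies can be turned into calendar dates with simple date arithmetic. This sketch assumes the intervals stated in this document (account closure 90 days after the last active project expires, then home and work removal 30 days after closure). The function name and the example date are hypothetical, and the six-month backup retention is left out because it is not expressed in days.

```python
from datetime import date, timedelta


def account_timeline(last_project_expiry):
    """Return key dates that follow from the last active project's expiration."""
    closed = last_project_expiry + timedelta(days=90)
    removed = closed + timedelta(days=30)
    return {
        "account_closed": closed,
        "home_and_work_removed": removed,
    }


# Hypothetical expiration date, for illustration only.
for event, when in account_timeline(date(2025, 1, 1)).items():
    print(event, when.isoformat())
# account_closed 2025-04-01
# home_and_work_removed 2025-05-01
```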

Campaign Storage (Project) data

The duration of a project’s Campaign Storage space is controlled by the NSF NCAR allocation policies listed under Campaign Storage allocations above.

When an allocated project approaches and reaches its expiration date, the following process is used to delete that project’s associated Campaign Storage data:

  • 30 days before project expiration, the project lead and admin will receive an email reminding them of the project’s pending expiration. The project team should assess the remaining files and arrange for their disposition in preparation for the end of the project.
  • 90 days after project expiration, project directories are emptied and scheduled for deletion. At this point, any users with accounts remaining on the system will no longer have access to the project's data. It is therefore imperative that any remaining project data be relocated during this 90-day post-expiry access window.
  • Finally, files are removed as scheduled on the timeline described above for the relevant file system.
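
A project team can work out its own key dates from the expiration date. The sketch below assumes the two intervals given in the list above (a reminder 30 days before expiration and directories emptied 90 days after). The function name and example date are made up for illustration.

```python
from datetime import date, timedelta


def campaign_timeline(project_expiry):
    """Return the reminder date and the end of the post-expiry access window."""
    return {
        "reminder_email": project_expiry - timedelta(days=30),
        "project_expires": project_expiry,
        "directories_emptied": project_expiry + timedelta(days=90),
    }


# Hypothetical project expiring on 30 June 2025.
for event, when in campaign_timeline(date(2025, 6, 30)).items():
    print(event, when.isoformat())
# reminder_email 2025-05-31
# project_expires 2025-06-30
# directories_emptied 2025-09-28
```

Any data the project still needs should be relocated before the last of these dates.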

Restoring data access

Expired project data

There is no expectation of file access after the 90-day post-expiry access window.

Collaborators' data

A typical request for data access comes not from the departing user, but from remaining collaborators. Colleagues occasionally request access to a departed user’s files, sometimes many months after the account is terminated, when they realize the original owner set permissions that limit their access.

While CISL has a limited ability to help in these situations, there are also legal limits to what we can do. For example, CISL cannot share files beyond the clear intent of the original owner as inferred from the UNIX file permissions.

Project files

Files in a project’s Campaign Storage space are deemed owned by the project, and we can help update permissions as needed to make them visible to the group. Such modifications require the approval of the project lead or project admin.

Users' files

Files in the home, work, or scratch spaces are owned by the user. If a collaborator would like access to a file that was previously group- or world-readable, we may be able to help. If the original file was restricted to user-only read, however, we cannot override those intentions without explicit permission from the original owner. The only exceptions to this policy are in compliance with broader UCAR IT records or investigation policies as described in UCAR's 1-7 Information Security Policy.
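
Users who want collaborators to retain access can check a file's permission bits before leaving a project. This is a minimal sketch using Python's standard stat module; the function name is hypothetical. It inspects only the file's own mode bits and does not consider parent directory permissions or any access control lists, both of which also affect whether others can read a file.

```python
import os
import stat


def readable_beyond_owner(path):
    """Report whether the group and world read bits are set on path."""
    mode = os.stat(path).st_mode
    return {
        "group": bool(mode & stat.S_IRGRP),
        "world": bool(mode & stat.S_IROTH),
    }
```

A file with mode 640, for instance, would return {"group": True, "world": False}, while a file with mode 600 is readable by its owner only.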