Moving Data to/from SCG¶
Globus¶
When moving large amounts of data, whether large in total size, large in file count, or both, Globus is a good choice. It offers an easy-to-use web interface and a personal connect client (Globus Connect Personal) that can be installed on your laptop, desktop, or any other system you have access to, and it authenticates using your Stanford login. The Globus tools can be used to move, copy, or sync data, and will retry in the background on errors.
Almost all parts of SCG are accessible through Globus, using these collection names & IDs:
SCG Path | Name (UUID) | Sharable? |
---|---|---|
~ | SRCC SCG Home (2e23906b-0608-45bb-b344-393b8706e862) | No |
/labs | SRCC SCG Lab Storage (3257fc54-9071-42fa-88ca-6097b2679b9a) | Yes |
/projects | SRCC SCG Project Storage (2a975852-c740-4ff0-b8df-bc66d4888fc9) | Yes |
/public | SRCC SCG Public (9299a0f9-06db-4109-910b-d3b590be2440) | Yes |
/reference | SRCC SCG Reference Data (670e9ef9-70c9-46df-b213-f878e36cfe72) | No |
/storage | SRCC SCG Other Storage (87ab9eb7-a4cb-4bf1-811c-c2cebac2a695) | Yes |
/gssc | Stanford GSSC Storage (ce40ee4c-bf06-4847-bd24-f4b41a6f2581) | Yes |
/BaaS | SRCC SCG BaaS Storage (457ce307-ed67-4b07-803e-865ead9c1702) | No |
(The old SCG Cluster Storage endpoint is now deprecated, and should not be used.)
Collection names above are links. Click the link to be taken to the Globus web app and load the collection. To transfer files, click on the button “Open in File Manager”. To make a share or view your existing shares, click on the “Collections” tab.
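For scripted transfers, the same collections can also be addressed by UUID from the Globus command-line client. The following is only a sketch, assuming the Globus CLI is installed and you have already run globus login. The helper name scg_labs_transfer is hypothetical, and so are the paths in the usage note; use globus ls with a collection UUID to confirm how paths appear inside a collection before transferring anything.

```shell
# UUID of "SRCC SCG Lab Storage", copied from the table above.
SCG_LABS="3257fc54-9071-42fa-88ca-6097b2679b9a"

# Hypothetical helper (not provided by SCG or Globus).
#   $1 = UUID of your source collection (e.g. Globus Connect Personal)
#   $2 = source path inside that collection
#   $3 = destination path inside the SCG lab storage collection
# --recursive copies the whole directory tree, and --sync-level checksum
# skips files whose checksums already match at the destination.
scg_labs_transfer() {
    globus transfer \
        --recursive \
        --sync-level checksum \
        --label "transfer to SCG labs" \
        "$1:$2" "${SCG_LABS}:$3"
}
```

A call might look like scg_labs_transfer YOUR-COLLECTION-UUID /mydrive/mydata/ /ruthm/mydata/, where both the UUID and the paths are placeholders to replace with your own.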
SCG OnDemand File App¶
The SCG OnDemand File App (https://ondemand.scg.stanford.edu/) offers an intuitive interface to navigate SCG storage and upload or download files. It also includes built-in tools to view and edit files in the web browser.
Samba¶
The Samba server at samba.scg.stanford.edu presents SCG storage to Stanford campus networks and the VPN, making it easy to mount the storage as a shared drive on your local system. Basic instructions and troubleshooting for each major operating system are below, or you can try this direct link if you are feeling lucky and aren’t using Windows:
Linux¶
Open a terminal and run kinit SUNetID@stanford.edu, replacing SUNetID with your SUNetID. For example,
griznog@gambusia:~$ kinit griznog@stanford.edu
Password for griznog@stanford.edu:
griznog@gambusia:~$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: griznog@stanford.edu
Valid starting Expires Service principal
07/15/2019 13:08:37 07/16/2019 14:08:27 krbtgt/stanford.edu@stanford.edu
renew until 07/22/2019 13:08:27
griznog@gambusia:~$
In Linux, open Nautilus or your distribution’s file manager app and, in the path area or under a menu selection like Connect to server..., enter smb://samba.scg.stanford.edu/. You should be presented with a window displaying the available network shares. The special share labeled with your SUNetID is your SCG $HOME directory.
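If you prefer the command line, the ticket check and the mount can be sketched as a small shell helper. This is a sketch under assumptions, not an SCG-provided tool: it assumes MIT Kerberos's klist (whose -s flag prints nothing and exits non-zero when there is no valid ticket) and GLib's gio command are installed, and the function name mount_scg_share is hypothetical.

```shell
# Hypothetical helper: mount one share from the SCG Samba server.
#   $1 = share name, e.g. your SUNetID for your SCG $HOME directory
mount_scg_share() {
    # klist -s is silent and only reports ticket validity via its exit status.
    if ! klist -s; then
        echo "No valid Kerberos ticket; run: kinit SUNetID@stanford.edu" >&2
        return 1
    fi
    # gio mount uses the same backend as the file manager's "Connect to server".
    gio mount "smb://samba.scg.stanford.edu/$1"
}
```

For example, mount_scg_share griznog would attempt to mount the share labeled griznog after confirming that a ticket exists.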
Mac OS X¶
Open a terminal and run kinit SUNetID@stanford.edu, replacing SUNetID with your SUNetID.
Open Finder, select Connect to server, and enter smb://samba.scg.stanford.edu/. You should be presented with a Finder window displaying the available network shares. The special share labeled with your SUNetID is your SCG $HOME directory.
Windows¶
Open the File Explorer, and enter the following path in the “address bar”:
\\samba.scg.stanford.edu\
At the login prompt, use sunetid@stanford.edu as the username and your SUNet password as the password. For example, if your SUNetID is abc, use abc@stanford.edu as your username.
If login still fails, you will need to do some one-time configuration to tell your computer how to authenticate properly. Run these commands in the Command Prompt application (which you will need to run as Administrator):
ksetup /addkdc stanford.edu krb5auth1.stanford.edu
ksetup /addkdc stanford.edu krb5auth2.stanford.edu
ksetup /addkdc stanford.edu krb5auth3.stanford.edu
ksetup /addhosttorealmmap .stanford.edu stanford.edu
The first three commands will give you a warning: “Your realm name stanford.edu has lowercase letters.” This is expected, and you should say “Yes” at the prompt.
Once those commands have run, you will need to restart your computer. After that, you should be able to log in.
rsync¶
rsync is a time-tested tool for moving data both between remote systems and locally. With a large number of options and features, it’s impossible to completely cover all potential uses of rsync, but we are able to show how we recommend using rsync with SCG.
Basic rsync usage is…
rsync [options] [user@host:]/path/to/source [user@host:]/path/to/target
Note that only one of source and target can be a remote host. An example of copying files from my local system to SCG (including our preferred options) is…
rsync -rltp --chmod Dg+s -v --partial --progress /mydrive/mydata griznog@login.scg.stanford.edu:/labs/ruthm/
The above example includes the following options:
-r
: Recurse into directories.
-l
: Copy symlinks as symlinks.
-t
: Preserve modification times.
-p
: Preserve permissions.
--chmod Dg+s
: Set the “setgid” bit on all directories. This is needed in SCG /labs and /projects storage for permissions to work properly.
-v
: Display a list of every file that is transferred; in general, be more verbose about what rsync is doing.
--partial
: If the transfer fails, try to reuse any partially copied files when you run the command again.
--progress
: Show a progress bar with transfer speed for each file transferred. This is in addition to the verbosity of -v.
/mydrive/mydata
: A source directory. Note that including a trailing /, e.g., /mydrive/mydata/, will cause rsync to work on the contents of the directory rather than the directory itself. This is a subtle difference that can lead to confusion on the target copy.
griznog@login.scg.stanford.edu:/labs/ruthm/
: A target location. The trailing slash on a target has no significance.
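The trailing-slash behavior on the source is easiest to see by trying it locally. The sketch below runs entirely in a temporary directory and sends nothing to SCG; the directory and file names are made up for illustration.

```shell
# Local-only demonstration of the trailing slash on an rsync source.
demo=$(mktemp -d)
mkdir -p "$demo/mydata" "$demo/no_slash" "$demo/with_slash"
echo "example" > "$demo/mydata/file.txt"

# No trailing slash: the directory itself is copied into the target,
# giving no_slash/mydata/file.txt.
rsync -rltp "$demo/mydata" "$demo/no_slash/"

# Trailing slash: only the contents of the directory are copied,
# giving with_slash/file.txt.
rsync -rltp "$demo/mydata/" "$demo/with_slash/"

# List the resulting files to compare the two layouts.
find "$demo/no_slash" "$demo/with_slash" -type f
```

The same rule applies when the target is on SCG, so it is worth deciding up front whether you want an extra directory level in the destination.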
Some other interesting and useful rsync options are:
--delete
: Useful when running rsync to update a remote copy where you want to delete any files that have been deleted on the local copy.
--remove-source-files
: In cases where rsync is being used to quickly clean up data, for instance to reduce usage due to quota, this option will remove files once they have been successfully copied rather than having to wait until the entire rsync completes and deleting them manually. It does not remove directories.
--dry-run
: Show what rsync would do, but don’t actually do any copy or removal. Useful to test with --delete or --remove-source-files before running a potentially destructive rsync command.
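As a rough illustration of pairing --dry-run with --delete, the following sketch previews a deletion in a temporary directory before performing it. It is local-only, and the file names are invented for the example.

```shell
# Local-only demonstration: preview a destructive sync, then run it.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
touch "$demo/src/keep.txt" "$demo/dst/keep.txt" "$demo/dst/stale.txt"

# Preview: -v reports what would change, but --dry-run leaves dst untouched.
preview=$(rsync -rltp -v --delete --dry-run "$demo/src/" "$demo/dst/")
echo "$preview"
ls "$demo/dst"    # stale.txt is still present after the dry run

# Real run: files that exist only in dst are now deleted.
rsync -rltp -v --delete "$demo/src/" "$demo/dst/"
ls "$demo/dst"
```

Reading the preview for unexpected "deleting" lines before the real run is a cheap safeguard, especially when the source path or its trailing slash might be wrong.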
sftp¶
sftp provides a secure, encrypted analog to ftp for any remote site where ssh access is available. Example usage:
griznog@lepomis:~$ sftp griznog@login.scg.stanford.edu
Connected to login.scg.stanford.edu.
sftp> ls
Desktop Documents Downloads Logs Music Pictures Projects
Public Scratch Templates Videos Working bin myfile
ondemand rpmbuild
sftp> help
Available commands:
bye Quit sftp
cd path Change remote directory to 'path'
chgrp grp path Change group of file 'path' to 'grp'
chmod mode path Change permissions of file 'path' to 'mode'
chown own path Change owner of file 'path' to 'own'
df [-hi] [path] Display statistics for current directory or
filesystem containing 'path'
exit Quit sftp
get [-afPpRr] remote [local] Download file
reget [-fPpRr] remote [local] Resume download file
reput [-fPpRr] [local] remote Resume upload file
help Display this help text
lcd path Change local directory to 'path'
lls [ls-options [path]] Display local directory listing
lmkdir path Create local directory
ln [-s] oldpath newpath Link remote file (-s for symlink)
lpwd Print local working directory
ls [-1afhlnrSt] [path] Display remote directory listing
lumask umask Set local umask to 'umask'
mkdir path Create remote directory
progress Toggle display of progress meter
put [-afPpRr] local [remote] Upload file
pwd Display remote working directory
quit Quit sftp
rename oldpath newpath Rename remote file
rm path Delete remote file
rmdir path Remove remote directory
symlink oldpath newpath Symlink remote file
version Show SFTP version
!command Execute 'command' in local shell
! Escape to local shell
? Synonym for help
sftp>
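The commands shown in the session above can also be run non-interactively with sftp's -b (batch file) option. The sketch below only writes and displays a batch file; the file name myfile and the /labs/ruthm path are reused from the examples on this page as placeholders. Batch mode cannot prompt for a password, so it is assumed here that some non-interactive form of authentication is available to you, which this page does not confirm.

```shell
# Write the sftp commands to a temporary batch file.
batch=$(mktemp)
cat > "$batch" <<'EOF'
cd /labs/ruthm
put myfile
ls -l myfile
bye
EOF

# Review the batch file before using it.
cat "$batch"

# To run it (not executed here, since it needs network access and login):
#   sftp -b "$batch" griznog@login.scg.stanford.edu
```

sftp stops at the first failing command in a batch file, which makes this approach reasonably safe for simple scripted uploads.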
scp¶
scp provides a secure/encrypted analog to cp which works with remote sources or targets. Example usage:
griznog@lepomis:~$ scp myfile griznog@login.scg.stanford.edu:
myfile 100% 0 0.0KB/s 00:00
Useful options are -r for recursion (to copy directories) and -v for verbose output.