Miscellaneous Tips
This article lists miscellaneous tips for using Linux on Proteus and Picotte.
Linux
Text Editing
We suggest using the nano text editor. Online documentation is available.
Copying large amounts of data from one directory to another within a cluster
A directory named ResearchData in your home directory is to be copied completely to your subdirectory in your research group directory, /ifs/groups/myresearchGrp/myusername:
rsync -avxWHAX --progress ~/ResearchData /ifs/groups/myresearchGrp/myusername
[1]
where:
-a : archive mode (recursive; preserves permissions, timestamps, ownership, symlinks, etc.)
-v : verbose; list files as they are transferred
-x : stay on one file system
-W : copy files whole (without delta-xfer algorithm) [NB do not use this for remote transfers]
-H : preserve hard links (not included with -a)
-A : preserve ACLs/permissions (not included with -a)
-X : preserve extended attributes (not included with -a) [NB do not use between Lustre and NFS, or ext4]
This creates the directory
/ifs/groups/myresearchGrp/myusername/ResearchData
CAUTION: Do not use the -W option when transferring over the network, e.g. from your lab computer to Proteus.
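As a minimal sketch of the copy described above, the following script reproduces the layout in a throwaway directory created with mktemp, so it can be tried without touching real data. The paths under the temporary directory and the file names are hypothetical stand-ins for ~/ResearchData and the group directory. The -n (dry run) pass previews the transfer first. -A and -X are left out here on the assumption that a temporary filesystem may not support ACLs or extended attributes, and the cp -a fallback is only there so the sketch still runs where rsync is not installed.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout standing in for the real directories.
workdir="$(mktemp -d)"
src="$workdir/home/ResearchData"
dest="$workdir/groups/myresearchGrp/myusername"
mkdir -p "$src/run1" "$dest"
echo "result" > "$src/run1/out.txt"

if command -v rsync >/dev/null 2>&1; then
    # Dry run: -n lists what would be copied without writing anything.
    rsync -avxWH -n "$src" "$dest"
    # Real copy. No trailing slash on the source, so the ResearchData
    # directory itself is created inside the destination.
    rsync -avxWH "$src" "$dest"
else
    # Assumption: rsync is missing; cp -a gives the same layout locally.
    cp -a "$src" "$dest"
fi

# Show the resulting tree.
find "$dest" -print
```

Writing the source with a trailing slash (~/ResearchData/) would instead copy the contents of ResearchData directly into the destination, without creating the ResearchData directory there.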
Expanding Archives (tarballs)
The tar command will automatically decide what decompression algorithm to use:
tar -xf source_package.tar.XXX
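To illustrate the automatic detection, here is a small sketch that builds a gzip-compressed tarball in a temporary directory and then extracts it with plain -xf, with no compression flag. The directory and file names are hypothetical, and the sketch assumes a tar that auto-detects compression on extraction, as GNU tar and bsdtar do.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here touches real data.
workdir="$(mktemp -d)"
mkdir -p "$workdir/source_package" "$workdir/extracted"
echo "hello" > "$workdir/source_package/hello.txt"

# Create a gzip-compressed tarball (-z selects gzip when creating).
tar -czf "$workdir/source_package.tar.gz" -C "$workdir" source_package

# List the contents first to see what would be unpacked.
tar -tf "$workdir/source_package.tar.gz"

# Extract: no -z needed, tar detects the compression format itself.
tar -xf "$workdir/source_package.tar.gz" -C "$workdir/extracted"

cat "$workdir/extracted/source_package/hello.txt"
```

Listing with -t before extracting is a cheap way to check whether the archive unpacks into its own subdirectory or directly into the current directory.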
man pages
Linux online documentation is available at the command line with the command "man", short for manual. For instance, to see documentation on the ls command:
[juser@proteusa01 ~]$ man ls
LS(1) User Commands LS(1)
NAME
ls - list directory contents
SYNOPSIS
ls [OPTION]... [FILE]...
DESCRIPTION
List information about the FILEs (the current directory by default). Sort entries alphabetically if none of -cftuvSUX nor --sort.
...
"Man pages" is the usual name for the online documentation. Man pages are divided into various numbered sections. "General commands" are in section 1, e.g. see the ls man page above. "C library functions" are in section 3, etc.[2][3] So, references to specific man pages usually include the section number, e.g. "please see ls(1)". You may optionally use the section number in the man command, as there may be duplicates across different sections:
[juser@proteusa01 ~]$ man 1 ls
Persistent Sessions
You may have persistent terminal sessions by using tmux.[4] These are sessions that allow you to log off, and then log back in again to continue where you left off.
Miscellaneous
Copying/Transferring files to/from a remote location from/to Picotte
- For bulk transfers, use rsync (in short, rsync source destination):[5]
rsync -avz /ifs/groups/myrsrchGrp/data myname@remote.location.com:DataDirectory
- All file transfer connections are via scp, sftp, or rsync. These are all encrypted connections. There are various graphical interfaces for scp and sftp. Please see the OS-specific tips articles for details (Tips for Windows Users and Tips for macOS Users):
- sftp via the OpenSSH feature in Windows 10
- ~~FileZilla~~ DO NOT USE FileZilla
- WinSCP - for Windows.
- On Linux, the SFTP client is typically built into the file browser (Nautilus, Konqueror, Files). In a file browser window, select the "Go" menu, and enter:
sftp://picottelogin.urcf.drexel.edu/home/YOURNAMEHERE
and you should be prompted for login info.
- CyberDuck for Mac OS X.
- For transferring large amounts of data, rsync is the best option, as it allows an interrupted transfer to continue without re-transferring data that has already been copied.
- Mac OS X comes with many command line utilities like Linux, so a
graphical interface is not strictly necessary. scp, sftp,
and rsync are available at the command line using the Terminal
application.
- Please do not use graphical applications such as Fetch.
- For synchronizing directories, use rsync. Alternatively, create a ZIP file or a tar file from the entire structure, before uploading.
- ~~In using rsync with XSEDE sites, HPN-SSH may be used for faster throughput. See documentation on the PSC Data Supercell.~~ HPN-SSH has been merged into OpenSSH, the default SSH implementation on Picotte.
- Please do NOT use "parallel" transfers, or run more than one file transfer at a time.
Compressing Files
- The lbzip2 utility (and the corresponding decompression utility lbunzip2) provides multithreaded compression/decompression for the bzip2 (.bz2) format.
  - Please run such multithreaded compress/decompress processes as jobs, requesting an appropriate number of threads with the shm PE. By default, these utilities will use a number of threads equal to the number of CPU cores installed.
- The parallel gzip is called pigz, handling the gzip (.gz) format.
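Since both utilities default to one thread per installed core, a job script would normally pass the granted thread count explicitly. The sketch below does this with pigz -p and lbzip2 -n. It assumes the scheduler exports the count as NSLOTS (Grid Engine, e.g. with the shm PE) or SLURM_CPUS_PER_TASK (Slurm); which one applies depends on the cluster. The sample file is hypothetical, and the gzip fallback is only there so the sketch runs where pigz is not installed.

```shell
#!/bin/sh
set -eu

# Use the thread count granted by the scheduler, defaulting to 1
# when run outside a job (assumption: one of these variables is set).
threads="${NSLOTS:-${SLURM_CPUS_PER_TASK:-1}}"

# Hypothetical sample file in a scratch directory.
workdir="$(mktemp -d)"
echo "sample data" > "$workdir/data.txt"

if command -v pigz >/dev/null 2>&1; then
    pigz -p "$threads" "$workdir/data.txt"         # creates data.txt.gz
    pigz -d -p "$threads" "$workdir/data.txt.gz"   # restores data.txt
else
    # Fallback for systems without pigz: gzip writes the same format
    # using a single thread.
    gzip "$workdir/data.txt"
    gzip -d "$workdir/data.txt.gz"
fi

if command -v lbzip2 >/dev/null 2>&1; then
    lbzip2 -n "$threads" "$workdir/data.txt"          # creates data.txt.bz2
    lbzip2 -d -n "$threads" "$workdir/data.txt.bz2"   # same as lbunzip2
fi

cat "$workdir/data.txt"
```

Files written by pigz and lbzip2 are ordinary .gz and .bz2 files, so they can be read back with gzip and bzip2 on systems that lack the parallel tools.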
References
[2] StackExchange: What do the numbers in a man page mean?
[3] Wikipedia:Man_page#Manual_sections
[4] A tmux crash course, blog post by Josh Clayton
[5] Digital Ocean Tutorial: How to Use rsync to sync local and remote directories