Stata
Stata is data analysis and statistical software.
Installed Versions
Stata 19 and Stata 18 are installed. We have a ten-seat, 48-core license (48 cores because each Picotte node has 48 cores).
You can use Stata by running:
module load stata/mp48/19
or
module load stata/mp48/18
depending on which version you want to use.
Documentation
PDF documentation is available on Picotte in the directory ${STATADIR}/docs/. The STATADIR variable is available once the modulefile is loaded.
You can also use Stata's online documentation.
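As a minimal sketch, the PDF manuals can be listed from the shell after loading the module. The function name list_stata_docs is made up for this illustration, and the layout under ${STATADIR}/docs/ is an assumption, so the search is recursive rather than tied to a particular structure.

```shell
#!/bin/bash
# Sketch: list the Stata PDF manuals installed with the loaded module.
# Assumption: the modulefile exports STATADIR, as described above.
# list_stata_docs is a hypothetical helper name, not part of Stata or Picotte.
list_stata_docs() {
    if [ -z "${STATADIR:-}" ]; then
        echo "STATADIR is not set; load a stata module first" >&2
        return 1
    fi
    # search recursively, since the exact directory layout is not assumed
    find "${STATADIR}/docs" -name '*.pdf' | sort
}

# do not abort the shell if no module is loaded
list_stata_docs || true
```

The PDFs can then be copied to your own computer for reading, since the cluster itself is not a convenient place to view them.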
Personal Setup
Your own ADO files go in ~/ado/personal/.
Running
Command Line Version
To run the command line version:
[juser@picotte001 ~]$ module load stata/mp48/19
[juser@picotte001 ~]$ stata
___ ____ ____ ____ ____ ®
/__ / ____/ / ____/ 18.0
___/ / /___/ / /___/ BE—Basic Edition
Statistics and Data Science Copyright 1985-2023 StataCorp LLC
StataCorp
4905 Lakeway Drive
College Station, Texas 77845 USA
800-STATA-PC https://www.stata.com
979-696-4600 stata@stata.com
Stata license: 10-user network, expiring 24 Feb 2026
Serial number: 501809306532
Licensed to: Drexel University
Philadelphia
Notes:
1. Unicode is supported; see help unicode_advice.
.
Graphical User Interface (GUI) Version
To run the GUI version, you must have an X11 server installed on your computer.
The command to use is xstata.
Note that if you do this, you will be running Stata on the login node, which is a shared resource where multiple people may be logged in and doing work simultaneously.
Submitting Jobs
For long-running computations, write a job script and submit it as a job on the cluster.
For more detail, see Writing Slurm Job Scripts.
Requesting License
Each job should request one license. The license on Picotte is limited to no more than 10 simultaneous uses.
#SBATCH --licenses=stata48:1
Stata/MP
Stata/MP, which runs multithreaded, is also available. It is provided by the command stata-mp. There are two editions of Stata/MP on Picotte: one licensed for 4 cores (unlimited number of seats), and one licensed for 48 cores (limit of ten seats).
N.B. more cores (threads) do not guarantee better performance. Frequently, the reverse is true.
NOTE: you must use "set processors NN" in your .do file, with NN set to the exact number of slots requested by the job. The number of slots (each slot is one processor core) is requested by the lines:
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=48
#SBATCH --licenses=stata48:1
module load stata/mp48/19
In this example, the number of CPU cores is 48.
You may also read the environment variable SLURM_CPUS_PER_TASK to use in the "set processors NN" command in your Stata .do file:
local p : env SLURM_CPUS_PER_TASK
set processors `p'
SLURM_CPUS_PER_TASK is set by Slurm, the job scheduler, to the value requested by "--cpus-per-task".
NOTE ON PERFORMANCE: More cores do not necessarily mean faster results. Some functions/routines may be parallelizable, others may not be. You will need to benchmark your specific computation to find the optimal number of CPU cores to use.
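One way to run such a benchmark is to submit the same job at several core counts and compare the timings. The following is a minimal sketch, not a site-provided tool: the function name submit_sweep, the DRY_RUN switch, and the job names are made up for this illustration, and teststata.sh stands for your own job script. It relies on options given on the sbatch command line taking precedence over the #SBATCH lines inside the script.

```shell
#!/bin/bash
# Sketch: submit one job script at several core counts to compare run times.
# Assumptions: submit_sweep and DRY_RUN are hypothetical names for this example;
# teststata.sh is a job script whose .do file reads SLURM_CPUS_PER_TASK.

# With DRY_RUN=1 (the default here) only print the sbatch commands;
# with DRY_RUN=0 actually submit them.
submit_sweep() {
    local script="$1"; shift
    local n
    for n in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sbatch --cpus-per-task=${n} --job-name=stata_${n}c ${script}"
        else
            sbatch --cpus-per-task="${n}" --job-name="stata_${n}c" "${script}"
        fi
    done
}

submit_sweep teststata.sh 1 2 4 8
```

Each submitted job uses one license seat while it runs. Because every run of the same .do file writes the same log file, submit the jobs from separate directories or one at a time so the timings are not overwritten.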
Example Job for Picotte (Slurm)
This is the Stata script to be run, named testing.do:
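// test computation - testing.do
clear *
set rmsg on
set obs 100000
// match the Stata/MP core count to the Slurm allocation
local p : env SLURM_CPUS_PER_TASK
set processors `p'
// five uniform random predictors, i1 to i5
forval n = 1/5 {
    g i`n' = runiform()
}
// binary outcome with success probability 0.3
g dv = rbinomial(1,.3)
memory
qui logit dv i*
qui xtmixed dv i*
* with bootstrap:
qui bs, reps(2000): logit dv i*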
This is the job script, named teststata.sh:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=128G
#SBATCH --account=urcfadmprj
#SBATCH --time=4:00:00
#SBATCH --licenses=stata48:1
module load stata/mp48/19
# set the Stata temporary directory to the job-specific temporary directory
# this directory will be automatically deleted at the end of the job
export STATATMP=$TMP
stata-mp -b do testing.do
To submit the job:
[juser@picotte001]$ sbatch teststata.sh
N.B. you may see the following warning in the Slurm output file; it can be safely ignored:
stata-mp: /lib64/libtinfo.so.5: no version information available (required by stata-mp)
Outputs
Stata, by default, produces a log file named after the .do file. So, running the Stata DO script something.do produces the log something.log.
If the same DO script is run multiple times, later runs will overwrite the log from earlier runs.
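To keep the logs from earlier runs, the job script can rename the log once Stata finishes. This is a minimal sketch under stated assumptions: keep_log is a made-up helper name, the .do file is assumed to be in the job's working directory, and the Slurm job ID is used as the distinguishing label.

```shell
#!/bin/bash
# Sketch: give a batch-mode log a per-job name so later runs do not overwrite it.
# Assumption: this runs in the job script right after "stata-mp -b do testing.do".
# keep_log is a hypothetical helper name for this example.

# Rename <name>.log to <name>.<jobid>.log and print the new name.
# The label falls back to "manual" when SLURM_JOB_ID is not set,
# for example when run outside a Slurm job.
keep_log() {
    local dofile="$1"
    local base="${dofile%.do}"
    local id="${SLURM_JOB_ID:-manual}"
    if [ -f "${base}.log" ]; then
        mv "${base}.log" "${base}.${id}.log"
        echo "${base}.${id}.log"
    fi
}

keep_log testing.do
```

With this in place, a job with ID 12345 would leave testing.12345.log instead of testing.log.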
See Also
- Stata Support and Online Resources
- If you are interested in converting from Stata to R, see http://dss.princeton.edu/training/ and in particular the PDF http://dss.princeton.edu/training/RStata.pdf
- R vs. Stata benchmark