Computes and prints summary descriptive statistics for each variable in a list or data frame, including counts, numeric summaries (min, quartiles, mean, max, standard deviation), and factor summaries (levels and frequencies).
Usage
stat_desc(
  data,
  transpose = FALSE,
  pad = 2,
  opts = list(digits = 4, scipen = 8)
)
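The opts default in the signature above supplies options() values that apply only while the report is produced. The following is a minimal base R sketch of that save-and-restore pattern, not the package's source; the helper name with_opts_demo is hypothetical.

```r
# Hypothetical helper: apply temporary options() values and restore the
# caller's settings when the function exits. Not part of the package.
with_opts_demo <- function(x, opts = list(digits = 4, scipen = 8)) {
  old <- options(opts)               # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore them however the function exits
  format(x)                          # formatting honours the temporary digits
}

before <- getOption("digits")
with_opts_demo(pi)                       # formatted with 4 significant digits
identical(getOption("digits"), before)   # the caller's setting is unchanged
```

Registering the restore with on.exit() means the previous values come back even if formatting fails partway through.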
Arguments
- data
A vector, list, or data frame containing the variables to summarize. A vector is treated as a single-variable data frame. Unnamed variables receive generic names such as V1, V2, etc.
- transpose
A logical controlling the report layout. By default, the summary printed and returned shows variables in columns and their statistics in rows. Setting transpose = TRUE generates a transposed report with variables in rows and statistics in columns.
- pad
A positive integer giving the number of spaces between output columns.
- opts
An optional list of key = value pairs passed to options() to control output formatting. Existing values are restored on exit.
Details
For each variable in data, the function computes the counts of non-missing and missing values. Numeric variables are summarized by minimum, first quartile, median, mean, third quartile, maximum, and standard deviation. Factor and character variables are summarized by level frequencies. Results are formatted in a table and printed. The function returns a list containing:
- cnt: Counts for each variable (n.val, n.na)
- num: Numeric summaries for numeric variables
- fctr: Summaries for factor variables
- rpt: The formatted summary table (printed)
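The per-variable statistics described above can be approximated in base R. The sketch below is illustrative only and is not the package's code: the helper name describe_one is hypothetical, and it assumes quantile()'s default method (type 7), so its quartiles may differ from the values stat_desc prints.

```r
# Hypothetical sketch of the per-variable statistics, using base R only.
describe_one <- function(x) {
  # Counts of non-missing and missing values
  cnt <- c(n.val = sum(!is.na(x)), n.na = sum(is.na(x)))
  if (is.numeric(x)) {
    # Assumption: default quantile type; the package may use another method
    q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
    stats <- c(min = min(x, na.rm = TRUE), Q1 = q[1], median = q[2],
               mean = mean(x, na.rm = TRUE), Q3 = q[3],
               max = max(x, na.rm = TRUE), std.dev = sd(x, na.rm = TRUE))
  } else {
    # Factor and character variables: level frequencies, most common first
    stats <- sort(table(x), decreasing = TRUE)
  }
  list(cnt = cnt, stats = stats)
}

res <- lapply(mtcars[c("mpg", "cyl")], describe_one)
res$mpg$cnt
res$mpg$stats
```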
Examples
stat_desc(mtcars)
#>            mpg    cyl   disp     hp    drat      wt   qsec      vs      am
#> n.val       32     32     32     32      32      32     32      32      32
#> n.na         0      0      0      0       0       0      0       0       0
#> min       10.4      4   71.1     52    2.76   1.513   14.5       0       0
#> Q1        15.2      4  120.7     95    3.08   2.542  16.88      NA      NA
#> median    19.2      6  196.3    123   3.695   3.325  17.71       0       0
#> mean     20.09  6.188  230.7  146.7   3.597   3.217  17.85  0.4375  0.4062
#> Q3        22.8      8    334    180    3.92    3.65   18.9       1       1
#> max       33.9      8    472    335    4.93   5.424   22.9       1       1
#> std.dev  6.027  1.786  123.9  68.56  0.5347  0.9785  1.787   0.504   0.499
#>            gear   carb
#> n.val        32     32
#> n.na          0      0
#> min           3      1
#> Q1            3      1
#> median        4      2
#> mean      3.688  2.812
#> Q3            5      4
#> max           5      8
#> std.dev  0.7378  1.615
stat_desc(data.frame(a = rnorm(100), b = sample(letters[1:3], 100, TRUE)))
#>                 a         b
#> n.val         100       100
#> n.na            0         0
#> min        -1.914  n.lvls=4
#> Q1        -0.6432     a :39
#> median    -0.1679     c :35
#> mean     0.009043     b :26
#> Q3         0.6221
#> max         2.308
#> std.dev    0.9815