Computes and prints summary descriptive statistics for each variable in a list or data frame, including counts, numeric summaries (min, quartiles, mean, max, standard deviation), and factor summaries (levels and frequencies).

Usage

stat_desc(
  data,
  transpose = FALSE,
  pad = 2,
  opts = list(digits = 4, scipen = 8)
)

Arguments

data

A vector, list, or data frame containing the variables to summarize. A vector is treated as a single-variable data frame. Unnamed variables receive generic names such as V1, V2, etc.

transpose

A logical value controlling the report layout. By default, the summary that is printed and returned shows variables in columns and statistics in rows. Setting transpose = TRUE produces a transposed report with variables in rows and statistics in columns.
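The two layouts can be pictured with a small base R sketch. It does not call stat_desc; it builds a hypothetical summary matrix named rpt from two mtcars columns and flips it with t(), which is only an assumption about how a transposed report relates to the default one.

```r
# Hypothetical stand-in for the report: statistics in rows, variables in columns.
rpt <- sapply(mtcars[c("mpg", "hp")], function(x) {
  c(min = min(x), mean = mean(x), max = max(x))
})

# Default-style layout: one column per variable.
print(rpt)

# Transposed layout: one row per variable, one column per statistic.
rpt_t <- t(rpt)
print(rpt_t)
```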

pad

A positive integer giving the number of spaces between output columns.

opts

An optional named list (key = value pairs) of "options" values to apply while producing the output. Existing values are restored on exit.
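The "restored on exit" behaviour matches a common base R idiom: options() returns the previous values, and on.exit() puts them back when the function returns. The helper below, with_opts, is a hypothetical name used for illustration only; the package's own implementation may differ.

```r
# Hypothetical helper: evaluate an expression with temporary options.
with_opts <- function(opts, expr) {
  old <- options(opts)                # set new values, keep the previous ones
  on.exit(options(old), add = TRUE)   # restore them when the function exits
  force(expr)                         # evaluate while the temporary options apply
}

digits_before <- getOption("digits")
formatted <- with_opts(list(digits = 4, scipen = 8), format(pi))
digits_after <- getOption("digits")
```

After the call, getOption("digits") holds whatever value it had before, while formatted was produced with digits = 4.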

Value

Invisibly returns a list of summary tables covering counts, numeric variables, and factor variables.

Details

For each variable in data, the function computes the count of non-missing and missing values. Numeric variables are summarized by minimum, first quartile, median, mean, third quartile, maximum, and standard deviation. Factor and character variables are summarized by level frequencies. Results are formatted in a table and printed. The function returns a list containing:

  • cnt: Counts for each variable (n.val, n.na)

  • num: Numeric summaries for numeric variables

  • fctr: Summaries for factor variables

  • rpt: The formatted summary table (printed)
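As a rough illustration of the statistics described above, the following base R sketch computes the same kinds of values for a single variable. The function names summarize_numeric and summarize_factor are hypothetical, and the quartile method is an assumption (the default of quantile()); the method stat_desc actually uses is not stated here, so its printed quartiles may differ.

```r
# Hypothetical sketch: counts and numeric summary for one numeric vector.
summarize_numeric <- function(x) {
  # Assumption: default quantile() method; stat_desc may use another one.
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  c(
    n.val   = sum(!is.na(x)),            # non-missing values
    n.na    = sum(is.na(x)),             # missing values
    min     = min(x, na.rm = TRUE),
    Q1      = q[1],
    median  = median(x, na.rm = TRUE),
    mean    = mean(x, na.rm = TRUE),
    Q3      = q[2],
    max     = max(x, na.rm = TRUE),
    std.dev = sd(x, na.rm = TRUE)
  )
}

# Hypothetical sketch: level count and frequencies for a factor or character vector.
summarize_factor <- function(x) {
  x <- as.factor(x)
  list(
    n.lvls = nlevels(x),
    freq   = sort(table(x), decreasing = TRUE)  # most frequent level first
  )
}

num_summary <- summarize_numeric(c(1, 2, 3, 4, NA))
fctr_summary <- summarize_factor(c("a", "b", "a"))
```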

Examples

stat_desc(mtcars)
#>            mpg    cyl   disp     hp    drat      wt   qsec      vs      am
#> n.val       32     32     32     32      32      32     32      32      32
#> n.na         0      0      0      0       0       0      0       0       0
#> min       10.4      4   71.1     52    2.76   1.513   14.5       0       0
#> Q1        15.2      4  120.7     95    3.08   2.542  16.88      NA      NA
#> median    19.2      6  196.3    123   3.695   3.325  17.71       0       0
#> mean     20.09  6.188  230.7  146.7   3.597   3.217  17.85  0.4375  0.4062
#> Q3        22.8      8    334    180    3.92    3.65   18.9       1       1
#> max       33.9      8    472    335    4.93   5.424   22.9       1       1
#> std.dev  6.027  1.786  123.9  68.56  0.5347  0.9785  1.787   0.504   0.499
#>            gear   carb
#> n.val        32     32
#> n.na          0      0
#> min           3      1
#> Q1            3      1
#> median        4      2
#> mean      3.688  2.812
#> Q3            5      4
#> max           5      8
#> std.dev  0.7378  1.615
stat_desc(data.frame(a = rnorm(100), b = sample(letters[1:3], 100, TRUE)))
#>                 a         b
#> n.val         100       100
#> n.na            0         0
#> min        -1.914  n.lvls=4
#> Q1        -0.6432  a    :39
#> median    -0.1679  c    :35
#> mean     0.009043  b    :26
#> Q3         0.6221          
#> max         2.308          
#> std.dev    0.9815