Mean, Standard Deviation and Variance

gsl_stats_mean(data)

This function returns the arithmetic mean of data, a dataset of length \(N\) with stride stride. The arithmetic mean, or sample mean, is denoted by \(\hat{\mu}\) and defined as,

\[\hat{\mu}= {1 \over N} \sum x_i\]

where \(x_i\) are the elements of the dataset data. For samples drawn from a Gaussian distribution the variance of \(\hat{\mu}\) is \(\sigma^2 / N\).
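
As a rough illustration of the formula above, the following plain-Python sketch computes \(\hat{\mu}\) directly from its definition. It does not call GSL, and the helper name `sample_mean` is hypothetical, chosen only for this example.

```python
def sample_mean(data):
    """Arithmetic mean (1/N) * sum(x_i).

    Hypothetical helper written from the formula; not a GSL binding.
    """
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one element")
    return sum(data) / n


# Illustrative dataset (assumed values, not taken from the library docs).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat = sample_mean(data)
print(mu_hat)
```

A direct sum is the simplest reading of the definition; it makes no claim about how the library itself accumulates the result.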

gsl_stats_variance(data)

This function returns the estimated, or sample, variance of data, a dataset of length \(N\). The estimated variance is denoted by \(\hat{\sigma}^2\) and is defined by,

\[{\hat{\sigma}}^2 = {1 \over (N-1)} \sum (x_i - {\hat{\mu}})^2\]

where \(x_i\) are the elements of the dataset data. Note that the normalization factor of \(1/(N-1)\) results from the derivation of \(\hat{\sigma}^2\) as an unbiased estimator of the population variance \(\sigma^2\). For samples drawn from a Gaussian distribution the variance of \(\hat{\sigma}^2\) itself is \(2 \sigma^4 / N\).

This function computes the mean via a call to gsl_stats_mean(). If you have already computed the mean then you can pass it directly to gsl_stats_variance_m().
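
The relationship between the two variance functions can be sketched in plain Python as follows. This is an illustration written from the formula, not a call into GSL; the names `sample_variance` and `sample_variance_m` are hypothetical stand-ins for the behaviour described above.

```python
def sample_variance_m(data, mean):
    """Sample variance about a caller-supplied mean, normalized by N - 1.

    Hypothetical helper written from the formula; not a GSL binding.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two elements for the N - 1 normalization")
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance(data):
    """Sample variance; computes the mean first, then delegates."""
    if len(data) < 2:
        raise ValueError("need at least two elements for the N - 1 normalization")
    return sample_variance_m(data, sum(data) / len(data))


# Illustrative dataset (assumed values).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat = sum(data) / len(data)

var_a = sample_variance(data)            # mean computed internally
var_b = sample_variance_m(data, mu_hat)  # mean computed once and reused
print(var_a, var_b)
```

Passing a precomputed mean avoids a second pass over the data when the mean is needed elsewhere anyway.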

gsl_stats_variance_m(data, mean)

This function returns the sample variance of data relative to the given value of mean. The variance is computed with \(\hat{\mu}\) replaced by the value of mean that you supply,

\[{\hat{\sigma}}^2 = {1 \over (N-1)} \sum (x_i - mean)^2\]

gsl_stats_sd(data)
gsl_stats_sd_m(data, mean)

The standard deviation is defined as the square root of the variance. These functions return the square root of the corresponding variance functions above.
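
A minimal plain-Python sketch of this square-root relationship is given below. It is written from the definition rather than calling GSL, and the names `sample_sd` and `sample_sd_m` are hypothetical.

```python
import math


def sample_sd_m(data, mean):
    """Square root of the sample variance about a caller-supplied mean.

    Hypothetical helper written from the definition; not a GSL binding.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two elements for the N - 1 normalization")
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    return math.sqrt(variance)


def sample_sd(data):
    """Sample standard deviation; computes the mean first, then delegates."""
    if len(data) < 2:
        raise ValueError("need at least two elements for the N - 1 normalization")
    return sample_sd_m(data, sum(data) / len(data))


# Illustrative dataset (assumed values).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sd = sample_sd(data)
print(sd)
```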

gsl_stats_tss(data)
gsl_stats_tss_m(data, mean)

These functions return the total sum of squares (TSS) of data about the mean. For gsl_stats_tss_m() the user-supplied value of mean is used, and for gsl_stats_tss() it is computed using gsl_stats_mean().

\[{\rm TSS} = \sum (x_i - mean)^2\]

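
The TSS formula can be sketched in plain Python as shown here. The helper names `tss` and `tss_m` are hypothetical and the code does not call GSL. Comparing this formula with the sample variance formula shows that the TSS about the sample mean equals \((N-1)\) times the sample variance.

```python
def tss_m(data, mean):
    """Total sum of squares about a caller-supplied mean.

    Hypothetical helper written from the formula; not a GSL binding.
    """
    return sum((x - mean) ** 2 for x in data)


def tss(data):
    """Total sum of squares about the sample mean of the data."""
    if len(data) == 0:
        raise ValueError("data must contain at least one element")
    return tss_m(data, sum(data) / len(data))


# Illustrative dataset (assumed values).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
total = tss(data)

# Dividing by N - 1 gives the sample variance for the same data.
variance_from_tss = total / (len(data) - 1)
print(total, variance_from_tss)
```
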
gsl_stats_variance_with_fixed_mean(data, mean)

This function computes an unbiased estimate of the variance of data when the population mean, mean, of the underlying distribution is known a priori. In this case the estimator for the variance uses the factor \(1/N\) and the sample mean \(\hat{\mu}\) is replaced by the known population mean \(\mu\),

\[{\hat{\sigma}}^2 = {1 \over N} \sum (x_i - \mu)^2\]

gsl_stats_sd_with_fixed_mean(data, mean)

This function calculates the standard deviation of data for a fixed population mean mean. The result is the square root of the corresponding variance function.
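
The fixed-mean estimators can be illustrated with the plain-Python sketch below. It is written from the \(1/N\) formula and does not call GSL; the helper names and the value chosen for the known population mean are assumptions made for this example only.

```python
import math


def variance_with_fixed_mean(data, mean):
    """Variance estimate about a known population mean, normalized by N.

    Hypothetical helper written from the formula; not a GSL binding.
    """
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one element")
    return sum((x - mean) ** 2 for x in data) / n


def sd_with_fixed_mean(data, mean):
    """Square root of the fixed-mean variance estimate."""
    return math.sqrt(variance_with_fixed_mean(data, mean))


# Illustrative dataset and an assumed, a priori known population mean.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
known_mu = 4.5

var_fixed = variance_with_fixed_mean(data, known_mu)
sd_fixed = sd_with_fixed_mean(data, known_mu)
print(var_fixed, sd_fixed)
```

The divisor is \(N\) rather than \(N-1\) because no degree of freedom is spent estimating the mean from the data.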