# Overview of Jackknife Method

#### Hincal Topcuoglu, Emre Ridvan Muratlar

#### May 12, 2019

# General Definition:

The jackknife method is applied to reduce the bias of an estimator and to check whether the estimator of interest is an unbiased estimator of the corresponding population parameter.

For example, suppose you have a sample from a dataset and you wish to estimate the population mean, and suppose that the population mean is also known. With the jackknife method you take groups of size n-1 (each time leaving one observation out), calculate the mean of each group, and then take the mean of all these group means. If you subtract this overall mean from the population mean, you obtain the bias of your estimate. If the difference is zero, you have an unbiased estimator that gives true information about your population.

If it is not zero, you apply the bias-correction formula to obtain an unbiased estimator with respect to your population.

It is also applicable to variance or skewness estimation.

# General Formulas of the Method:
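In standard notation, with $\hat{\theta}$ the estimate computed from all $n$ observations and $\hat{\theta}_{(i)}$ the estimate computed with the $i$-th observation left out, the jackknife quantities described above can be written as:

$$\hat{\theta}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}$$

$$\widehat{\mathrm{bias}}_{\mathrm{jack}} = (n-1)\left(\hat{\theta}_{(\cdot)} - \hat{\theta}\right)$$

$$\hat{\theta}_{\mathrm{jack}} = \hat{\theta} - \widehat{\mathrm{bias}}_{\mathrm{jack}} = n\hat{\theta} - (n-1)\hat{\theta}_{(\cdot)}$$

$$\widehat{\mathrm{var}}_{\mathrm{jack}} = \frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)}\right)^2$$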

# There is also the Wikipedia Definition:

The jackknife method is a resampling method, useful for variance and bias estimation. The estimate is obtained by leaving out each observation from the dataset in turn, calculating the estimate on the remaining data, and then averaging these calculations. So the jackknife estimate is found by aggregating the estimates from each (n-1)-sized subsample.

The jackknife technique was developed by Maurice Quenouille (1924-1973) from 1949, and refined in 1956. John Tukey expanded on the technique in 1958. The jackknife is a linear approximation of the bootstrap [1].

# Historical Work on the Jackknife Method:

Quenouille (1949), while investigating random, stratified random, and systematic sampling with W.G. Cochran's [2] work as a starting point, focused on Yates's (1946) problem about the difficulties attached to the estimation of error for a systematic sample.

Quenouille says in his paper that for random, stratified random, and systematic sampling, if n is large enough, there should be a method that splits the data into several equal parts, obtains an estimate of the error of the mean of each part, and combines these to obtain a more accurate estimate of the error of the overall mean. For stratified random sampling, he says that we may combine our estimates of error from each stratum. This leads to the commonly used method of taking q randomly chosen elements per stratum and combining the sets of variances with q-1 degrees of freedom to form an estimate of error. If we make our samples exclusive (no two elements coincide), this variance gives the estimated variance of the sample mean.

In particular, he gives a formulation using stratified random sampling to estimate the error: each time an element of the finite sampled variable is held out, the mean, variance, or correlation is calculated for each stratum, and finally these are combined to reach the final estimated error [3].

In his 1956 paper [4], Quenouille continues the investigation of bias-correction approaches presented earlier, and finds that taking n-1 observations from the dataset each time, computing the estimate on each such subsample, and then averaging all of these estimates gives more accurate results for bias correction.

Using Weatherburn's (1952) and Kendall's (1943) gamma calculation for the standard-error technique, he replaced this calculation with averages over n-1 observations instead of the values themselves, thereby founding the base formula of the jackknife method, especially for variance and standard-error calculation. He also experimented with datasets from different distributions and found nearly the same results [4].

Tukey in his paper (1958) confirmed that this method reduces bias, but he also noted that the method is appropriate in many other situations (e.g. non-linear estimators of independent and identically distributed random variables), and that it can be used to find approximate confidence limits on the desired estimate. He coined the name jackknife [5].
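Tukey's confidence-limit idea can be sketched with pseudo-values $ps_i = n\hat{\theta} - (n-1)\hat{\theta}_{(i)}$. The function below is an illustrative sketch of that idea, not code from Tukey's paper; for the sample mean the pseudo-values reduce to the observations themselves, so the interval coincides with the classical t-interval:

```r
# jackknife confidence limits via Tukey's pseudo-values (illustrative sketch)
jackknife_ci <- function(x, theta = mean, level = 0.95) {
  n <- length(x)
  theta_hat <- theta(x)
  # leave-one-out estimates
  theta_loo <- sapply(seq_len(n), function(i) theta(x[-i]))
  # Tukey's pseudo-values
  ps <- n * theta_hat - (n - 1) * theta_loo
  se <- sd(ps) / sqrt(n)                   # standard error of the pseudo-values
  half <- qt(1 - (1 - level) / 2, df = n - 1) * se
  c(lower = mean(ps) - half, upper = mean(ps) + half)
}

set.seed(1)
x <- rnorm(30, mean = 5)
jackknife_ci(x)  # for the mean, this equals the classical t-interval
```

The t quantile on n-1 degrees of freedom treats the pseudo-values as approximately independent observations, which is Tukey's heuristic.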

In his paper, Miller (1974) made a general review of the jackknife method, covering the work done to that date, and described the different cases in which the jackknife can be applied [6].

Efron (1980, 1981) discussed the collection of ideas concerning the non-parametric estimation of bias, variance, and measures of error, and gave good examples and explanations in his papers [7], [8].

Avery McIntosh also gives a good explanation of the method in his note [9].

# Let's look at an example

Using the formulation and data from the book by Tattar et al. [10], let's check whether the sample mean is an unbiased estimator of the population mean, under the assumption that the population mean is known:

```
library(ACSWR)
library(psych)
# attach data
data(nerve)
# check general statistics of data
des<-describe(nerve)
des[14]<-var(nerve)
colnames(des)[colnames(des)=="V14"] <- "Variance"
print(t(des),digits=8)
```

```
## X1
## vars 1.0000000000
## n 799.0000000000
## mean 0.2185732165
## sd 0.2091885907
## median 0.1500000000
## trimmed 0.1837597504
## mad 0.1482600000
## min 0.0100000000
## max 1.3800000000
## range 1.3700000000
## skew 1.7579431046
## kurtosis 3.8976149190
## se 0.0074005603
## Variance 0.0437598665
```

```
# calculate the jackknife estimate of the mean
mean_of_data <- mean(nerve)
mean_of_each_sample <- NULL
for(i in 1:length(nerve))
{
  mean_of_each_sample[i] <- mean(nerve[-i]) # leave-one-out mean
}
estimated_mean <- mean(mean_of_each_sample)
estimated_bias <- (length(nerve)-1)*(estimated_mean-mean_of_data)
bias_corrected_jk_estimate <- mean_of_data - estimated_bias
# compare with a numerical tolerance instead of exact floating-point equality
if(isTRUE(all.equal(bias_corrected_jk_estimate, mean_of_data)))
{sprintf("Sample mean is the unbiased estimator of population mean.
Mean of data is %f and Bias Corrected JK Estimation of Mean is %f",
mean_of_data, bias_corrected_jk_estimate)}
```

`## [1] "Sample mean is the unbiased estimator of population mean.\nMean of data is 0.218573 and Bias Corrected JK Estimation of Mean is 0.218573"`
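For the mean, the jackknife variance formula reproduces the classical standard error `sd(x)/sqrt(n)` exactly (the `se` of 0.0074 in the summary above). A self-contained sketch follows, using a synthetic vector instead of `nerve` so it runs without the `ACSWR` package; the function name is our own:

```r
# jackknife standard error of an estimator (here, the mean)
jackknife_se <- function(x, theta = mean) {
  n <- length(x)
  # leave-one-out estimates
  theta_loo <- sapply(seq_len(n), function(i) theta(x[-i]))
  theta_bar <- mean(theta_loo)
  # jackknife variance formula: (n-1)/n * sum of squared deviations
  sqrt((n - 1) / n * sum((theta_loo - theta_bar)^2))
}

set.seed(42)
x <- rexp(100, rate = 1 / 0.22)  # synthetic positive data
jackknife_se(x)                  # identical to sd(x)/sqrt(length(x)) for the mean
```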

```
# calculate the variance with the jackknife
variance_of_data <- var(nerve)
mean_of_each_sample <- NULL
variance_of_each_sample <- NULL
bias_in_variance_of_es <- NULL
for(i in 1:length(nerve))
{
  mean_of_each_sample[i] <- mean(nerve[-i])
  # variance of each leave-one-out sample
  variance_of_each_sample[i] <- sum((nerve[-i]-mean_of_each_sample[i])^2)/(length(nerve)-1)
  # bias in variance for each sample
  bias_in_variance_of_es[i] <- variance_of_each_sample[i]-variance_of_data
}
bias_in_variance <- mean(bias_in_variance_of_es)              # overall bias in variance
estimated_jk_variance <- variance_of_data - bias_in_variance  # JK corrected variance
sprintf("Variance of Data is %f and Jk_Corrected Variance is %f", variance_of_data, estimated_jk_variance)
```

`## [1] "Variance of Data is 0.043760 and Jk_Corrected Variance is 0.043815"`
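The overview above mentioned that the jackknife also applies to skewness. A minimal sketch with a synthetic sample follows; the moment-based `skew` function and the `jackknife_correct` helper are our own illustrative definitions, not from the post:

```r
# moment-based skewness estimator (our own definition, not psych::skew)
skew <- function(x) {
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  m3 / m2^1.5
}

# generic jackknife bias correction for any estimator theta
jackknife_correct <- function(x, theta) {
  n <- length(x)
  theta_hat <- theta(x)
  theta_loo <- sapply(seq_len(n), function(i) theta(x[-i]))  # leave-one-out estimates
  bias <- (n - 1) * (mean(theta_loo) - theta_hat)            # jackknife bias estimate
  c(estimate = theta_hat, bias = bias, corrected = theta_hat - bias)
}

set.seed(7)
x <- rexp(200)  # a right-skewed sample
jackknife_correct(x, skew)
```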

# References:

[1]- Wikipedia - Jackknife Resampling - http://www.wikizero.biz/index.php?q=aHR0cHM6Ly9lbi53aWtpcGVkaWEub3JnL3dpa2kvSmFja2tuaWZlX3Jlc2FtcGxpbmc

[2]- W.G. Cochran - The Relative accuracy of systematic and stratified random samples for a certain class of populations, 1946

[3]- Quenouille - Problems in Plane Sampling, 1949

[4]- Quenouille - Notes on Bias Estimation, 1956

[5]- Tukey - Bias and Confidence in Not-quite Large Samples, 1958

[6]- Rupert G. Miller - The jackknife-a review, 1974

[7]- Bradley Efron - The Jackknife, the Bootstrap and Other Resampling Plans, 1980

[8]- Efron and Stein - The Jackknife Estimate of Variance, 1981

[9]- Avery McIntosh - The Jackknife Estimation Method

[10]- Tattar, Ramaiah, Manjunath - A Course in Statistics with R
