Vincent Granville Articles and Books: Re-sampling: Amazing Results and Applications

Saturday, May 4, 2019

Re-sampling: Amazing Results and Applications

This crash course features a new fundamental statistics theorem, even more important than the central limit theorem, and a new set of statistical rules and recipes. We discuss concepts related to determining the optimum sample size, the optimum k in k-fold cross-validation, bootstrapping, new re-sampling techniques, simulations, tests of hypotheses, confidence intervals, and statistical inference, using a unified, robust, simple approach with easy formulas, efficient algorithms, and illustrations on complex data.

Little statistical knowledge is required to understand and apply the methodology described here, yet it is more advanced, more general, and more applied than standard literature on the subject. The intended audience is beginners as well as professionals in any field faced with data challenges on a daily basis. This article presents statistical science in a different light, hopefully in a style more accessible, intuitive, and exciting than standard textbooks, and in a compact format yet covering a large chunk of the traditional statistical curriculum and beyond.

In particular, the concept of p-value is not explicitly included in this tutorial. Instead, following the new trend after the recent p-value debacle (addressed by the president of the American Statistical Association), it is replaced with a range of values computed on multiple sub-samples.
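To make the idea of "a range of values computed on multiple sub-samples" concrete, here is a minimal Python sketch. It is not the algorithm from the full article: the function name `subsample_range`, the half-size sub-samples, the 5%/95% cut-offs, and the synthetic Gaussian data are all assumptions chosen for illustration.

```python
import random
import statistics

def subsample_range(data, statistic, n_subsamples=200, subsample_size=None, seed=0):
    """Compute `statistic` on many random sub-samples and return an empirical range.

    Hypothetical helper for illustration only; the defaults are assumptions,
    not values taken from the article.
    """
    rng = random.Random(seed)
    size = subsample_size or max(2, len(data) // 2)
    # Each sub-sample is drawn without replacement from the full data set.
    values = sorted(
        statistic(rng.sample(data, size)) for _ in range(n_subsamples)
    )
    # Report the empirical 5th and 95th percentiles of the statistic.
    low = values[int(0.05 * (n_subsamples - 1))]
    high = values[int(0.95 * (n_subsamples - 1))]
    return low, high

# Synthetic data, assumed here purely to have something to run on.
data_rng = random.Random(42)
data = [data_rng.gauss(10.0, 2.0) for _ in range(1000)]

low, high = subsample_range(data, statistics.mean)
print(f"Mean across sub-samples ranges from {low:.3f} to {high:.3f}")
```

Instead of a single p-value, the output is an interval showing how much the statistic moves from one sub-sample to the next. Any other statistic, such as `statistics.median`, can be passed in place of the mean.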

Our algorithms are suitable for inclusion in black-box systems, batch processing, and automated data science. Our technology is data-driven and model-free. Finally, our approach to this problem shows the contrast between the unified, bottom-up, computationally driven perspective of data science and traditional top-down statistical analysis, which consists of a collection of disparate results and emphasizes theory.

Read the full article here.

Contents

1. Re-sampling and Statistical Inference

Main Result
Sampling with or without Replacement
Illustration
Optimum Sample Size
Optimum K in K-fold Cross-Validation
Confidence Intervals, Tests of Hypotheses

2. Generic, All-purpose Algorithm

Re-sampling Algorithm with Source Code
Alternative Algorithm
Using a Good Random Number Generator

3. Applications

A Challenging Data Set
Results and Excel Spreadsheet
A New Fundamental Statistics Theorem
Some Statistical Magic
How does this work?
Does this contradict entropy principles?

4. Conclusions
