Saturday, October 26, 2019

More Weird Statistical Distributions

This article presents some original and interesting material, with possible applications in Fintech. You do not need a PhD in math to understand it: I tried to make the presentation as simple as possible, focusing on high-level results rather than technicalities. Yet professional statisticians and mathematicians, even academic researchers, will find some deep and fascinating results worth exploring further.
Can you identify patterns in this chart? (See section 2.2 in the article for an answer.)
Let's start with 
Here the X(k)'s are independent and identically distributed random variables; their common distribution is referred to as that of X. We are trying to find the distribution of Z.
Contents
1. Using a Simple Discrete Distribution for X
2. Towards a Better Model
  • Approximate Solution
  • The Fractal, Brownian-like Error Term
3. Finding X and Z Using Characteristic Functions
  • Test with Log-normal Distribution for X
  • Playing with the Characteristic Functions
  • Generalization to Continued Fractions and Nested Cubic Roots
4. Exercises
Read this article here

Wednesday, October 2, 2019

Surprising Uses of Synthetic Random Data Sets

I have used synthetic data sets many times for simulation purposes, most recently in my articles Six degrees of Separations between any two Datasets and How to Lie with p-values. Many applications (including the data sets themselves) can be found in my books Applied Stochastic Processes and New Foundations of Statistical Science. For instance, these data sets can be used to benchmark statistical hypothesis tests (with the null hypothesis known in advance to be true or false) and to assess the power of such tests or the coverage of confidence intervals. In other cases, they are used to simulate clusters and to test cluster detection / pattern detection algorithms, see here. I also used such data sets to discover two new deep conjectures in number theory (see here), to design new Fintech models such as bounded Brownian motions, and to find new families of statistical distributions (see here).
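To make the benchmarking idea concrete, here is a minimal sketch in Python. It is not taken from the articles or books mentioned above: the choice of a two-sided z-test, the sample size, the effect size 0.5, and the function names are all assumptions made for illustration. Because the data are synthetic, we know whether the null hypothesis (mean = 0) is true, so the rejection rate estimates the type I error in one case and the power in the other.

```python
import math
import random

def z_test_rejects(sample, sigma=1.0, z_crit=1.959964):
    """Two-sided z-test of H0: mean = 0 with known sigma, at the 5% level."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return abs(z) > z_crit

def rejection_rate(true_mean, n=30, trials=20000, seed=0):
    """Fraction of synthetic Gaussian samples for which the test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_rejects(sample):
            rejections += 1
    return rejections / trials

# H0 is true by construction: the rejection rate estimates the type I error.
type_one_error = rejection_rate(true_mean=0.0)
# H0 is false by construction (illustrative effect size 0.5): this estimates power.
power = rejection_rate(true_mean=0.5, seed=1)

print(f"estimated type I error: {type_one_error:.3f}")
print(f"estimated power:        {power:.3f}")
```

With a correctly calibrated test, the first number should be close to the nominal 5% level; a large deviation would point to a problem with the test or with the simulation.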
Goldbach's comet 
In this article, I focus on peculiar random data sets to prove -- heuristically -- two of the most famous math conjectures in number theory, related to prime numbers: the Twin Prime conjecture, and the Goldbach conjecture. The methodology is at the intersection of probability theory, experimental math, and probabilistic number theory. It involves working with infinite data sets, dwarfing any data set found in any business context.
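For readers unfamiliar with Goldbach's comet: it is the plot, for each even integer n, of the number of ways n can be written as a sum of two primes. The short Python sketch below computes these counts exactly; it only illustrates the quantity shown in the chart and is not the heuristic, random-data-set methodology of the article. The function names and the upper limit of 1000 are arbitrary choices for this example.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: returns a table with table[k] == 1 iff k is prime."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve

def goldbach_count(n, is_prime):
    """Number of unordered pairs of primes (p, q), p <= q, with p + q == n."""
    return sum(1 for p in range(2, n // 2 + 1) if is_prime[p] and is_prime[n - p])

limit = 1000  # arbitrary upper bound for this illustration
is_prime = primes_up_to(limit)
comet = {n: goldbach_count(n, is_prime) for n in range(4, limit + 1, 2)}

for n in (10, 100, 1000):
    print(n, comet[n])
```

Plotting the values of `comet` against n gives the comet-shaped scatter. The Goldbach conjecture states that every count is at least 1; in this range that can be checked directly with `min(comet.values()) >= 1`.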
Read full article here.

Fuzzy Regression: A Generic, Model-free, Math-free Machine Learning Technique

  A different way to do regression with prediction intervals. In Python and without math. No calculus, no matrix algebra, no statistical eng...