# Opinions on Fb. Part #7. Mostly on statistics applications

Posted on: December 10, 2022

“Scientific racism”?

Pseudo-scientific racism (because racism is not scientific) was a way in which European colonial governments, and the statisticians they hired to do government surveys, data collection, and interpretation of the data, justified their racist policies by using statistical measurement methods, often in an extremely biased and incorrect way.

Using scientific terminology and measurements to back their results was a way to soothe their guilty consciences.

An underlying theme of statistics is that subjectivity is everywhere, in almost every step in the interpretation of data.

Since the data (collected measurements) almost never speak for themselves, they have to be interpreted in their surrounding context.

And the context of how the measurements were made is of utmost importance.

What statisticians actually want to know is the distribution function of a set of data, but they often jump ahead and try to fit the data to their wished-for hypothesis or null hypothesis, and so end up misusing the statistical equations.

The p-value of a hypothesis test is the probability of your test statistic taking the observed value or a more extreme one, assuming your null hypothesis is true.

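In mathematical notation, this would be

$$p = P(X \geq x \mid H_0)$$

where X is the test statistic and x is its observed value (written here for an upper-tailed test).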

The vertical line means “given H0”, which means “assuming the null hypothesis is true”. The value x would be, say, a “z-score”, a “t-statistic”, or a “chi-squared statistic”, if those are familiar words from your statistics classes.
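As a rough illustration of that definition, here is a minimal sketch that computes an upper-tailed p-value for a z-score. It assumes the test statistic follows a standard normal distribution under the null hypothesis; the function name `upper_tail_p_value` and the observed value 1.96 are made up for the example.

```python
import math


def upper_tail_p_value(z: float) -> float:
    """P(Z >= z | H0), assuming Z is standard normal under H0."""
    # The upper tail of the standard normal can be written with the
    # complementary error function: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z_observed = 1.96  # hypothetical observed z-score
p_value = upper_tail_p_value(z_observed)
print(f"one-sided p-value: {p_value:.4f}")  # roughly 0.025
```

Note what the number is conditioned on: it is computed entirely under the assumption that the null hypothesis is true.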

One of the most common mistakes is assuming that the p-value tells you the probability of your null hypothesis given the data (the evidence). This is wrong.

The mathematical notation for this would be:

$$P(H_0 \mid \text{data})$$

Because of this, alternatives to the p-value have been proposed. One of these alternatives is called the “Bayes factor”.

The Bayes factor compares how likely the data are under the alternative hypothesis with how likely they are under the null hypothesis.

It basically tells us how much the data (evidence) support the alternative over the null, which can be very useful.

Bayesian probability takes the context into consideration by assigning prior odds to the hypotheses before looking at the data.

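Mathematically, it’s written like this:

$$K = \frac{P(D \mid H_1)}{P(D \mid H_0)} \tag{1}$$

$$\frac{P(H_1 \mid D)}{P(H_0 \mid D)} = K \cdot \frac{P(H_1)}{P(H_0)} \tag{2}$$

Equation 1 is the Bayes factor, written as “K”, where D is the data, H0 is the null hypothesis, and H1 is the alternative. Equation 2 is Bayesian inference, written as odds: the posterior odds equal the Bayes factor times the prior odds.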
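To make the Bayes factor and the odds form of Bayesian inference concrete, here is a small sketch with a coin-flip example. Everything in it is an assumption chosen for illustration: the two point hypotheses (heads probability 0.5 under the null, 0.7 under the alternative), the data (8 heads in 10 flips), the prior odds, and the function names.

```python
import math


def binomial_likelihood(k: int, n: int, p: float) -> float:
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_factor(k: int, n: int, p_alt: float, p_null: float) -> float:
    """K = P(data | alternative) / P(data | null) for two point hypotheses."""
    return binomial_likelihood(k, n, p_alt) / binomial_likelihood(k, n, p_null)


def posterior_odds(k_factor: float, prior_odds: float) -> float:
    """Posterior odds (alternative : null) = Bayes factor * prior odds."""
    return k_factor * prior_odds


# Hypothetical data: 8 heads in 10 flips.
K = bayes_factor(k=8, n=10, p_alt=0.7, p_null=0.5)

# Same data, two different prior contexts.
even_prior = posterior_odds(K, prior_odds=1.0)       # no prior preference
skeptical_prior = posterior_odds(K, prior_odds=0.1)  # alternative thought unlikely

print(f"Bayes factor K: {K:.2f}")
print(f"posterior odds with even prior: {even_prior:.2f}")
print(f"posterior odds with skeptical prior: {skeptical_prior:.2f}")
```

The same data give the same K, but the posterior odds change with the prior odds. That is the sense in which the surrounding context enters the calculation explicitly.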