In the scientific world, randomness is a term with a slightly negative connotation. In everyday language, doing things “randomly” suggests improvisation, patching things together, and not following a plan or logical reasoning. Imagine an analyst or scientist telling you they solved a problem using a random approach: you probably wouldn’t feel entirely reassured.
But in the world of statistics, things are quite different. Let’s be clear right away: statistics can, in a way, be defined as the science of randomness, especially its most famous branch, probability theory. However, this article isn’t about that thorny topic (raise your hand if you flinched reading “probability theory”), but rather about something slightly more intriguing: machine learning, the branch of statistics that analyzes large datasets to make predictions about an uncertain outcome (the dependent variable) based on a set of predictors (also called independent variables).