Making Sense of Big Data

How to conduct Independence and K-sample testing with large datasets

Have you ever wondered:

  • Whether two variables are dependent?
  • Whether two samples are from the same distribution?

If so, this post is for you. It explains independence testing and K-sample testing, then introduces the Hyppo library at the end.

Distance Correlation (Dcorr)

The theory was first published in the 2007…

Mingzhi Yu

Data scientist, AI engineer
