Probabilistic approaches for rigorous and efficient analysis of statistical properties of large datasets