Overview ======== High Performance Analytics Toolkit (HPAT) is a big data analytics and machine learning framework that provides Python's ease of use but is extremely fast. HPAT scales analytics programs in python to cluster/cloud environments automatically, requiring only minimal code changes. Here is a logistic regression program using HPAT:: @hpat.jit def logistic_regression(iterations): f = h5py.File("lr.hdf5", "r") X = f['points'][:] Y = f['responses'][:] D = X.shape[1] w = np.random.ranf(D) t1 = time.time() for i in range(iterations): z = ((1.0 / (1.0 + np.exp(-Y * np.dot(X, w))) - 1.0) * Y) w -= np.dot(z, X) return w This code runs on cluster and cloud environments using a simple command like `mpiexec -n 1024 python logistic_regression.py`. HPAT compiles a :ref:`subset of Python ` to efficient native parallel code (with `MPI `_). This is in contrast to other frameworks such as Apache Spark which are master-executor libraries. Hence, HPAT is typically 100x or more faster. HPAT is built on top of `Numba `_ and `LLVM `_ compilers.