This app stems from my experience as a consultant. At some point in my career I realized that most people consider tuning Spark for optimal (or even error-free) execution far from trivial or, to put it simply, absolutely boring. As a result, people either do not really know how to configure the cluster/job, or they simply set the variables higher (i.e. demand more resources) and hope their jobs get through.

To be fair, the complexity of Spark does not make it easy on anyone, but I was convinced that there had to be a logical and relatively simple way to set Spark variables.

In fact, people have already tackled the problem, e.g. here and here. I wanted a more general way to calculate the best Spark parameters, and I have described the process in this blog post.
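To give a flavour of what such a calculation looks like, here is a minimal sketch of one widely used sizing heuristic. All of the numbers in it (5 cores per executor, 1 core and 1 GB reserved per node for the OS and daemons, roughly 10% of executor memory left for off-heap overhead) are illustrative assumptions for this example, not the formulas this app actually uses.

```python
def spark_executor_settings(node_cores: int, node_mem_gb: int, num_nodes: int) -> dict:
    """Derive basic executor settings from per-node specs (illustrative heuristic)."""
    usable_cores = node_cores - 1               # reserve 1 core/node for OS and daemons
    cores_per_executor = min(5, usable_cores)   # cap at 5 cores to keep I/O throughput healthy
    executors_per_node = usable_cores // cores_per_executor
    usable_mem_gb = node_mem_gb - 1             # reserve 1 GB/node for the OS
    mem_per_executor_gb = usable_mem_gb // executors_per_node
    heap_gb = int(mem_per_executor_gb * 0.9)    # leave ~10% for off-heap/memory overhead
    total_executors = executors_per_node * num_nodes - 1  # keep one slot for the driver
    return {
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{heap_gb}g",
        "spark.executor.instances": total_executors,
    }

# Example: 4 nodes, each with 16 cores and 64 GB of RAM
print(spark_executor_settings(node_cores=16, node_mem_gb=64, num_nodes=4))
# → {'spark.executor.cores': 5, 'spark.executor.memory': '18g', 'spark.executor.instances': 11}
```

The point is not the specific constants but that, given a handful of cluster facts, the configuration falls out of simple arithmetic rather than guesswork.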

In this little web app, I put the theory into practice. I hope it can be useful to you.