A parameter that you choose which determines how the algorithm will perform.
In the case of machine learning in particular, it is not part of the training data set.
Hyperparameters can also be considered in domains outside of machine learning however, e.g. the step size in partial differential equation solver is entirely independent from the problem itself and could be considered a hyperparamter. One difference from machine learning however is that step size hyperparameters in numerical analysis are clearly better if smaller at a higher computational cost. In machine learning however, there is often an optimum somewhere, beyond which overfitting becomes excessive.
Philosophically, machine learning can be seen as a form of lossy compression.
And if we make it too lossless, then we are basically overfitting.