In the field of 3D Human Performance Capture, a high quality 3D scan of the performer is rigged and skinned to an animatable 3D template mesh that is subsequently fitted to the captured performance's RGB-D data. Template fitting is accomplished via solving for the template's pose parameters that better explain the performance data at each recorded frame. In this paper, we challenge open implementations of zeroth-order optimizers to solve the template fitting problem in a human performance capture dataset. The objective function that we employ approximates, the otherwise costly to evaluate, 3D RMS hausdorff distance between the animated template and the 3D mesh reconstructed from the depth data (target mesh) at an individual recorded frame. We distinguish and benchmark the optimizers, in three different real-world scenarios, two of which are based on the geometric proximity of the template to the target in individual frames, while in the third one we fit the template sequentially to all target frames of the recorded sequence. Conclusions of this work can serve as a reference for future optimizer implementations and our findings can server as a baseline for future multi-objective optimization approaches. To facilitate this, we make part of our benchmark and experiment setup publicly available (https://github.com/VCL3D/nevergrad, https://github.com/VCL3D/PerformanceCapture/releases/).