The geometry of an individual human pinna can be used to achieve plausible personalized audio reproduction. However, accurate acquisition of the pinna geometry typically requires specialized equipment and often involves time-consuming post-processing to remove potential artifacts. To obtain an artifact-free yet individualized mesh, a parametric pinna model based on cubic Bézier curves (BezierPPM) can be used to represent an individual pinna. However, its parameters need to be manually tuned to match the acquired listener's geometry. For increased scalability, we propose Mesh2PPM, a framework for the automatic estimation of BezierPPM parameters from an individual pinna. Mesh2PPM relies on a deep neural network (DNN) trained on a dataset of synthetic multi-view images rendered from BezierPPM instances. For the evaluation, unseen BezierPPM instances were presented to Mesh2PPM, which inferred the BezierPPM parameters. We then assessed the geometric errors between the meshes generated by the BezierPPM from the inferred parameters and the actual pinna meshes. We investigated the effects of the camera-grid type, jittered camera positions, and additional depth information in the images on the estimation quality. While depth information had no effect, both the camera-grid type and jittered camera positions affected the estimation quality. A 3×3 camera grid provided the best estimation quality, yielding Pompeiu-Hausdorff distances of 2.05 ± 0.4 mm and 1.4 ± 0.3 mm with and without jittered camera positions, respectively, and root-mean-square (RMS) distances of 0.92 ± 0.12 mm and 0.52 ± 0.07 mm. These results motivate further improvements of the proposed framework, with the ultimate goal of automatically estimating pinna geometries acquired from actual listeners.
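
To make the reported error metrics concrete, the following is a minimal sketch (not the authors' implementation) of how a symmetric Pompeiu-Hausdorff distance and an RMS nearest-neighbour distance could be computed between two meshes represented as surface point samples; the function name, the KD-tree approach, and the random example data are assumptions for illustration only.

    # Illustrative sketch: geometric error between two point-sampled surfaces.
    # Assumes both meshes have already been sampled into (N, 3) point arrays.
    import numpy as np
    from scipy.spatial import cKDTree

    def hausdorff_and_rms(points_a: np.ndarray, points_b: np.ndarray):
        """Return (Pompeiu-Hausdorff, RMS) distances between two point sets."""
        tree_a = cKDTree(points_a)
        tree_b = cKDTree(points_b)
        # Nearest-neighbour distances in both directions.
        d_ab, _ = tree_b.query(points_a)  # from each point in A to B
        d_ba, _ = tree_a.query(points_b)  # from each point in B to A
        hausdorff = max(d_ab.max(), d_ba.max())          # worst-case deviation
        rms = np.sqrt(np.mean(np.concatenate([d_ab, d_ba]) ** 2))  # average deviation
        return hausdorff, rms

    # Example with synthetic point clouds standing in for sampled pinna surfaces.
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(2000, 3))                              # ground-truth samples
    estimate = reference + rng.normal(scale=0.01, size=(2000, 3))       # reconstructed samples
    print(hausdorff_and_rms(reference, estimate))

In this reading, the Pompeiu-Hausdorff distance captures the single largest local deviation between the two surfaces, while the RMS distance summarizes the typical deviation, which is why the abstract reports both.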