3D head reconstruction solution

The client produces souvenirs: translucent cubes that contain a custom 3D sculpture of a human head. The customer's overall goal was to reduce manual labor and the time spent by 3D designers.

Project background

One of the most interesting tasks of AI is reconstructing faces or entire heads as 3D models. This task is especially important for creative design agencies that provide branding, advertising, graphic design, and virtual modeling services. 

Our client produces souvenirs: translucent cubes that contain a custom 3D sculpture of a human head. The agency has 100+ in-house 3D designers, who processed every image manually. Clients sent in pictures that were often of poor quality, taken from social networks or family albums.

Designers then took stock 3D props of heads, hands, torsos, etc., and sculpted a 3D model that matched the submitted image as closely as possible. The image was overlaid on the finished model as a texture. Each image took 15 minutes to process.

The main goal of our customer was to reduce human labor and the time spent by 3D designers.

  • Duration: June 2020 – August 2020
  • Location: Germany
  • Industry: Design
  • Services: Product discovery, POC development

Business needs

– Help designers save time on creating 3D models of a person's head, i.e., programmatically generate a 3D model from a single photo.

– The 3D head model had to include the face, skull shape, ear shape, teeth, tongue, and an emotional expression, but no hairline, beard, eyebrows, or mustache.

– The client was looking for a fast solution, delivered within a few months.

– We were also asked to research whether per-pixel accuracy was achievable.

Product features

  1. 2D photo processing. Take an arbitrary 2D photo as input
  2. Face detection. Detect the face with the RetinaFace deep face detection library
  3. Estimation of rotation. Estimate the head rotation with the FSA-Net neural network
  4. 2D landmarks search. Find 2D landmarks with the FAN2D network and estimate the scale
  5. Estimation of PCA coefficients. Estimate the PCA coefficients for the Nvidia ICT-FaceKit head model
  6. Full 3D object creation. Create a full 3D object that looks similar to the face in the photo and is properly scaled and positioned
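
Steps 3 and 4 of the list above yield a head rotation and a scale. The following is a minimal sketch of how those two quantities could be represented, not the project's actual code. It assumes the rotation arrives as yaw, pitch, and roll angles in radians, that the axis order shown is acceptable, and that the camera is orthographic. The function names `euler_to_rotation` and `estimate_scale` are hypothetical, and FSA-Net and FAN2D themselves are not called.

```python
import numpy as np

def euler_to_rotation(yaw, pitch, roll):
    """Build a 3x3 rotation matrix from yaw (y axis), pitch (x axis), roll (z axis).

    Angles are in radians. The axis assignment and multiplication order
    are assumptions made for this sketch.
    """
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return rot_z @ rot_x @ rot_y

def estimate_scale(landmarks_2d, projected_2d):
    """Ratio of landmark spread around the centroid: image vs. projected model."""
    spread_image = np.linalg.norm(landmarks_2d - landmarks_2d.mean(axis=0))
    spread_model = np.linalg.norm(projected_2d - projected_2d.mean(axis=0))
    return spread_image / spread_model

# Toy data: 68 random reference landmarks stand in for a real head model.
rng = np.random.default_rng(0)
reference = rng.normal(size=(68, 3))
rotation = euler_to_rotation(yaw=0.3, pitch=-0.2, roll=0.1)
projected = (reference @ rotation.T)[:, :2]              # rotate, then drop depth
detected = 2.5 * projected + np.array([100.0, 50.0])     # scaled and shifted copy
scale = estimate_scale(detected, projected)
```

Because the spread is measured around the centroid, the estimate does not depend on where the face sits in the image.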

Solution

We did not use the designers' existing models as data because designers modeled only the visible side of the head, so their 3D models were not symmetrical and had a varying number of vertices.

We had to use something other than state-of-the-art solutions such as 2D-to-3D neural networks because:

– we were missing a proper dataset,

– such networks are very large and take an enormously long time to train,

– even the state-of-the-art solutions gave quite low-quality results on the provided examples.

3D face reconstruction  

Step 1 

– Retrain the FAN2D network;

– Fix the heatmap generation; 

– Annotate cheek landmarks on the 3D head;

– Generate several samples of synthetic 3D heads with different orientations, genders, and skin colors (textures);

– Fine-tune the existing FAN2D model.
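
Landmark networks of this kind are typically trained against one heatmap per landmark. The text does not say what the heatmap bug was, so the sketch below only illustrates the general technique: rendering a landmark as a 2D Gaussian and decoding it back. The function names, the grid size, and the `sigma` value are assumptions made for illustration.

```python
import numpy as np

def gaussian_heatmap(height, width, center_xy, sigma=1.5):
    """Render one landmark as a 2D Gaussian peak on a (height, width) grid."""
    xs = np.arange(width, dtype=np.float64)[None, :]    # column coordinates
    ys = np.arange(height, dtype=np.float64)[:, None]   # row coordinates
    cx, cy = center_xy
    sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def decode_heatmap(heatmap):
    """Recover the (x, y) landmark position as the location of the maximum."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return float(col), float(row)

heatmap = gaussian_heatmap(64, 64, center_xy=(20, 41))
position = decode_heatmap(heatmap)
```

A common source of errors in such code is swapping the x/y and row/column order, which is why the sketch keeps the two conventions explicit.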

Step 2

Principal Component Analysis (PCA) is a widely used tool for dimensionality reduction in exploratory data analysis and Machine Learning (ML). In our case, one PCA component is a set of vectors, one per vertex. The scalar factor of a component is the same for all vectors belonging to that component.

Hence, once we find the right factor for one vertex of a component, we have found it for all other vertices of the same component.

This let us fit the model on the landmark vertices only, which significantly reduced the computational cost.
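
The point about shared factors can be checked with a small sketch. It uses random toy arrays in place of a real head model, so the sizes, `landmark_ids`, and the `reconstruct` helper are all hypothetical. It shows that evaluating the PCA model on the landmark rows alone gives the same positions as evaluating the full mesh and then picking out the landmarks.

```python
import numpy as np

rng = np.random.default_rng(0)
num_vertices, num_components = 1000, 5      # toy sizes, not the real model

# Hypothetical PCA head model: a mean shape plus per-vertex vectors per component.
mean_shape = rng.normal(size=(num_vertices, 3))
components = rng.normal(size=(num_components, num_vertices, 3))
landmark_ids = np.array([3, 57, 210, 640, 999])   # vertices annotated as landmarks

def reconstruct(coeffs, mean, comps):
    """Mean shape plus the weighted sum of components (one scalar per component)."""
    return mean + np.tensordot(coeffs, comps, axes=1)

coeffs = rng.normal(size=num_components)

# Full mesh: touches every vertex.
full_head = reconstruct(coeffs, mean_shape, components)

# Landmarks only: same coefficients, a small slice of the data.
landmarks_only = reconstruct(
    coeffs, mean_shape[landmark_ids], components[:, landmark_ids]
)
```

Since the coefficients are shared across vertices, they can be fitted on the small landmark slice and then applied once to the full mesh.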

Step 3

The final 3D reconstruction consists of the following steps:

– Apply the estimated rotation to the reference 3D model;

– Estimate the 3D model scale from the 2D landmarks;

– Construct a loss function (the distances between the detected 2D landmarks and the projected 3D landmarks);

– Use SGD to find the best scalar for each PCA component and make the final fit of the 3D landmarks to the 2D landmarks.
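
The fitting loop in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it uses an orthographic projection, random toy arrays instead of the real head model, a yaw-only rotation, and plain full-batch gradient descent with a fixed step size in place of SGD. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
num_landmarks, num_components = 20, 4       # toy sizes, not the real model

# Hypothetical stand-ins for the landmark rows of a PCA head model.
mean_landmarks = rng.normal(size=(num_landmarks, 3))
components = rng.normal(size=(num_components, num_landmarks, 3))

def yaw_rotation(angle):
    """Rotation about the vertical (y) axis; angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(points_3d, rotation, scale):
    """Rotate, drop depth (orthographic camera), and scale into image units."""
    return scale * (points_3d @ rotation.T)[:, :2]

def fit_coefficients(target_2d, rotation, scale, steps=500):
    """Minimise the mean squared 2D landmark distance by gradient descent."""
    offset = project(mean_landmarks, rotation, scale)                     # (L, 2)
    basis = np.stack([project(c, rotation, scale) for c in components])   # (K, L, 2)
    design = basis.reshape(num_components, -1).T                          # (2L, K)
    flat_target = (target_2d - offset).ravel()                            # (2L,)
    # Step size derived from the largest singular value keeps descent stable.
    lipschitz = 2.0 / num_landmarks * np.linalg.norm(design, 2) ** 2
    coeffs = np.zeros(num_components)
    for _ in range(steps):
        residual = design @ coeffs - flat_target
        gradient = 2.0 / num_landmarks * design.T @ residual
        coeffs -= gradient / lipschitz
    return coeffs

# Synthetic check: build 2D landmarks from known coefficients, then recover them.
true_coeffs = np.array([0.8, -0.5, 0.3, 1.2])
rotation, scale = yaw_rotation(0.4), 120.0
true_shape = mean_landmarks + np.tensordot(true_coeffs, components, axes=1)
target = project(true_shape, rotation, scale)
fitted = fit_coefficients(target, rotation, scale)
```

With a linear PCA model and an orthographic camera, the loss is quadratic in the coefficients, so a closed-form least-squares solve would also work. The iterative form is shown because it matches the gradient-based fit described above.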

Our technology stack

  • TensorFlow
  • PyTorch
  • Nvidia
  • MXNet
  • OpenCV
  • Python
  • NumPy

Client values

  1. We conducted in-depth research to find the solution best suited to the client's problem.
  2. Our developers delivered an out-of-the-box face detection system that reduces image processing time and the designers' working time.
  3. We completed the research within the estimated 2 months.
  4. We reduced the image processing time from 15 minutes to 10 minutes with high accuracy.

Employee testimonial

Roman Bobniev, Technical Lead of CV Division

It was challenging and interesting to research and solve this 3D head reconstruction problem, for example finding the bugs in facial landmark detection (heatmap generation) and adding more landmarks for the cheek area. We are glad we did our best to reduce image processing time to 10 minutes and found the best possible option for this case: the open-source Nvidia ICT-FaceKit with blend shapes. It is a set of 153 PCA components, of which the first 100 capture person-specific variance such as race, gender, and age, and the other 53 capture emotion-specific variance. It also comes with texture mapping and per-vertex landmark mapping.