From Blender to OpenCV Camera and back

In case you want to employ Blender for Computer Vision like e.g. for generating synthetic data, you will need to map the parameters of a calibrated camera to Blender as well as mapping the blender camera parameters to the ones of a calibrated camera.

Calibrated cameras typically base around the pinhole camera model which at its core is the camera matrix and the image size in pixels:

$K = \begin{bmatrix}f_x & 0 & c_x \\ 0 & f_y& c_y \\ 0 & 0 & 1 \end{bmatrix}, (w, h)$

But if we look at the Blender Camera, we find lots non-standard and duplicate parameters with random or without any units, like

unitless shift_x
duplicate angle, angle_x, angle_y, lens

Doing some research on their meaning and fixing various bugs in the proposed conversion formula, I could however come up with the following python code to do the conversion from blender to OpenCV

# get the relevant data
cam = bpy.data.objects["cameraName"].data
scene = bpy.context.scene
# assume image is not scaled
assert scene.render.resolution_percentage == 100
# assume angles describe the horizontal field of view
assert cam.sensor_fit != 'VERTICAL'

f_in_mm = cam.lens
sensor_width_in_mm = cam.sensor_width

w = scene.render.resolution_x
h = scene.render.resolution_y

pixel_aspect = scene.render.pixel_aspect_y / scene.render.pixel_aspect_x

f_x = f_in_mm / sensor_width_in_mm * w
f_y = f_x * pixel_aspect

# yes, shift_x is inverted. WTF blender?
c_x = w * (0.5 - cam.shift_x)
# and shift_y is still a percentage of width..
c_y = h * 0.5 + w * cam.shift_y

K = [[f_x, 0, c_x],
     [0, f_y, c_y],
     [0,   0,   1]]

So to summarize the above code

Note that f_x/ f_y encodes the pixel aspect ratio and not the image aspect ratio w/ h.
Blender enforces identical sensor and image aspect ratio. Therefore we do not have to consider it explicitly. Non square pixels are instead handled via pixel_aspect_x/ pixel_aspect_y.
We left out the skew factor s (non rectangular pixels) because neither OpenCV nor Blender support it.
Blender allows us to scale the output, resulting in a different resolution, but this can be easily handled post-projection. So we explicitly do not handle that.
Blender has the peculiarity of converting the focal length to either horizontal or vertical field of view (sensor_fit). Going the vertical branch is left as an exercise to the reader.

The reverse transform can now be derived trivially as

cam.shift_x = -(c_x / w - 0.5)
cam.shift_y = (c_y - 0.5 * h) / w

cam.lens = f_x / w * sensor_width_in_mm

pixel_aspect = f_y / f_x
scene.render.pixel_aspect_x = 1.0
scene.render.pixel_aspect_y = pixel_aspect