I discovered that CGBitmapContextCreateImage() creates an image that is not necessarily a mask compatible with CGContextClipToMask(). But a CGImageRef created with CGImageMaskCreate() is always a mask that works with CGContextClipToMask(). So what is so special about a mask vs. a “normal” image?
My guess is that the mask is grayscale only, whereas a CGImageRef created with CGBitmapContextCreateImage() may have RGBA values that confuse CGContextClipToMask(). I couldn’t find the place in the documentation where the exact difference between masks and CG images is explained.
But it seems that a Core Graphics image != a mask, while a mask == a Core Graphics image.
Every value in an image, be it RGB, CMYK or greyscale, represents a position in a particular colorspace. It is meaningful to ask “What would this value be in colorspace ‘x’?”, and the result would, if possible, be the same colour, but could be a different numerical value.
For example (simplistically): a pixel with value (255,255,255) is white in an RGB colorspace but black in a (hypothetical) CMY colorspace. Converting the white RGB pixel to the CMY colorspace would give the value (0,0,0). In other words, an image must have a colorspace; its values only make sense given that colorspace.
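To make the white-pixel example concrete, here is a minimal sketch in plain C. It assumes the simplest possible conversion (each CMY component is the complement of the matching RGB component); the Rgb8 and Cmy8 types and the rgb_to_cmy() function are hypothetical names for illustration and are not part of Core Graphics, which converts through colour profiles instead.

```c
#include <stdint.h>

/* Hypothetical pixel types for illustration; not Core Graphics types. */
typedef struct { uint8_t r, g, b; } Rgb8;
typedef struct { uint8_t c, m, y; } Cmy8;

/* Naive RGB -> CMY conversion: each subtractive component is the
   complement of the matching additive component. The colour is meant
   to stay the same while the stored numbers change. */
Cmy8 rgb_to_cmy(Rgb8 in)
{
    Cmy8 out;
    out.c = (uint8_t)(255 - in.r);
    out.m = (uint8_t)(255 - in.g);
    out.y = (uint8_t)(255 - in.b);
    return out;
}
```

With this sketch, white stored as (255,255,255) in RGB comes out as (0,0,0) in CMY: the same colour, different numbers.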
By contrast, an 8-bit mask holds absolute values from 0 to 255. There is no colorspace, and it makes no sense to think of a mask in a particular colorspace.
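One way to picture mask values as absolute amounts is the plain C sketch below, where each mask byte directly scales a pixel's alpha. The apply_mask() function is a hypothetical helper, and the polarity used here (0 hides, 255 keeps) is an assumption of the sketch; Core Graphics defines its own polarity for image masks, which this does not attempt to mirror.

```c
#include <stddef.h>
#include <stdint.h>

/* Scale each alpha value by the mask value at the same index.
   The mask bytes are used as raw coverage amounts: no colorspace is
   consulted and no colour conversion takes place.
   Assumed polarity: 0 = fully hidden, 255 = fully kept. */
void apply_mask(uint8_t *alpha, const uint8_t *mask, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* alpha * mask / 255, rounded to the nearest integer. */
        alpha[i] = (uint8_t)((alpha[i] * mask[i] + 127) / 255);
    }
}
```

Nothing in this loop asks what colour a mask value is; the byte is only ever used as a quantity.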
In that way images and masks are fundamentally different, even though we often think of masks as greyscale images.