I’m extracting frames of video to a Surface array to be rearranged into a new video, trading the x dimension with time. Here are some examples of different kinds of effects that come out: http://www.youtube.com/view_play_list?p=B2540182DE868E85
The app always crashes with std::bad_alloc when I try to store 1280 frames of 1280×720 video (1,179,648,000 pixels) into the Surface[]. It doesn’t crash with 1280 frames of 1080×720 video (995,328,000 pixels).
I made a simple test that makes it work on my computer (4GB RAM), but not on a friend’s wimpier laptop:
    maxWidth = 1920;
    // 64-bit product so larger frame sizes can't overflow a 32-bit int
    while (((long long)inW * inH * maxWidth) >= 1000000000)
        maxWidth -= 20;
Two questions:
- Is there a better way to have fast access to 10^9 pixels than a Surface array?
- What is this memory limit, and how can I test for it and avoid it when setting up the maxWidth for the output?
Big thanks from the C++ noob. I put the source on GitHub: Redimensionator. It uses Cinder.
First off, on a 32-bit platform, your hard limit for address space usage is going to be somewhere around 2GB (but possibly much less), assuming you keep it all mapped at once. It’s best to assume you won’t be able to get more than maybe 512MB in contiguous memory, and 1-1.5GB or so in noncontiguous memory (i.e., by making multiple small mappings). This is most likely the problem you have: you ran out of contiguous address space. The hardware in turn is limited (for Intel CPUs) to somewhere around 16GB of memory for a 32-bit system. And you really, really don’t want to be swapping. So this means you have one of several options:
- shm_open and mmap. (Complex; almost as fast as the 64-bit one if done right. Minimize the number of remappings you do. Still needs a lot of memory.)
The first two options are good if you have enough memory to hold the entire output video. Ideally you’d want to go the 64-bit route; remapping shared memory windows is an expensive operation, and you’ll be doing it a lot.
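To make the windowing idea concrete, here is a minimal POSIX sketch, not a definitive implementation. It keeps a single window of a large backing object mapped and remaps only when an access falls outside it. The `WindowedBuffer` class and its members are hypothetical names invented for illustration, and the sketch uses a temporary file from `std::tmpfile` as a stand-in for `shm_open` so that it has no extra link dependency; the `mmap` and `munmap` windowing logic would be the same with a shared memory object.

```cpp
#define _FILE_OFFSET_BITS 64  // ask for 64-bit file offsets even in a 32-bit build
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>

// Hypothetical helper (not part of any library): keeps one window of a large
// backing file mapped at a time and remaps only when an access leaves it.
class WindowedBuffer {
public:
    WindowedBuffer(std::uint64_t totalBytes, std::size_t windowBytes)
        : window_(windowBytes) {
        // mmap offsets must be page-aligned, so the window size must be too.
        assert(windowBytes > 0 &&
               windowBytes % static_cast<std::size_t>(sysconf(_SC_PAGESIZE)) == 0);
        file_ = std::tmpfile();  // assumption: stand-in for shm_open
        if (!file_) throw std::runtime_error("tmpfile failed");
        // Round the file up to whole windows so every mapping is fully backed.
        std::uint64_t rounded = (totalBytes + window_ - 1) / window_ * window_;
        if (ftruncate(fileno(file_), static_cast<off_t>(rounded)) != 0) {
            std::fclose(file_);
            throw std::runtime_error("ftruncate failed");
        }
    }
    ~WindowedBuffer() { unmap(); std::fclose(file_); }
    WindowedBuffer(const WindowedBuffer&) = delete;
    WindowedBuffer& operator=(const WindowedBuffer&) = delete;

    // Pointer to the byte at `offset`; valid until the next call that remaps.
    unsigned char* at(std::uint64_t offset) {
        std::uint64_t base = offset - offset % window_;
        if (!mapped_ || base != base_) {
            unmap();
            void* p = mmap(nullptr, window_, PROT_READ | PROT_WRITE, MAP_SHARED,
                           fileno(file_), static_cast<off_t>(base));
            if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
            mapped_ = static_cast<unsigned char*>(p);
            base_ = base;
            ++remaps_;  // each remap is expensive, so callers should watch this
        }
        return mapped_ + (offset - base_);
    }
    unsigned remaps() const { return remaps_; }

private:
    void unmap() {
        if (mapped_) { munmap(mapped_, window_); mapped_ = nullptr; }
    }
    std::FILE* file_ = nullptr;
    std::size_t window_;
    unsigned char* mapped_ = nullptr;
    std::uint64_t base_ = 0;
    unsigned remaps_ = 0;
};
```

The `remaps()` counter is there only to show the cost the answer warns about: an access pattern that jumps between distant offsets remaps on every call, while one that stays inside a window remaps once.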
With the fourth option, it can be difficult to know what the memory limit is. I would recommend doing a binary search using test allocations to figure out how much of your address space you can use (note that you should be using low-level allocation calls for this, to avoid heap overhead). If you’re not careful, this might not leave any address space for your video decoder, so it’d probably be best to subtract 100MB or so from the result and reallocate at that smaller size, leaving some room for the normal heap. You should also be careful to stay well below total physical memory, to avoid hitting swap.
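The binary search over test allocations could look roughly like the sketch below. All names here (`canMalloc`, `largestAllocation`, `usableBudget`) are hypothetical, and `std::malloc` is used only as a portable stand-in for the low-level allocation call of your OS. Keep in mind that on systems that overcommit memory, a successful allocation tells you about address space, not about physical RAM, so the physical memory check is still a separate step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>

// Hypothetical probe: true if a single block of `bytes` can be allocated.
// Assumption: std::malloc stands in for a lower-level OS allocation call.
inline bool canMalloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    std::free(p);  // free(nullptr) is a no-op, so this is safe on failure
    return p != nullptr;
}

// Binary search for the largest size in [0, upper] that `probe` accepts,
// accurate to within `step` bytes. Assumes the probe is monotonic:
// if n bytes can be allocated, any smaller size can be too.
inline std::size_t largestAllocation(
        std::size_t upper, std::size_t step,
        const std::function<bool(std::size_t)>& probe) {
    assert(step > 0);
    if (probe(upper)) return upper;
    std::size_t lo = 0;      // treated as always satisfiable
    std::size_t hi = upper;  // known to fail
    while (hi - lo > step) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (probe(mid)) lo = mid; else hi = mid;
    }
    return lo;
}

// Subtract headroom for the decoder and the normal heap (about 100MB above).
inline std::size_t usableBudget(std::size_t probed, std::size_t reserve) {
    return probed > reserve ? probed - reserve : 0;
}
```

A call such as `usableBudget(largestAllocation(upper, 1 << 20, canMalloc), 100u << 20)` would then give a byte budget from which to derive the output width. Passing the probe as a parameter keeps the search independent of which allocation call you end up using.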
Without knowing your OS and what library you’re getting that Surface class from, it’s hard to be more specific about how to probe it – but you really should avoid keeping it in the normal heap, just to avoid allocation errors in other code that’s possibly not instrumented to deal with OOMs.
As a side note, you may want to rotate the output frames 90 degrees while preparing them (that is, put them into column major order). You can then rotate them back as a final pass after constructing all the raw images (or even when encoding from raw image data to a compressed format). This is especially important if you decide to go with the disk route with an SSD – it will help avoid unnecessary reads and writes, as with row major order (the usual order for videos) you will have to skip over the pixels for other columns whenever you write one. With in-memory work, though, it’s still helpful, as it improves cache locality.
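As an illustration of the column-major idea, here is a small sketch under stated assumptions: pixels are packed into 4 bytes, input frames are plain row-major arrays, and the names `Pixel`, `ColumnMajorFrame`, `copyColumn` and `toRowMajor` are invented for this example rather than taken from Cinder. Strictly speaking it transposes the frame rather than rotating it, which is enough to make each written column one contiguous run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::uint32_t;  // assumption: one packed 4-byte pixel

// Column-major frame: pixel (x, y) lives at index x * height + y,
// so every column is a contiguous run of `height` pixels.
struct ColumnMajorFrame {
    std::size_t width, height;
    std::vector<Pixel> data;
    ColumnMajorFrame(std::size_t w, std::size_t h)
        : width(w), height(h), data(w * h) {}
};

// Copies column `srcX` of a row-major input frame into column `dstX` of a
// column-major output frame. The reads are strided; the writes are sequential.
void copyColumn(const std::vector<Pixel>& src, std::size_t srcWidth,
                std::size_t srcX, ColumnMajorFrame& dst, std::size_t dstX) {
    assert(srcX < srcWidth && dstX < dst.width);
    assert(src.size() == srcWidth * dst.height);
    Pixel* out = dst.data.data() + dstX * dst.height;
    for (std::size_t y = 0; y < dst.height; ++y)
        out[y] = src[y * srcWidth + srcX];
}

// Final pass: convert a finished frame back to the usual row-major order.
std::vector<Pixel> toRowMajor(const ColumnMajorFrame& f) {
    std::vector<Pixel> rows(f.width * f.height);
    for (std::size_t x = 0; x < f.width; ++x)
        for (std::size_t y = 0; y < f.height; ++y)
            rows[y * f.width + x] = f.data[x * f.height + y];
    return rows;
}
```

For the x-for-time swap, column `x` of input frame `t` would go to column `t` of output frame `x`, i.e. `copyColumn(in[t], inW, x, out[x], t)`, with `toRowMajor` applied once per output frame before encoding.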