I want to pack and unpack two signed 16-bit integers into a single 32-bit integer, but I can't quite get it to work.
Any ideas as to what I might be doing wrong?
template <typename T>
int read_s16(T& arr, int idx) restrict(amp)
{
    return static_cast<int>((arr[idx/2] >> ((idx % 2) * 16)) << 16) >> 16;
}
template <typename T>
void write_s16(T& arr, int idx, int val) restrict(amp)
{
    // NOTE: arr is zero initialized
    concurrency::atomic_fetch_or(&arr[idx/2], (static_cast<unsigned int>(val) & 0xFFFF) << ((idx % 2) * 16));
}
The function return types and arguments must stay as I have defined them. The low and high halves are written from different threads (hence the atomic OR), and the read must return a single 32-bit value.
16-bit integer arithmetic is not supported on the target platform.
Example:
array<int> ar(1); // Container
write_s16(ar, 0, -16);
write_s16(ar, 1, 5);
assert(read_s16(ar, 0) == -16);
assert(read_s16(ar, 1) == 5);
These atomic operations in C++ AMP also have the following limitations:
Normal reads may not see the results of atomic writes to the same
memory location. Normal writes should not be mixed with atomic writes
to the same memory location. If your program does not comply with
these criteria then this will lead to an undefined result.
Atomic operations do not imply a memory fence of any sort. Atomic
operations may be reordered. This differs from the behavior of
interlocked operations in C++.
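It would seem like you are violating the first of these: read_s16 reads arr[idx/2] with a normal read, while write_s16 updates the same memory location with atomic_fetch_or.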
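One way to stay within those rules is to make the read atomic as well. The sketch below is only a host-side analog in standard C++, not C++ AMP: std::atomic stands in for the array element, and the names write_s16_host, read_s16_host and roundtrip_ok are hypothetical. In C++ AMP the corresponding idea would presumably be to fetch the word with concurrency::atomic_fetch_or(&arr[idx/2], 0) instead of reading arr[idx/2] directly. I have not tested that on a device, and since the atomics imply no fence, the ordering between the writing and reading threads still has to be arranged separately.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical host-side stand-in for write_s16. std::atomic replaces the
// C++ AMP array element and concurrency::atomic_fetch_or.
// Assumes the target word starts out as zero, as in the original code.
void write_s16_host(std::atomic<std::uint32_t>* arr, int idx, int val)
{
    // Keep the low 16 bits of val and move them into the low or high half.
    const std::uint32_t bits =
        (static_cast<std::uint32_t>(val) & 0xFFFFu) << ((idx % 2) * 16);
    arr[idx / 2].fetch_or(bits);
}

// Hypothetical host-side stand-in for read_s16.
int read_s16_host(std::atomic<std::uint32_t>* arr, int idx)
{
    // Atomic read: OR-ing with 0 leaves the word unchanged and returns it,
    // so no normal read touches a location that is written atomically.
    const std::uint32_t word = arr[idx / 2].fetch_or(0u);
    const std::uint32_t half = (word >> ((idx % 2) * 16)) & 0xFFFFu;
    // Sign-extend using 32-bit arithmetic only (no 16-bit types needed):
    // flip the sign bit, then subtract its weight.
    return static_cast<int>(half ^ 0x8000u) - 0x8000;
}

// Mirrors the example from the question; the two writes could just as well
// come from different threads.
bool roundtrip_ok()
{
    std::atomic<std::uint32_t> word{0};
    write_s16_host(&word, 0, -16);
    write_s16_host(&word, 1, 5);
    return read_s16_host(&word, 0) == -16 && read_s16_host(&word, 1) == 5;
}
```

The shift-and-mask arithmetic is the same as in the question; the only change in approach is that the read goes through an atomic operation too.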