
The Archive Base


Editorial Team
Asked: May 26, 2026


In some code I have converted to SSE, I perform some ray tracing, tracing 4 rays at a time using __m128 data types.

In the method where I determine which objects are hit first, I loop through all objects, test for intersection, and create a mask representing which rays had an intersection earlier than any previously found.

I also need to maintain data on the id of the objects which correspond to the best hit times. I do this by maintaining a __m128 data type called objectNo and I use the mask determined from the intersection times to update objectNo as follows:

objectNo = _mm_blendv_ps(objectNo,_mm_set1_ps((float)pobj->getID()),mask);

Where pobj->getID() will return an integer representing the id of the current object. Making this cast and using the blend seemed to be the most efficient way of updating the objectNo for all 4 rays.

After all intersections are tested I try to extract the objectNo’s individually and use them to access an array to register the intersection. Most commonly I have tried this:

int o0 = _mm_extract_ps(objectNo, 0);
prv_noHits[o0]++;

However, this crashes with EXC_BAD_ACCESS: _mm_extract_ps returns the raw bit pattern of the float rather than its numeric value, so a lane holding 1.0 comes back as the int 1065353216 (0x3F800000).
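The difference between reinterpreting a float's bits and converting its value can be reproduced in scalar code. This is a minimal sketch; the helper names float_bits, float_value and bits_vs_value_example are made up for illustration and are not part of the code in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the 32 bits of a float into an integer unchanged.
// This is the kind of value _mm_extract_ps hands back for a lane.
std::int32_t float_bits(float f) {
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Hypothetical helper: numeric conversion, which is what an array index needs.
std::int32_t float_value(float f) {
    return static_cast<std::int32_t>(f);
}

void bits_vs_value_example() {
    assert(float_bits(1.0f) == 1065353216);  // 0x3F800000, the IEEE 754 encoding of 1.0f
    assert(float_value(1.0f) == 1);
}
```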

How do I correctly unpack the __m128 into ints which can be used to index an array?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 5:40 pm

    There are two SSE2 conversion intrinsics which seem to do what you want:

    • _mm_cvtps_epi32()
    • _mm_cvttps_epi32()

    http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/intref_cls/common/intref_sse2_int_conversion.htm

    These convert four single-precision floats to four 32-bit integers. The first rounds according to the current MXCSR rounding mode (round-to-nearest-even by default). The second always truncates toward zero.
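To make the rounding versus truncation difference concrete, here is a small sketch that runs both conversions on the same vector. It assumes the default MXCSR rounding mode has not been changed; store_lanes and compare_conversions are hypothetical helper names.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2

// Hypothetical helper: write the four 32-bit lanes of v to out[0..3].
void store_lanes(__m128i v, int out[4]) {
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
}

// Convert the same input with both intrinsics.
void compare_conversions(int rounded[4], int truncated[4]) {
    // _mm_set_ps lists lanes from highest to lowest, so lane 0 holds 1.5f.
    __m128 v = _mm_set_ps(2.7f, -1.5f, 2.5f, 1.5f);
    store_lanes(_mm_cvtps_epi32(v), rounded);     // current rounding mode
    store_lanes(_mm_cvttps_epi32(v), truncated);  // always toward zero
}

void conversion_example() {
    int r[4], t[4];
    compare_conversions(r, t);
    // Round-to-nearest-even: 1.5 -> 2, 2.5 -> 2, -1.5 -> -2, 2.7 -> 3
    assert(r[0] == 2 && r[1] == 2 && r[2] == -2 && r[3] == 3);
    // Truncation: 1.5 -> 1, 2.5 -> 2, -1.5 -> -1, 2.7 -> 2
    assert(t[0] == 1 && t[1] == 2 && t[2] == -1 && t[3] == 2);
}
```

For object IDs stored as whole-number floats, both conversions give the same result, so the choice only matters for non-integral values.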

    So they can be used like this:

    int o0 = _mm_extract_epi32(_mm_cvtps_epi32(objectNo), 0);
    prv_noHits[o0]++;
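Since all four rays need to register a hit, it may be simpler to convert once and store the lanes to a small array instead of extracting them one by one. The sketch below uses only SSE2 (_mm_extract_epi32 requires SSE4.1). The function name register_hits, its parameters, and the bounds check are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2

// Hypothetical helper: objectNo holds object IDs as whole-number floats
// (as in the question); noHits is a counter array with `count` entries.
void register_hits(__m128 objectNo, int* noHits, std::size_t count) {
    alignas(16) int ids[4];
    // Convert the four floats to ints, then spill them to memory in one store.
    _mm_store_si128(reinterpret_cast<__m128i*>(ids), _mm_cvttps_epi32(objectNo));
    for (int lane = 0; lane < 4; ++lane) {
        // Debug-only guard against indexing outside the counter array.
        assert(ids[lane] >= 0 && static_cast<std::size_t>(ids[lane]) < count);
        ++noHits[ids[lane]];
    }
}
```

Note that a float represents integers exactly only up to 2^24, so carrying IDs as floats assumes the IDs stay below that.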
    

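    EDIT: Based on what you’re trying to do, I feel this can be better optimized by keeping objectNo as a __m128i from the start:

    __m128i ids = _mm_set1_epi32(pobj->getID());

    //  objectNo is a __m128i here; the mask is reinterpreted as an integer vector
    objectNo = _mm_blendv_epi8(objectNo,ids,_mm_castps_si128(mask));

    int o0 = _mm_extract_epi32(objectNo, 0);
    prv_noHits[o0]++;

    This version gets rid of the unnecessary conversions. Note that _mm_blend_epi16() is not suitable here because its mask is a compile-time immediate rather than a vector. _mm_blendv_epi8() takes a runtime vector mask, and a comparison mask (all ones or all zeros per lane) works with it once cast to __m128i.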
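If SSE4.1 blends are not available, a mask-driven select on integer lanes can also be written with plain SSE2 bitwise operations. This is a sketch under the assumption that the mask comes from a float comparison, so each lane is all ones or all zeros; select_epi32 and update_object_ids are hypothetical names.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2

// Hypothetical helper: per bit, take b where mask is 1 and a where mask is 0.
__m128i select_epi32(__m128i a, __m128i b, __m128i mask) {
    // _mm_andnot_si128(x, y) computes (~x) & y.
    return _mm_or_si128(_mm_and_si128(mask, b), _mm_andnot_si128(mask, a));
}

// Hypothetical update step: overwrite the ID in every lane whose ray was hit earlier.
__m128i update_object_ids(__m128i objectNo, int id, __m128 hitMask) {
    return select_epi32(objectNo, _mm_set1_epi32(id), _mm_castps_si128(hitMask));
}

void select_example() {
    // Lanes 0 and 2 have a new hit time (1.0) below the best so far (2.0).
    __m128 tNew = _mm_set_ps(4.0f, 1.0f, 4.0f, 1.0f);
    __m128 mask = _mm_cmplt_ps(tNew, _mm_set1_ps(2.0f));
    int out[4];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out),
                     update_object_ids(_mm_set1_epi32(7), 42, mask));
    assert(out[0] == 42 && out[1] == 7 && out[2] == 42 && out[3] == 7);
}
```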

    EDIT 2: Here’s a way so that you won’t have to change your mask:

    __m128 ids = _mm_castsi128_ps(_mm_set1_epi32(pobj->getID()));
    
    objectNo = _mm_blendv_ps(objectNo,ids,mask);
    
    int o0 = _mm_extract_ps(objectNo, 0);
    prv_noHits[o0]++;
    

    Note that the _mm_castsi128_ps() intrinsic doesn’t map to any instruction. It’s just a bit-wise reinterpretation from __m128i to __m128 to get around the type system in C/C++.
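A short sketch of why the cast approach works: the integer bits pass through the __m128 unchanged and can be read back directly, whereas a numeric conversion of those same bits gives a different answer. The function names are made up for illustration, and _mm_cvtsi128_si32 is used here as an SSE2 way of reading lane 0.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2

// Hypothetical helper: carry an integer ID inside a __m128 and read lane 0 back.
int roundtrip_lane0(int id) {
    __m128 asFloat = _mm_castsi128_ps(_mm_set1_epi32(id));  // reinterpret, no conversion
    return _mm_cvtsi128_si32(_mm_castps_si128(asFloat));    // same bits come back
}

// Hypothetical helper: what happens if the carried bits are converted numerically.
int converted_lane0(int id) {
    __m128 asFloat = _mm_castsi128_ps(_mm_set1_epi32(id));
    return _mm_cvtsi128_si32(_mm_cvtps_epi32(asFloat));
}

void cast_example() {
    assert(roundtrip_lane0(1) == 1);
    assert(roundtrip_lane0(123456789) == 123456789);
    // The bit pattern 0x00000001 is a tiny denormal float, which converts to 0.
    assert(converted_lane0(1) == 0);
}
```

In other words, once the IDs are stored with the cast, they should be read back with a cast or extract, not with _mm_cvtps_epi32.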
