
The Archive Base


Editorial Team
Asked: June 4, 2026

I have the following piece of code which I’d like to optimize using Cython:


sim = numpy.dot(v1, v2) / (sqrt(numpy.dot(v1, v1)) * sqrt(numpy.dot(v2, v2))) 
dist = 1-sim
return dist

I have written and compiled the .pyx file, but when I run the code I do not see any significant improvement in performance. According to the Cython documentation I have to add C types. The HTML file generated by Cython indicates that the bottleneck is the dot products (which is expected, of course). Does this mean that I have to define a C function for the dot products? If so, how do I do that?

EDIT:

After some research I have come up with the following code. The improvement is only marginal, and I am not sure whether there is anything else I can do to improve it:

from __future__ import division
import numpy as np
import math as m
cimport numpy as np
cimport cython

cdef extern from "math.h":
    double c_sqrt "sqrt"(double)

ctypedef np.float64_t reals  # typedef for easier reading

cdef inline double dot(np.ndarray[reals,ndim = 1] v1, np.ndarray[reals,ndim = 1] v2):
  cdef double result = 0
  cdef int i = 0
  cdef int length = v1.size
  cdef double el1 = 0
  cdef double el2 = 0
  for i in range(length):
    el1 = v1[i]
    el2 = v2[i]
    result += el1*el2
  return result

@cython.cdivision(True)
def distance(np.ndarray[reals,ndim = 1] ex1, np.ndarray[reals,ndim = 1] ex2):
  cdef double dot12 = dot(ex1, ex2)
  cdef double dot11 = dot(ex1, ex1)
  cdef double dot22 = dot(ex2, ex2)
  cdef double sim = dot12 / (c_sqrt(dot11 * dot22))
  cdef double dist = 1-sim    
  return dist 

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 11:36 pm

    As a general note, if you are calling NumPy functions from within Cython and doing little else, you will see only marginal gains, if any at all. The large speed-ups come from statically typing code that uses an explicit for loop at the Python level, not from code that is already calling into the NumPy C-API.
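To make the distinction concrete, here is a minimal plain-Python sketch of the kind of explicit loop that static typing can speed up. The function name `cosine_distance` and the single-pass layout are illustrative assumptions, not code from the question; it computes the same quantity as the original one-liner, accumulating all three dot products in one loop.

```python
import math


def cosine_distance(v1, v2):
    """Cosine distance computed in one explicit pass over both vectors.

    Hypothetical helper for illustration: every iteration here runs in the
    Python interpreter, which is the overhead static typing can remove.
    """
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    dot12 = dot11 = dot22 = 0.0
    for a, b in zip(v1, v2):
        dot12 += a * b  # contribution to v1 . v2
        dot11 += a * a  # contribution to v1 . v1
        dot22 += b * b  # contribution to v2 . v2
    return 1.0 - dot12 / math.sqrt(dot11 * dot22)


# Orthogonal vectors have zero similarity, so the distance is 1.
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

By contrast, each `numpy.dot` call in the original one-liner already runs its loop in compiled code, so wrapping those calls in a .pyx file leaves little for Cython to remove.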

    You could try writing out the dot product by hand with static typing for the loop counter, the input NumPy arrays, and so on, with the wraparound and boundscheck directives set to False. Import the C library version of the sqrt function, and then try the parallel for loop (prange) to make use of OpenMP.
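A rough Cython sketch of those suggestions might look like the following. It is an illustration under stated assumptions rather than a tuned implementation: it uses typed memoryviews (`double[::1]`) instead of the `np.ndarray[...]` buffer syntax from the question, so it expects C-contiguous float64 arrays, and it fuses the three dot products into one loop. The OpenMP flag mentioned in the comments assumes GCC.

```cython
# cython: language_level=3
# Sketch only. Build with OpenMP enabled (for GCC, pass -fopenmp in both
# extra_compile_args and extra_link_args); without it the loop runs serially.
cimport cython
from cython.parallel import prange
from libc.math cimport sqrt  # C library sqrt, no Python call overhead


@cython.boundscheck(False)  # skip index bounds checks
@cython.wraparound(False)   # disallow negative indexing
@cython.cdivision(True)     # C division semantics: no ZeroDivisionError check
def distance(double[::1] v1, double[::1] v2):
    cdef Py_ssize_t i, n = v1.shape[0]
    cdef double dot12 = 0.0, dot11 = 0.0, dot22 = 0.0
    # In-place += inside prange makes Cython treat these as reductions.
    for i in prange(n, nogil=True):
        dot12 += v1[i] * v2[i]
        dot11 += v1[i] * v1[i]
        dot22 += v2[i] * v2[i]
    return 1.0 - dot12 / sqrt(dot11 * dot22)
```

For short vectors the cost of starting threads can outweigh the gain, so it is worth timing a plain `range` version of the same loop as well. With bounds checking off, passing vectors of different lengths is undefined behaviour, so the caller has to guarantee equal lengths.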
