in an effort to solve question #3367795 here on SO i have to cope

Question


Asked: May 16, 2026 (2026-05-16T01:50:46+00:00)


In an effort to solve question #3367795 here on SO, I have to cope with a number of subproblems. One of these is: in said algorithm (Levenshtein distance), several arrays are allocated in memory and initialized with the lines

cdef char   *m1     = <char *>calloc(   blen + 2,    sizeof( char ) )
cdef char   *m2     = <char *>calloc(   blen + 2,    sizeof( char ) )
cdef char   *m3     = <char *>malloc( ( blen + 2 ) * sizeof( char ) )
#.........................................................................
for i from 0 <= i <= blen:
  m2[ i ] = i
  <...snip...>

Here, blen refers to the length of a Python bytes variable. As far as I understand the algorithm (see my original post for the full code), and as the initialization of m2 clearly shows, these arrays are meant to hold integer numbers, not characters, so one would think the correct allocations should look like

cdef int    *m3     = <int *>malloc( ( blen + 2 ) * sizeof( int ) )

and so on. Can anyone with a background in C explain to me why char is used? Also, maybe more for people inclined to Cython: why is there a cast <char *>? One would think that char *x = malloc( ... ) should suffice to define x.


1 Answer

Editorial Team · Answer 1 · 2026-05-16T01:50:47+00:00

Quite simply, to save memory — but please note carefully that declaring these arrays as char limits the result distance to either 127 or 255, depending on whether the C compiler defaults to signed char or unsigned char respectively. In C, char is an integer type — you don’t need an ord() to get its integer value.

Your original code contains no mention of this limitation. Note that if a char overflows, it does so silently and the code will produce incorrect results: 127 + 1 -> -128 (signed, on the usual two's-complement platforms); 255 + 1 -> 0 (unsigned).
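To make the limit concrete, here is a minimal C sketch of the wraparound and of one way to guard against it. The function names (bump_unsigned, to_cell, fits_in_char_row) are hypothetical and do not come from the original code. The guard relies on the fact that a Levenshtein distance never exceeds the length of the longer string.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Add 1 to an unsigned char cell. The addition itself is done in int;
   converting the result back to unsigned char wraps modulo UCHAR_MAX + 1,
   so with 8-bit chars 255 + 1 is stored as 0 without any warning. */
unsigned char bump_unsigned(unsigned char cell) {
    return (unsigned char)(cell + 1);
}

/* Hypothetical helper: store a distance in a cell, aborting rather than
   wrapping silently when the value does not fit. */
unsigned char to_cell(int distance) {
    assert(distance >= 0 && distance <= UCHAR_MAX);
    return (unsigned char)distance;
}

/* Hypothetical guard: returns 1 if every distance that can occur for
   strings of these lengths fits in a plain char (signed or unsigned,
   whichever the compiler uses), and 0 otherwise. */
int fits_in_char_row(size_t alen, size_t blen) {
    size_t longest = alen > blen ? alen : blen;
    return longest <= (size_t)CHAR_MAX;
}
```

Calling something like fits_in_char_row() before allocating the rows would let the code fall back to int arrays for long inputs instead of returning a wrong distance.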

You didn’t respond to my comment on your original question: “What are the (a) maximum (b) average sizes of your strings? Do you really need to do the whole O(M*N) thing if the two strings are nothing like each other?” Please answer that now (edit your question); had you done so then, you would have had this question answered then.

Update: Reading the original post again, I’ve noticed a problem: The code that reads

m1, m2 = m2, m1
strcpy( m3, m2 )

is WRONG on three grounds: (1) it doesn’t shuffle the rows properly (the strcpy() should be done before swapping m1 and m2); (2) strcpy() will not copy anything beyond the first null (zero byte); (3) there is no need to copy anything, just shuffle the pointers:

m3, m2, m1 = m2, m1, m3
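Points (2) and (3) can be checked in plain C. The sketch below is illustrative only: the helper names and the 4-cell row are assumptions, not part of the original code. The row mirrors the m2[ i ] = i initialization, so its very first cell is a zero byte.

```c
#include <assert.h>
#include <string.h>

enum { ROW_LEN = 4 };  /* assumed row size, for illustration only */

/* Count how many cells of dst equal the corresponding cells of src. */
static int matching_cells(const char *dst, const char *src) {
    int n = 0;
    for (int i = 0; i < ROW_LEN; i++)
        if (dst[i] == src[i]) n++;
    return n;
}

/* strcpy() stops at the first zero byte. Here that is src[0], so only
   the terminator itself is copied and the rest of dst is left alone. */
int cells_matching_after_strcpy(void) {
    char src[ROW_LEN] = {0, 1, 2, 3};  /* same pattern as m2[i] = i */
    char dst[ROW_LEN] = {9, 9, 9, 9};
    strcpy(dst, src);
    return matching_cells(dst, src);
}

/* memcpy() copies exactly the requested number of bytes, zeros included. */
int cells_matching_after_memcpy(void) {
    char src[ROW_LEN] = {0, 1, 2, 3};
    char dst[ROW_LEN] = {9, 9, 9, 9};
    memcpy(dst, src, sizeof src);
    return matching_cells(dst, src);
}

/* C equivalent of the tuple assignment m3, m2, m1 = m2, m1, m3:
   only the three pointers move, no cell is copied. */
void rotate_rows(char **m1, char **m2, char **m3) {
    assert(m1 && m2 && m3);
    char *old_m3 = *m3;
    *m3 = *m2;
    *m2 = *m1;
    *m1 = old_m3;
}
```

If a copy were really needed, memcpy() with an explicit byte count would be the call to use. The rotation avoids the copy altogether and costs the same regardless of row length.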
