Editorial Team
Asked: May 17, 2026

This website offers the “Schinke Latin stemming algorithm” for download so that it can be used in the Snowball stemming system.

I want to use this algorithm, but I don’t want to use Snowball.

The good thing: There’s some pseudocode on that page which you could translate to a PHP function. This is what I’ve tried:

<?php
function stemLatin($word) {
    // output = array(NOUN-BASED STEM, VERB-BASED STEM)
    // DEFINE CLASSES BEGIN
    $queWords = array('atque', 'quoque', 'neque', 'itaque', 'absque', 'apsque', 'abusque', 'adaeque', 'adusque', 'denique', 'deque', 'susque', 'oblique', 'peraeque', 'plenisque', 'quandoque', 'quisque', 'quaeque', 'cuiusque', 'cuique', 'quemque', 'quamque', 'quaque', 'quique', 'quorumque', 'quarumque', 'quibusque', 'quosque', 'quasque', 'quotusquisque', 'quousque', 'ubique', 'undique', 'usque', 'uterque', 'utique', 'utroque', 'utribique', 'torque', 'coque', 'concoque', 'contorque', 'detorque', 'decoque', 'excoque', 'extorque', 'obtorque', 'optorque', 'retorque', 'recoque', 'attorque', 'incoque', 'intorque', 'praetorque');
    $suffixesA = array('ibus, 'ius, 'ae, 'am, 'as, 'em', 'es', ia', 'is', 'nt', 'os', 'ud', 'um', 'us', 'a', 'e', 'i', 'o', 'u');
    $suffixesB = array('iuntur', 'beris', 'erunt', 'untur', 'iunt', 'mini', 'ntur', 'stis', 'bor', 'ero', 'mur', 'mus', 'ris', 'sti', 'tis', 'tur', 'unt', 'bo', 'ns', 'nt', 'ri', 'm', 'r', 's', 't');
    // DEFINE CLASSES END
    $word = strtolower(trim($word)); // make string lowercase + remove white spaces before and behind
    $word = str_replace('j', 'i', $word); // replace all <j> by <i>
    $word = str_replace('v', 'u', $word); // replace all <v> by <u>
    if (substr($word, -3) == 'que') { // if word ends with -que
        if (in_array($word, $queWords)) { // if word is a queWord
            return array($word, $word); // output queWord as both noun-based and verb-based stem
        }
        else {
            $word = substr($word, 0, -3); // remove the -que
        }
    }
    foreach ($suffixesA as $suffixA) { // remove suffixes for noun-based forms (list A)
        if (substr($word, -strlen($suffixA)) == $suffixA) { // if the word ends with that suffix
            $word = substr($word, 0, -strlen($suffixA)); // remove the suffix
            break; // remove only one suffix
        }
    }
    if (strlen($word) >= 2) { $nounBased = $word; } else { $nounBased = ''; } // add only if word contains two or more characters
    foreach ($suffixesB as $suffixB) { // remove suffixes for verb-based forms (list B)
        if (substr($word, -strlen($suffixA)) == $suffixA) { // if the word ends with that suffix
            switch ($suffixB) {
                case 'iuntur', 'erunt', 'untur', 'iunt', 'unt': $word = substr($word, 0, -strlen($suffixB)).'i'; break; // replace suffix by <i>
                case 'beris', 'bor', 'bo': $word = substr($word, 0, -strlen($suffixB)).'bi'; break; // replace suffix by <bi>
                case 'ero': $word = substr($word, 0, -strlen($suffixB)).'eri'; break; // replace suffix by <eri>
                default: $word = substr($word, 0, -strlen($suffixB)); break; // remove the suffix
            }
            break; // remove only one suffix
        }
    }
    if (strlen($word) >= 2) { $verbBased = $word; } else { $verbBased = ''; } // add only if word contains two or more characters
    return array($nounBased, $verbBased);
}
?>

My questions:

1) Will this code work correctly? Does it follow the algorithm’s rules?

2) How could you improve the code (performance)?

Thank you very much in advance!

1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 7:37 pm

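    No, your function will not work: it contains syntax errors. For example, several quotes in $suffixesA are unclosed, and the switch uses invalid syntax (PHP does not allow a comma-separated list of values after case). There is also a logic slip: the verb loop tests $suffixA where it should test $suffixB.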
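
    As a minimal sketch of the switch point (replacementFor() is a hypothetical helper name used only for illustration, not part of the posted code): PHP has no comma-separated case lists, so values that share a branch are written as stacked case labels.

    ```php
    <?php
    // Hypothetical helper (not from the original post): returns the string that
    // replaces a verb suffix, using stacked case labels instead of "case 'a', 'b':".
    function replacementFor($suffix) {
        switch ($suffix) {
            case 'iuntur':
            case 'erunt':
            case 'untur':
            case 'iunt':
            case 'unt':
                return 'i';
            case 'beris':
            case 'bor':
            case 'bo':
                return 'bi';
            case 'ero':
                return 'eri';
            default:
                return ''; // every other suffix is simply removed
        }
    }

    echo replacementFor('bor'), "\n"; // prints "bi"
    ```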

    Here is my rewrite of the function. As the pseudocode on that page isn’t very precise, I had to interpret it in places. I chose the interpretation under which the examples mentioned in the article work.

    I also made some optimizations. First, the word and suffix arrays are declared static, so they are initialized once and shared by all calls to the function, which should be good for performance 😉
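
    A small sketch of what static does here, assuming a made-up countCalls() function that is not part of the stemmer: the static array is initialized on the first call only, and later calls see the same array.

    ```php
    <?php
    // Hypothetical demo function: the static array is created on the first call
    // and the same array is reused (and remembered) on later calls.
    function countCalls() {
        static $state = array('calls' => 0);
        $state['calls']++;
        return $state['calls'];
    }

    countCalls();
    countCalls();
    echo countCalls(), "\n"; // prints 3: the counter survived the earlier calls
    ```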

    I also restructured the arrays so they can be used more effectively. $queWords now has the words as keys, so membership can be checked with a fast hash-table lookup (isset) instead of a linear in_array search. The suffix arrays store each suffix’s length as the value, so the lengths don’t have to be recomputed with strlen() on every loop iteration. I may have made a few more minor optimizations.
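
    To illustrate the lookup change, here is a rough sketch with a shortened word list ($asList and $asKeys are names invented for this example, and array_fill_keys() is just one way to build the keyed array):

    ```php
    <?php
    // Illustrative subset of the -que word list (the full list is in the function).
    $asList = array('atque', 'quoque', 'neque');

    // array_fill_keys() turns the values into keys: array('atque' => 1, ...)
    $asKeys = array_fill_keys($asList, 1);

    // in_array() compares the needle against the elements one by one.
    var_dump(in_array('neque', $asList));   // bool(true)

    // isset() on a key is a single hash-table lookup.
    var_dump(isset($asKeys['neque']));      // bool(true)
    var_dump(isset($asKeys['populusque'])); // bool(false)
    ```

    For a list of about fifty words the difference per call is small, so treat this as a habit that pays off mainly when the function is called many times.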

    I haven’t measured how much faster this code is, but it should be noticeably faster. It also now gives the expected results for the examples provided.

    Here is the code:

    <?php
        function stemLatin($word) {
            static $queWords = array(
                'atque'         => 1,
                'quoque'        => 1,
                'neque'         => 1,
                'itaque'        => 1,
                'absque'        => 1,
                'apsque'        => 1,
                'abusque'       => 1,
                'adaeque'       => 1,
                'adusque'       => 1,
                'denique'       => 1,
                'deque'         => 1,
                'susque'        => 1,
                'oblique'       => 1,
                'peraeque'      => 1,
                'plenisque'     => 1,
                'quandoque'     => 1,
                'quisque'       => 1,
                'quaeque'       => 1,
                'cuiusque'      => 1,
                'cuique'        => 1,
                'quemque'       => 1,
                'quamque'       => 1,
                'quaque'        => 1,
                'quique'        => 1,
                'quorumque'     => 1,
                'quarumque'     => 1,
                'quibusque'     => 1,
                'quosque'       => 1,
                'quasque'       => 1,
                'quotusquisque' => 1,
                'quousque'      => 1,
                'ubique'        => 1,
                'undique'       => 1,
                'usque'         => 1,
                'uterque'       => 1,
                'utique'        => 1,
                'utroque'       => 1,
                'utribique'     => 1,
                'torque'        => 1,
                'coque'         => 1,
                'concoque'      => 1,
                'contorque'     => 1,
                'detorque'      => 1,
                'decoque'       => 1,
                'excoque'       => 1,
                'extorque'      => 1,
                'obtorque'      => 1,
                'optorque'      => 1,
                'retorque'      => 1,
                'recoque'       => 1,
                'attorque'      => 1,
                'incoque'       => 1,
                'intorque'      => 1,
                'praetorque'    => 1,
            );
            static $suffixesNoun = array(
                'ibus' => 4,
                'ius'  => 3,
                'ae'   => 2,
                'am'   => 2,
                'as'   => 2,
                'em'   => 2,
                'es'   => 2,
                'ia'   => 2,
                'is'   => 2,
                'nt'   => 2,
                'os'   => 2,
                'ud'   => 2,
                'um'   => 2,
                'us'   => 2,
                'a'    => 1,
                'e'    => 1,
                'i'    => 1,
                'o'    => 1,
                'u'    => 1,
            );
            static $suffixesVerb = array(
                'iuntur' => 6,
                'beris'  => 5,
                'erunt'  => 5,
                'untur'  => 5,
                'iunt'   => 4,
                'mini'   => 4,
                'ntur'   => 4,
                'stis'   => 4,
                'bor'    => 3,
                'ero'    => 3,
                'mur'    => 3,
                'mus'    => 3,
                'ris'    => 3,
                'sti'    => 3,
                'tis'    => 3,
                'tur'    => 3,
                'unt'    => 3,
                'bo'     => 2,
                'ns'     => 2,
                'nt'     => 2,
                'ri'     => 2,
                'm'      => 1,
                'r'      => 1,
                's'      => 1,
                't'      => 1,
            );
    
            $word = strtr(strtolower(trim($word)), 'jv', 'iu'); // trim, lowercase and j => i, v => u
    
            if (substr($word, -3) == 'que') {
                if (isset($queWords[$word])) {
                    return array($word, $word);
                }
                $word = substr($word, 0, -3);
            }
    
            // default: both stems are the normalized word (after -que removal) if no suffix applies
            $stems = array($word, $word);
    
            foreach ($suffixesNoun as $suffix => $length) {
                if (substr($word, -$length) == $suffix) {
                    $tmp = substr($word, 0, -$length);
    
                    if (isset($tmp[1]))
                        $stems[0] = $tmp;
                    break;
                }
            }
    
            foreach ($suffixesVerb as $suffix => $length) {
                if (substr($word, -$length) == $suffix) {
                    switch ($suffix) {
                        case 'iuntur':
                        case 'erunt':
                        case 'untur':
                        case 'iunt':
                        case 'unt':
                            $tmp = substr_replace($word, 'i', -$length, $length);
                        break;
                        case 'beris':
                        case 'bor':
                        case 'bo':
                            $tmp = substr_replace($word, 'bi', -$length, $length);
                        break;
                        case 'ero':
                            $tmp = substr_replace($word, 'eri', -$length, $length);
                        break;
                        default:
                            $tmp = substr($word, 0, -$length);
                    }
    
                    if (isset($tmp[1]))
                        $stems[1] = $tmp;
                    break;
                }
            }
    
            return $stems;
        }
    
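        var_dump(stemLatin('aquila')); // noun stem 'aquil', verb stem 'aquila'
        var_dump(stemLatin('portat')); // noun stem 'portat', verb stem 'porta'
        var_dump(stemLatin('portis')); // noun stem 'port', verb stem 'por'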
    