Editorial Team
Asked: May 26, 2026

I asked before for a simple solution to my problem (using sphinx search service)

I asked before for a simple solution to my problem (using the Sphinx search service), but I got nowhere.

Someone kindly provided me with this code:

<?php
/**
 * $Project: GeoGraph $
 * $Id$
 * 
 * GeoGraph geographic photo archive project
 * This file copyright (C) 2005  Barry Hunter (geo@barryhunter.co.uk)
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 * 
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 * 
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
 */



/**
* Provides the methods for updating the wordnet tables
*
* @package Geograph
* @author Barry Hunter <geo@barryhunter.co.uk>
* @version $Revision$
*/

function addTwoLetterPhrase($phrase) {
    global $w2;
    $w2[$phrase] = (isset($w2[$phrase]))?($w2[$phrase]+1):1; 
}

function addThreeLetterPhrase($phrase) {
    global $w3;
    $w3[$phrase] = (isset($w3[$phrase]))?($w3[$phrase]+1):1; 
}

function updateWordnet(&$db,$text,$field,$id) {
    global $w1,$w2,$w3;

    $alltext = strtolower(preg_replace('/\W+/',' ',str_replace("'",'',$text)));


    if (strlen($text)< 1)
        return;


    $words = preg_split('/ /',$alltext);

    $w1 = array();
    $w2 = array();
    $w3 = array();

    //build a list of one word phrases
    foreach ($words as $word) {
        $w1[$word] = (isset($w1[$word]))?($w1[$word]+1):1; 
    }

    //build a list of two word phrases
        $text = $alltext;
    $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text); 
        $text = $alltext;
        $text = preg_replace('/(\w+)/','',$text,1);
    $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);

    //build a list of three word phrases
        $text = $alltext;
    $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);  
        $text = $alltext;
        $text = preg_replace('/(\w+)/','',$text,1);
    $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);  
        $text = $alltext;
        $text = preg_replace('/(\w+) (\w+)/','',$text,1);
    $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);



    foreach ($w1 as $word=>$count) {
        $db->Execute("insert into wordnet1 set gid = $id,words = '$word',$field = $count");// ON DUPLICATE KEY UPDATE $field=$field+$count");
    }
    foreach ($w2 as $word=>$count) {
        $db->Execute("insert into wordnet2 set gid = $id,words = '$word',$field = $count");
    }   
    foreach ($w3 as $word=>$count) {
        $db->Execute("insert into wordnet3 set gid = $id,words = '$word',$field = $count");
    }   
}


?>

It works fine and does almost exactly what I need, except that it is not UTF-8 friendly: it splits whole words into parts at special characters where it shouldn't.

My guess is that I should use multibyte functions instead of the regular preg_replace.

I tried replacing preg_replace with mb_ereg_replace, but it does not work as it should, at least not for the two- and three-word phrases.

Any ideas?

1 Answer

    Editorial Team
    Added an answer on May 26, 2026 at 6:32 pm

    PCRE can handle UTF-8. You just need to add the /u modifier to each regex.

    http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
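To see what the /u modifier changes, here is a minimal sketch. The sample string is made up for illustration, and the script file itself is assumed to be saved as UTF-8. Without /u the subject is matched byte by byte, so what \w accepts for the bytes of an accented letter depends on the locale; with /u the subject is decoded as UTF-8 first.

```php
<?php
// Hypothetical sample text; any word with non-ASCII letters shows the effect.
$text = 'crème brûlée';

// Byte mode: each accented letter is two bytes, and in the default "C" locale
// neither byte counts as a word character, so the words get cut apart.
preg_match_all('/\w+/', $text, $bytes);

// UTF-8 mode: the subject is decoded as UTF-8 and \w matches Unicode letters.
preg_match_all('/\w+/u', $text, $chars);

print_r($bytes[0]);   // locale-dependent; typically fragments such as "cr", "me"
print_r($chars[0]);   // the two whole words: "crème", "brûlée"
```

Note that the /\W+/ pattern used to normalise the text at the top of updateWordnet() would need the same modifier, since that is where the words are first split.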

    (You could also use \pL+ in place of \w+, but the flag is sufficient in recent PCRE versions.)
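One more thing to be aware of: the /e modifier used in the question's code was deprecated in PHP 5.5 and removed in PHP 7, so on current PHP those preg_replace calls will not run even with /u added. The following is a rough sketch of one way to build the same phrase counts without /e. It is not the original author's method: it swaps the repeated regex passes for a sliding window over the word list. The function name countPhrases and the sample text are made up for illustration, and it assumes PHP 7 or later with the mbstring extension loaded.

```php
<?php
/**
 * Count every run of $n consecutive words in $text.
 * Hypothetical helper, not part of the original GeoGraph code.
 */
function countPhrases(string $text, int $n): array
{
    // Same normalisation idea as the original: drop apostrophes and lower-case.
    // mb_strtolower() is used because strtolower() is not multibyte-aware.
    $clean = mb_strtolower(str_replace("'", '', $text), 'UTF-8');

    // Split on runs of non-word characters. With /u, \W is Unicode-aware,
    // so accented letters stay inside their words.
    $words = preg_split('/\W+/u', $clean, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return []; // preg_split() fails on invalid UTF-8 input
    }

    $counts = [];
    // Slide a window of $n words across the list, one word at a time.
    for ($i = 0; $i + $n <= count($words); $i++) {
        $phrase = implode(' ', array_slice($words, $i, $n));
        $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
    }
    return $counts;
}

$sample = "Crème brûlée, crème fraîche";
print_r(countPhrases($sample, 1)); // single words
print_r(countPhrases($sample, 2)); // two-word phrases
print_r(countPhrases($sample, 3)); // three-word phrases
```

The three arrays would take the place of $w1, $w2 and $w3 in updateWordnet(). Unlike the original preg_split('/ /', ...) call, this version drops empty tokens, so leading or trailing punctuation does not produce an empty "word".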
