As phrased in the question, I'm looking for a free and/or open-source text-segmentation algorithm for Chinese. I understand that it is a very difficult task, as there are many ambiguities involved. I know there's Google's API, but it is rather a black box, i.e. not much information about what it is doing is exposed.
The keyword for "text segmentation for Chinese" should be 中文分词 (Chinese word segmentation) in Chinese.
Good and active open-source text-segmentation algorithms:
C#, Snapshot
Java
C/C++, Java, C#, Demo
C, PHP, PostgreSQL
ICTCLAS, Demo
Java
Java, Demo
Python, Java, Demo
Python
Other
Sample
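Google Chrome (Chromium):
src, cc_cedict.txt (73,145 Chinese words/phrases)
In a text field or textarea of Google Chrome with Chinese sentences, press Ctrl+← or Ctrl+→
Double-click on 中文分词指的是将一个汉字序列切分成一个一个单独的词 ("Chinese word segmentation means splitting a sequence of Chinese characters into individual words")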
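
To give a feel for how a word list such as cc_cedict.txt can drive segmentation, here is a minimal sketch of forward maximum matching, one of the simplest dictionary-based approaches. It is not claimed to be what Chrome or any of the projects above actually do. The function name `segment`, the `max_len` parameter, and the tiny `TOY_DICT` are all made up for illustration; a real word list would have tens of thousands of entries.

```python
def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if nothing matches."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary, NOT taken from cc_cedict.txt.
TOY_DICT = {"中文", "分词", "指的是", "一个", "汉字", "序列", "切分", "单独"}

if __name__ == "__main__":
    sample = "中文分词指的是将一个汉字序列切分成一个一个单独的词"
    print(" / ".join(segment(sample, TOY_DICT)))
```

Greedy matching like this is fast but cannot resolve the ambiguities mentioned in the question; it always commits to the longest word from the left, which is why many segmenters add statistical models on top of the dictionary.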