Registered ruby-pytst on RubyForge

Posted by yatsu Mon, 04 Jun 2007 07:00:13 GMT

The source of ruby-pytst had gone missing, so I registered the project on RubyForge.

RubyForge: ruby-pytst

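This is the first time I have registered a project on RubyForge, and both the Wiki and the SVN browser return errors. I wonder why.

I use ruby-pytst every day for bulk replacement of keywords in Wiki text, and it has been running stably for more than half a year.

The README follows.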

ruby-pytst

ruby-pytst is a port of pytst to Ruby. It is an implementation of the Ternary Search Trie (TST), and it also supports scanning with the Aho-Corasick algorithm. It is built with SWIG.
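To make the data structure concrete, here is a minimal sketch of a ternary search trie in plain Ruby, together with a keyword replacement pass like the Wiki use case mentioned above. It is illustrative only: SketchTST, put, get, longest_match and replace_keywords are hypothetical names, not the Pytst::TST API, and the scan simply restarts a longest-match lookup at each position instead of using the Aho-Corasick algorithm.

```ruby
# Minimal ternary search trie (TST) sketch in plain Ruby.
# Hypothetical names throughout; this is NOT the Pytst::TST API.
class SketchTST
  # Each node holds one character, three children, and an optional value.
  Node = Struct.new(:char, :lo, :eq, :hi, :value)

  def initialize
    @root = nil
  end

  # Store value under key (a non-empty String). Returns self for chaining.
  def put(key, value)
    raise ArgumentError, "key must not be empty" if key.empty?
    @root = insert(@root, key.chars, 0, value)
    self
  end

  # Return the value stored under key, or nil if the key is absent.
  def get(key)
    return nil if key.empty?
    match = longest_match(key.chars, 0)
    match && match[0] == key ? match[1] : nil
  end

  # Longest stored key starting at chars[start]; returns [key, value] or nil.
  def longest_match(chars, start)
    node = @root
    i = start
    best = nil
    while node && i < chars.length
      c = chars[i]
      if c < node.char
        node = node.lo
      elsif c > node.char
        node = node.hi
      else
        # A node with a value marks the end of a stored key.
        best = [chars[start..i].join, node.value] unless node.value.nil?
        node = node.eq
        i += 1
      end
    end
    best
  end

  private

  def insert(node, chars, i, value)
    c = chars[i]
    node ||= Node.new(c, nil, nil, nil, nil)
    if c < node.char
      node.lo = insert(node.lo, chars, i, value)
    elsif c > node.char
      node.hi = insert(node.hi, chars, i, value)
    elsif i < chars.length - 1
      node.eq = insert(node.eq, chars, i + 1, value)
    else
      node.value = value
    end
    node
  end
end

# Replace every stored keyword in text with its value, preferring the
# longest keyword at each position (naive scan, not Aho-Corasick).
def replace_keywords(tst, text)
  chars = text.chars
  out = []
  i = 0
  while i < chars.length
    match = tst.longest_match(chars, i)
    if match
      out << match[1]
      i += match[0].length
    else
      out << chars[i]
      i += 1
    end
  end
  out.join
end

tst = SketchTST.new
tst.put("ruby", "Ruby").put("rubyforge", "RubyForge").put("swig", "SWIG")
puts replace_keywords(tst, "ruby and rubyforge use swig")
```

The naive scan re-walks the trie from the root at every position. Aho-Corasick avoids that by following failure links, which is presumably why a scanner based on it is attractive for replacing many keywords in long texts.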

This software is distributed under LGPL.

I have built it successfully in the following environments.

  • (1) Mac OS X 10.4, Ruby 1.8.4 (Fink)
  • (2) Fedora Core 5, Ruby 1.8.4

tst_wrap.cxx in this package was generated on (1) with SWIG 1.3.19.

Author: Masaki Yatsu <yatsu at yatsu.info>

Installation

Generate the Makefile.

% ruby extconf.rb

Make it.

% make

Then, install it as root.

% make install

If you want to generate the Ruby wrapper code from tst.i yourself, run the following first.

% swig -c++ -ruby -Iinclude tst.i

How to Use

There is no documentation yet, but the README.html of pytst is included in the doc directory.

There are examples in the example directory. You can run examples like this.

% ruby example/tokenize.rb

ruby-pytst supports the following character sets.

  • ASCII ($KCODE = 'NONE')
  • EUC-JP ($KCODE = 'EUC')
  • Shift_JIS ($KCODE = 'SJIS')
  • UTF-8 ($KCODE = 'UTF8')

You have to specify $KCODE before you create a Pytst::TST.
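The README does not say why the character set has to be fixed up front. A likely reason (an assumption, not something stated here) is that character boundaries depend on the encoding, so the same text occupies a different number of bytes in each one. The sketch below shows this with String#encode, which needs Ruby 1.9 or later and is therefore not the $KCODE mechanism of Ruby 1.8; char_and_byte_counts is a hypothetical helper name.

```ruby
# Sketch: the same text has a different byte length in each encoding.
# Uses String#encode (Ruby 1.9+), not the Ruby 1.8 $KCODE mechanism.
# char_and_byte_counts is a hypothetical helper for illustration.
def char_and_byte_counts(text, encoding_name)
  encoded = text.encode(encoding_name)
  [encoded.chars.length, encoded.bytesize]
end

word = "\u65e5\u672c\u8a9e" # three kanji, written with escapes to keep the file ASCII

%w[UTF-8 EUC-JP Shift_JIS].each do |name|
  chars, bytes = char_and_byte_counts(word, name)
  puts "#{name}: #{chars} characters, #{bytes} bytes"
end
```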

There is an EUC-JP example in the example directory.

% ruby -Ke example/japanese_euc.rb

The option -Ke means $KCODE = 'EUC'.

See also

pytst:
http://nicolas.lehuen.com/download/pytst/

Ternary Search Trie:
http://en.wikipedia.org/wiki/Ternarysearchtries

Aho-Corasick algorithm:
http://en.wikipedia.org/wiki/Aho-Corasick_algorithm
