Dearly Beloved (A Memo Pad with No Particular Purpose): September 2018

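Beppu Again

It was on October 2, 2016 that I broke my iPhone and went through hell.
That was about two years ago now.

Once again, I am writing a blog post in the same place.

I know what people will say: you came all the way to Beppu, so go do something instead of updating a blog. But changing my surroundings and trying something a little different from usual is plenty of fun by itself.

I am relaxing at a Starbucks with a view of the bus stop.
I really do love this spot.
There is a distinctive air about people who are about to travel somewhere far away.
Girls chatting happily while they wait for the bus, people pulling big suitcases on their way to the airport or another city: I truly love that mix of reluctance to leave and hurry.

I am oddly fond of these two towns, Oita and Beppu.
I love the beauty of Beppu Bay.
The towns are laid-back and somehow exotic; perhaps a trace of the old missionaries still lingers in the air?!
The foreign tourists you notice everywhere probably add an accent too.

I did manage to wander around a bit even so, and I will report on that separately.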

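Visiting West Izu (2)

This was my first time in Izu in autumn.
Unfortunately the autumn rain front covered much of Honshu, and it rained on and off on the first and middle days of the trip.

I come to Matsuzaki to rest, so I do not do much sightseeing.
This time, though, I went to see something I had long been curious about: the "roots of a volcano" (火山の根).
A volcano's roots are, apparently, the conduits through which magma rose to feed a submarine volcano's eruptions; the magma cooled and solidified inside them, and they were later exposed above ground.
I learned that they can be seen at Futo Beach (浮島海岸, Futo Kaigan) in Nishiizu, the town next to Matsuzaki, so we drove over.

At the end of a narrow road was a beach covered in rugged rocks.
Beyond it, the "roots of a volcano" stick out like bamboo shoots.
Hmm, so this is it.

The sea was rough, with whitecaps pounding the rocks.
It turned out to be an unexpectedly scenic spot, and we gazed at the sea for a while.

On the way back to Matsuzaki, we also stopped at the Dogashima coast, which we usually just drive past.
Sure enough, with that many cars parked there, it really is a scenic spot.
We dropped by the visitor center and quite enjoyed ourselves.

The waves were so rough that water shot up from under the sightseeing-boat pier like a fountain.

We drove past Matsuzaki-so and had lunch at Provence de Suzuki (プロヴァンスドすずき).
Soupe de poisson made with local fish.

Bread.

Pork shank confit.

Gateau au chocolat for dessert.

After a leisurely lunch, I napped in our freshly cleaned room.
After a little under two hours of sleep, I went out for a walk.
The evening sun had started to shine through.

The next morning, a beautiful blue sky stretched out beyond the window.
Suruga Bay and the sky blended into each other in a vivid gradation of blue.
I only wish days like this could go on forever.

Izu really is a wonderful place.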

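Visiting West Izu

For a while now I have had something like a sleep disorder (?).
No matter how hard I try, I can only sleep six hours.
Worse, if I stay up past midnight I cannot fall asleep at all, and end up with only two hours.
Which is why I have been out tending my vegetables from four or five in the morning.
My brain feels utterly worn out, as if my nerves were about to burn through.

I am using the long weekend to visit West Izu.
Matsuzaki, a town I can now call my spiritual home.
I look out at the sea from an oceanfront Japanese-style room.

I soaked in the hot spring, ate, and fell asleep to the sound of the waves, and I slept for eleven hours.
For the first time in ages, my head feels clear.

I soaked in the hot spring, had a meal, went to see the "roots of a volcano", ate a Provencal lunch, came back to the inn, washed the salt off in the open-air bath, and am now writing this feeling refreshed.

This is, how should I put it, a place I want to spend one week a year in.
The more tired I am, the more I want to come here, wash away all my entanglements, and recharge.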

Installing Julius on a Server and Testing It

Installing Julius

I intend to try running it in module mode eventually.

$ git clone https://github.com/julius-speech/julius

The files are copied from GitHub to the local machine.

$ cd julius
$ ./configure

After the command output scrolls by, a summary like this appears.

****************************************************************
Julius/Julian libsent library rev.4.4.2.1:

- Audio I/O
    primary mic device API   : oss (Open Sound System compatible)
    available mic device API : oss
    supported audio format   : RAW and WAV only
    NetAudio support         : no
- Language Modeling
    class N-gram support     : yes
- Libraries
    file decompression by    : zlib library
- Process management
    fork on adinnet input    : no

Note: compilation time flags are now stored in "libsent-config".
        If you link this library, please add output of
        "libsent-config --cflags" to CFLAGS and
        "libsent-config --libs" to LIBS.
****************************************************************

Right, let's move on.

$ make
$ sudo make install

Somehow it was all over in an instant.
No warnings appeared, so I suppose it was installed... maybe??
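One way to settle that doubt is to ask the shell whether the command can be found at all. The sketch below is only a generic PATH check; the helper name check_installed is made up for illustration, and it says nothing about whether the build itself is healthy.

```shell
#!/bin/sh
# check_installed is a hypothetical helper: it reports whether a command
# can be found on PATH. It does not verify that the program actually works.
check_installed() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH" >&2
        return 1
    fi
}

# After "sudo make install" the julius binary should normally be on PATH.
check_installed julius || echo "julius does not seem to be installed (or is not on PATH)"
```

If the command is reported as missing right after installation, opening a new shell is worth a try before rebuilding.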

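Installing the grammar recognition kit.

Before that, though, I am ordered to install git lfs, a tool for downloading large files from git...
I looked at various sites for reference, but found no reports of a successful package install on Ubuntu 17.
So, for now, I just tried running the command.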

$ sudo apt install git-lfs
[sudo] password for cooloctober:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
git-lfs
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.

... (rest omitted) ...

It seems to have installed without a hitch.
Much appreciated.

So, let me try bringing in the grammar recognition kit again.

$ git lfs clone https://github.com/julius-speech/grammar-kit

Success.
The files do not seem all that big, so I am not sure git lfs was really needed.
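Whether the repository actually relies on Git LFS can be checked after the fact. A minimal sketch, assuming the clone sits in ./grammar-kit and that LFS tracking, if any, is declared in the top-level .gitattributes file (the helper name uses_lfs is hypothetical):

```shell
#!/bin/sh
# uses_lfs is a hypothetical helper: it succeeds if the repository's top-level
# .gitattributes routes any path through the Git LFS filter.
# Nested .gitattributes files are not checked.
uses_lfs() {
    [ -f "$1/.gitattributes" ] && grep -q 'filter=lfs' "$1/.gitattributes"
}

if uses_lfs grammar-kit; then
    echo "grammar-kit tracks some files with Git LFS"
else
    echo "no LFS filter found in grammar-kit/.gitattributes"
fi
```

Running git lfs ls-files inside the clone should give the same answer in more detail, since it lists the files that are stored through LFS.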

Let's test it.
Julius is on a server, so I cannot use a microphone directly.
I run the test with a sample audio file instead.

$ cd grammar-kit
$ julius -C testmic.jconf -nostrip -charconv SJIS UTF-8 -input rawfile

STAT: include config: testmic.jconf
STAT: include config: hmm_ptm.jconf
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Stat: read_binhmm: binary format HMM definition
Stat: read_binhmm: this HMM does not need multipath handling
Stat: init_phmm: defined HMMs: 7946
Stat: init_phmm: loading ascii hmmlist
Stat: init_phmm: logical names: 21424 in HMMList
Stat: init_phmm: base phones:    43 used in logical
Stat: init_phmm: finished reading HMM definitions
STAT: making pseudo bi/mono-phone for IW-triphone
Stat: hmm_lookup: 10 pseudo phones are added to logical HMM list
STAT: *** AM00 _default loaded
STAT: *** loading LM00 _default
STAT: reading [SampleGrammars/fruit/fruit.dfa] and [SampleGrammars/fruit/fruit.dict]...
Stat: init_voca: read 20 words
STAT: done
STAT: Gram #0 fruit registered
STAT: Gram #0 fruit: new grammar loaded, now mash it up for recognition
STAT: Gram #0 fruit: extracting category-pair constraint for the 1st pass
STAT: Gram #0 fruit: installed
STAT: Gram #0 fruit: turn on active
STAT: grammar update completed
STAT: *** LM00 _default loaded
STAT: ------
STAT: All models are ready, go for final fusion
STAT: [1] create MFCC extraction instance(s)
STAT: *** create MFCC calculation modules from AM
STAT: AM 0 _default: create a new module MFCC01
STAT: 1 MFCC modules created
STAT: [2] create recognition processing instance(s) with AM and LM
STAT: composing recognizer instance SR00 _default (AM00 _default, LM00 _default)
STAT: Building HMM lexicon tree
STAT: lexicon size: 201+0=201
STAT: coordination check passed
STAT: multi-gram: beam width set to 200 (guess) by lexicon change
STAT: wchmm (re)build completed
STAT: SR00 _default composed
STAT: [3] initialize for acoustic HMM calculation
Stat: outprob_init: all mixture PDFs are tied-mixture, use calc_tied_mix()
Stat: addlog: generating addlog table (size = 1953 kB)
Stat: addlog: addlog table generated
STAT: [4] prepare MFCC storage(s)
STAT: All init successfully done

STAT: ###### initialize input device
----------------------- System Information begin ---------------------
JuliusLib rev.4.4.2.1 (fast)

Engine specification:
- Base setup   : fast
- Supported LM : DFA, N-gram, Word
- Extension    :
- Compiled by : gcc -O6 -fomit-frame-pointer
Library configuration: version 4.4.2.1
- Audio input
    primary A/D-in driver   : oss (Open Sound System compatible)
    available drivers       : oss
    wavefile formats        : RAW and WAV only
    max. length of an input : 320000 samples, 150 words
- Language Model
    class N-gram support    : yes
    MBR weight support      : yes
    word id unit            : short (2 bytes)
- Acoustic Model
    multi-path treatment    : autodetect
- External library
    file decompression by   : zlib library
- Process hangling
    fork on adinnet input   : no
- built-in SIMD instruction set for DNN
    SSE AVX FMA
    AVX is available maximum on this cpu, use it

------------------------------------------------------------
Configuration of Modules

Number of defined modules: AM=1, LM=1, SR=1

Acoustic Model (with input parameter spec.):
- AM00 "_default"
    hmmfilename=model/phone_m/hmmdefs_ptm_gid.binhmm
    hmmmapfilename=model/phone_m/logicalTri

Language Model:
- LM00 "_default"
    grammar #1:
        dfa = SampleGrammars/fruit/fruit.dfa
        dict = SampleGrammars/fruit/fruit.dict

Recognizer:
- SR00 "_default" (AM00, LM00)

------------------------------------------------------------
Speech Analysis Module(s)

[MFCC01] for [AM00 _default]

Acoustic analysis condition:
           parameter = MFCC_E_D_N_Z (25 dim. from 12 cepstrum + energy, abs energy supressed with CMN)
    sample frequency = 16000 Hz
       sample period = 625 (1 = 100ns)
         window size = 400 samples (25.0 ms)
         frame shift = 160 samples (10.0 ms)
        pre-emphasis = 0.97
        # filterbank = 24
       cepst. lifter = 22
          raw energy = False
    energy normalize = False
        delta window = 2 frames (20.0 ms) around
         hi freq cut = OFF
         lo freq cut = OFF
    zero mean frame = OFF
           use power = OFF
                 CVN = OFF
                VTLN = OFF

    spectral subtraction = off

cep. mean normalization = yes, with per-utterance self mean
cep. var. normalization = no

    base setup from = Julius defaults

------------------------------------------------------------
Acoustic Model(s)

[AM00 "_default"]

HMM Info:
    7946 models, 3131 states, 3131 mpdfs, 8256 Gaussians are defined
          model type = has tied-mixture, context dependency handling ON
      training parameter = MFCC_E_N_D_Z
       vector length = 25
    number of stream = 1
         stream info = [0-24]
    cov. matrix type = DIAGC
       duration type = NULLD
        codebook num = 129
       max codebook size = 64
    max mixture size = 64 Gaussians
     max length of model = 5 states
     logical base phones = 43
       model skip trans. = not exist, no multi-path handling

AM Parameters:
        Gaussian pruning = beam (-gprune)
top N mixtures to calc = 2 / 64 (-tmix)
    short pause HMM name = "sp" specified, "sp" applied (physical) (-sp)
cross-word CD on pass1 = handle by approx. (use average prob. of same LC)

------------------------------------------------------------
Language Model(s)

[LM00 "_default"] type=grammar

DFA grammar info:
      8 nodes, 10 arcs, 9 terminal(category) symbols
      category-pair matrix: 60 bytes (1024 bytes allocated)

Vocabulary Info:
        vocabulary size = 20 words, 67 models
        average word len = 3.3 models, 10.1 states
       maximum state num = 21 nodes per word
       transparent words = not exist
       words under class = not exist

Parameters:
   found sp category IDs =

------------------------------------------------------------
Recognizer(s)

[SR00 "_default"] AM00 "_default" + LM00 "_default"

Lexicon tree:
    total node num =    201
    root node num =     20
    leaf node num =     20

    (-penalty1) IW penalty1 = +0.0
    (-penalty2) IW penalty2 = +0.0
    (-cmalpha)CM alpha coef = 0.050000

Search parameters:
        multi-path handling = no
    (-b) trellis beam width = 200 (-1 or not specified - guessed)
    (-bs)score pruning thres= disabled
    (-n)search candidate num= 1
    (-s) search stack size = 500
    (-m)    search overflow = after 2000 hypothesis poped
            2nd pass method = searching sentence, generating N-best
    (-b2) pass2 beam width = 30
    (-lookuprange)lookup range= 5 (tm-5 <= t <tm+5)
    (-sb)2nd scan beamthres = 80.0 (in logscore)
    (-n)        search till = 1 candidates found
    (-output)    and output = 1 candidates out of above
    IWCD handling:
       1st pass: approximation (use average prob. of same LC)
       2nd pass: loose (apply when hypo. is popped and scanned)
    all possible words will be expanded in 2nd pass
    build_wchmm2() used
    lcdset limited by word-pair constraint
    progressive output on 1st pass
    short pause segmentation = off
            progout interval = 300 msec
    fall back on search fail = off, returns search failure

------------------------------------------------------------
Decoding algorithm:

    1st pass input processing = buffered, batch
    1st pass method = 1-best approx. generating indexed trellis
    output word confidence measure based on search-time scores

------------------------------------------------------------
FrontEnd:

Input stream:
                 input type = waveform
               input source = waveform file
              input filelist = (none, get file name from stdin)
              sampling freq. = 16000 Hz required
             threaded A/D-in = supported, off
       zero frames stripping = off
             silence cutting = off
        long-term DC removal = off
        level scaling factor = 1.00 (disabled)
          reject short input = off
          reject long input = off

----------------------- System Information end -----------------------

Notice for feature extraction (01),
    *************************************************************
    * Cepstral mean normalization for batch decoding:           *
    * per-utterance mean will be computed and applied.          *
    *************************************************************

enter filename->sample.wav
Stat: adin_file: input speechfile: sample.wav
STAT: 33984 samples (2.12 sec.)
STAT: ### speech analysis (waveform -> MFCC)
pass1_best: <s> リンゴ 3 個をください </s>
sentence1: <s> リンゴ 3 個をください </s>

It seems to be recognizing the speech correctly.
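Since the startup log is very long, it can help to filter out everything except the final result. The following is a rough sketch, assuming Julius writes its log to standard output and reads the file name from standard input (as the "get file name from stdin" line in the log suggests); extract_sentences is a made-up helper name.

```shell
#!/bin/sh
# extract_sentences is a hypothetical helper: it reads Julius log text on
# stdin and prints only the text that follows "sentenceN: " on such lines.
extract_sentences() {
    sed -n 's/^sentence[0-9][0-9]*: //p'
}

# Usage sketch (run inside grammar-kit; the file name is fed through stdin):
# echo sample.wav | julius -C testmic.jconf -nostrip -charconv SJIS UTF-8 -input rawfile | extract_sentences
```

The pass1_best lines are dropped on purpose, so only the second-pass result remains.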

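(Memo) How to access my current development environment

For security reasons, ssh is the only port I keep open.
Even when I access a web app from a web browser, I do it through ssh port forwarding.

Steps:
■ Connect to the server over ssh and establish the session.
■ To use a web application, start the program from the command line beforehand.
■ To access it from a web browser, go to http://localhost:8080/hogehoge/

With that, something should show up in the web browser.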
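The steps above can be sketched as a small shell helper. The memo does not record the actual command, so everything here is an assumption: the host name user@dev-server is a placeholder, local port 8080 comes from the URL above, and remote port 80 assumes a web server listening on 80 on the server side. The -N and -L options are standard OpenSSH options.

```shell
#!/bin/sh
# Build the argument for ssh -L: local_port:localhost:remote_port
forward_spec() {
    printf '%s:localhost:%s\n' "$1" "$2"
}

# Open the tunnel without running a remote command (-N).
# Host and ports are passed in; nothing here is specific to the real server.
open_tunnel() {
    host=$1
    local_port=$2
    remote_port=$3
    ssh -N -L "$(forward_spec "$local_port" "$remote_port")" "$host"
}

# Example (hypothetical host): forward local port 8080 to port 80 on the server.
# open_tunnel user@dev-server 8080 80
```

While the tunnel is open, http://localhost:8080/ in the local browser reaches port 80 on the server through the ssh connection.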

Changing the nginx configuration

I never really understood how to write nginx.conf.
This is what mine currently looks like.

=== Sample configuration below ===

worker_processes 1;

events {
    worker_connections 1024;
}

http {
    include       mime.types;

    client_max_body_size 1024m;

    default_type application/octet-stream;

    sendfile        off;
    keepalive_timeout 65;

    index index.html index.htm;

    server {
        listen       80;
        server_name  localhost;
        proxy_read_timeout 10m;
        autoindex on;

        location = /favicon.ico {
            log_not_found off;
        }

        location ~* ^.+\.(jpg|jpeg|gif|css|png|js|ico)$ {
            root        /var/www/static/;
            access_log  off;
            expires     30d;
        }

        location / {
            proxy_pass http://localhost:53000;    # assumes tornado (the app) is listening on 53000
            break;
        }

        location /echo {
            alias /var/www/apps/echo;
            proxy_pass http://localhost:53001;    # for when tornado (the app) uses WebSocket
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            index echo.html;
        }

        location /chat {
            alias /var/www/apps/chat;
            proxy_pass http://localhost:53001;    # for when tornado (the app) uses WebSocket
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            index main.html;
        }

        location /data/ {
            proxy_http_version 1.1;
            expires off;
            proxy_request_buffering off;
            chunked_transfer_encoding on;
            root /var/www/data/;
        }
    }

}

=== End of sample ===

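The point that tripped me up was this block:
location ~* ^.+\.(jpg|jpeg|gif|css|png|js|ico)$ {
   root   /var/www/static/;
   # (rest omitted)
}

Because of this setting, the jpg and css files belonging to each application have to be saved under /var/www/static/<appropriate folder>/, or they will not be served.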
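To see which request paths get caught by that block, the pattern can be tried out with grep. This is only an approximation: nginx uses PCRE while grep -E uses POSIX extended regular expressions, though the two should agree for a pattern this simple. The dot before the extension list is escaped here; left unescaped, it would match any character. The helper name is_static_uri and the sample paths are made up for illustration.

```shell
#!/bin/sh
# is_static_uri is a hypothetical stand-in for the nginx "location ~*" test:
# a case-insensitive match on the file extension at the end of the path.
is_static_uri() {
    printf '%s\n' "$1" | grep -Eiq '^.+\.(jpg|jpeg|gif|css|png|js|ico)$'
}

for uri in /chat/style.css /chat/LOGO.PNG /chat /data/report.json; do
    if is_static_uri "$uri"; then
        echo "$uri -> served from /var/www/static$uri"
    else
        echo "$uri -> handled by another location"
    fi
done
```

Because regular-expression locations are checked ahead of plain prefix locations such as /chat, a request for /chat/style.css never reaches the application and is looked up under /var/www/static/chat/ instead.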

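The block below is for an application that uses WebSocket.

location /chat {
    alias /var/www/apps/chat;
    proxy_pass http://localhost:53001;    # for when tornado (the app) uses WebSocket
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    index main.html;
}

Entering http://hogehoge.com/chat in a web browser forwards the request to the tornado application waiting on port 53001.
What had me stumped for so long, with the application refusing to work, was that I had not written the alias setting.
Well, of course.
Even if nginx wants to forward the packets meant for port 53001, it cannot possibly work without a description of which application to forward them to... (>_<)

Incidentally, with the configuration above, requests under http://hogehoge.com/data/ are not proxied but served straight from disk under /var/www/data/.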
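One detail worth keeping in mind is that root and alias build the file path differently. The sketch below mimics the documented nginx behavior for prefix locations as I understand it: root appends the whole request URI to the directory, while alias swaps the matched location prefix for the directory. The function names and the file name sample.bin are made up for illustration.

```shell
#!/bin/sh
# root: the full request URI is appended to the root directory.
resolve_root() {
    root_dir=${1%/}          # drop one trailing slash, if any
    uri=$2
    printf '%s%s\n' "$root_dir" "$uri"
}

# alias (prefix location): the matched location prefix is replaced by the alias path.
resolve_alias() {
    location_prefix=$1
    alias_dir=$2
    uri=$3
    printf '%s%s\n' "$alias_dir" "${uri#"$location_prefix"}"
}

resolve_root  /var/www/data/ /data/sample.bin          # prints /var/www/data/data/sample.bin
resolve_alias /data/ /var/www/data/ /data/sample.bin   # prints /var/www/data/sample.bin
```

So with root /var/www/data/ inside location /data/, files are looked up one level deeper, in /var/www/data/data/; an alias would map the same request directly into /var/www/data/.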

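And So the Last August of the Heisei Era Has Ended

This summer was hot.
Not just hot: hot enough to make you think you might die if you walked around in the daytime.
Now that September is here, today's high is apparently 26°C.

Since I made the blog private, I have been posting far less often.
That is what happens when idiots keep leaving unwanted comments...

In August, my nephew came to visit Disney Resort and stayed over.
I made two round trips to my hometown over the Obon holidays.
My father was hospitalized, which caused quite a commotion.
And planting Chinese cabbage had me in an uproar every day for two weeks.

I only got to see my boyfriend once, but he seemed well.

And so, one thing after another, the last August of Heisei came to an end.
The era name changes next year, and I think this is far better than in the Showa days, when the media and self-styled "commentators" kept churning out vulgar speculation about when the Emperor would pass away.
An ending you know is coming is one you can face with more care than an ending that will arrive someday unannounced.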
