Behavior and internals of fluentd's <parse> and <inject> when time_format is not specified

2021-01-23 fluentd

Behavior

Let's try it with the fluent/fluentd Docker image.

$ vi fluent.conf
<source>
  @type tail
  path /home/fluent/test.log
  pos_file /home/fluent/test.log.pos
  tag test.log
  <parse>
    @type json
    time_key ts
    time_type string
  </parse>
</source>

<match test.log>
  @type stdout
  <inject>
    time_key ts2
    time_type string
  </inject>
</match>

$ touch test.log
$ docker run -it --rm -v $(pwd)/fluent.conf:/fluentd/etc/fluent.conf -v $(pwd)/test.log:/home/fluent/test.log fluent/fluentd:v1.12-1
# echo '{"ts": "2021-01-01T01:23:45", "data": 123}' >> test.log
2021-01-01 01:23:45.000000000 +0000 test.log: {"data":123,"ts2":"2021-01-01T01:23:45+00:00"}

# echo '{"ts": "Tue Jan 01 01:23:45 GMT 2021", "data": 123}' >> test.log
2021-01-01 01:23:45.000000000 +0000 test.log: {"data":123,"ts2":"2021-01-01T01:23:45+00:00"}

# echo '{"ts": "01:23:45", "data": 123}' >> test.log
2021-01-22 01:23:45.000000000 +0000 test.log: {"data":123,"ts2":"2021-01-22T01:23:45+00:00"}

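The ts field, which is the time_key, is parsed flexibly: an ISO 8601 style string, a "Tue Jan 01 01:23:45 GMT 2021" style string, and a bare time of day are all accepted, and in the last case the date is taken from the current day. The injected ts2 field is written in a format that has nothing to do with the format ts arrived in.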

Internals

parse

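The value of the time_key (default: time) field is used as the time of the record; if the field does not exist, the current time is used. When time_type (default: float) is string, the value is parsed by a TimeParser initialized with time_format. If time_format is not specified, EventTime.parse() is called. This function parses the string with Ruby's Time.parse() and returns a new EventTime object.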
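
The fallback path can be reproduced outside fluentd with plain Ruby. The following is a minimal sketch, not fluentd's own code: it calls Time.parse directly on the three inputs used above. As assumptions, it pins TZ to UTC to match the container and passes a fixed `now` to show where a missing date comes from. The helper name parse_without_format is hypothetical.

```ruby
require 'time'

# Assumption: the process runs in UTC, like the fluent/fluentd container above.
ENV['TZ'] = 'UTC'

# Hypothetical helper: mimics the no-time_format fallback by delegating to
# Ruby's Time.parse. `now` supplies any date parts missing from the input.
def parse_without_format(value, now = Time.now)
  Time.parse(value, now)
end

# Fixed reference time so the time-only input resolves to a known date.
now = Time.utc(2021, 1, 22, 12, 0, 0)

[
  '2021-01-01T01:23:45',
  'Tue Jan 01 01:23:45 GMT 2021',
  '01:23:45',
].each do |value|
  parsed = parse_without_format(value, now)
  puts "#{value.inspect} -> #{parsed.utc.strftime('%Y-%m-%d %H:%M:%S')} UTC"
end
```

Because Time.parse is heuristic, any upstream change in the log's timestamp layout is accepted silently as long as Ruby can guess at it; setting time_format explicitly trades that flexibility for strictness.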

This time is passed along with the tag and the record when emit() is called, but the time_key field is removed from the record itself unless keep_time_key true (default: false) is set.
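
The effect of keep_time_key on the record can be illustrated with a small Ruby sketch. This is not fluentd's implementation; split_time and its keyword arguments are hypothetical names chosen for illustration, and the 'ts' default simply mirrors the config above.

```ruby
require 'time'

# Hypothetical helper, not fluentd's API: separates the event time from the
# record, dropping the time field unless keep_time_key is true.
def split_time(record, time_key: 'ts', keep_time_key: false)
  record = record.dup # leave the caller's hash untouched
  value = keep_time_key ? record[time_key] : record.delete(time_key)
  # Fall back to the current time when the field is missing.
  time = value ? Time.parse(value) : Time.now
  [time, record]
end

input = { 'ts' => '2021-01-01T01:23:45+00:00', 'data' => 123 }

time, dropped = split_time(input)
_, kept = split_time(input, keep_time_key: true)

puts time.to_i
puts dropped.keys.inspect # time field removed from the record
puts kept.keys.inspect    # time field preserved
```

This matches the stdout lines in the experiment, where the emitted record contains data and the injected ts2 but no ts.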

inject

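Adds the value of time to the time_key (default: time) field. In v0.12 this was done with include_time_key. As with <parse>, time_type and time_format can be specified; when time_type is string, the time is formatted by a TimeFormatter initialized with time_format. If time_format is not specified, the ISO 8601 format is used, e.g. 2008-08-31T12:34:56+09:00.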
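
The difference between the default and an explicit time_format on the output side can be sketched in plain Ruby. This is an approximation using only the standard library, not fluentd's TimeFormatter; format_time is a hypothetical helper, and TZ is pinned to UTC as an assumption to match the container used above.

```ruby
require 'time'

# Assumption: local time zone is UTC, as in the experiment above.
ENV['TZ'] = 'UTC'

# Hypothetical helper: use strftime when a format is given, otherwise fall
# back to ISO 8601 in local time.
def format_time(epoch, time_format = nil)
  time = Time.at(epoch)
  time_format ? time.strftime(time_format) : time.iso8601
end

epoch = Time.utc(2021, 1, 1, 1, 23, 45).to_i

puts format_time(epoch)                      # ISO 8601 with a numeric offset
puts format_time(epoch, '%Y-%m-%d %H:%M:%S') # explicit format
```

To make ts2 look like the incoming ts, the same pattern would have to be given as time_format in both <parse> and <inject>, since neither section knows about the other's format.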