
Introduction to Logstash and Basic Operations

1. Logstash Overview

Simply put, Logstash is a pipe with real-time data-transport capability: it carries data from the input end of the pipe to the output end. Along the way, you can also insert filters tailored to your needs, and Logstash ships with many powerful filter plugins to cover a wide range of scenarios.

Logstash is commonly used as the log-collection component of a logging system; most often it serves as the log collector in an ELK (Elasticsearch, Logstash, Kibana) stack.

2. What Logstash Does

Logstash centralizes, transforms, and stores your data. It is an open-source, server-side data-processing pipeline that can ingest data from multiple sources simultaneously, transform it, and then send it to your favorite "stash".

3. Logstash Architecture

The basic pipeline flow is: input | filter | output. If the data needs no extra processing, the filter stage can be omitted.

3.1 input: ingests data of all shapes, sizes, and origins, collecting it from your servers.

Data often lives, scattered or siloed, in many systems and many formats. Logstash supports a wide variety of inputs and can capture events from numerous common sources at the same time, easily ingesting your logs, metrics, web applications, data stores, and various AWS services in a continuous streaming fashion.
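
As a sketch of "many sources at the same time", a single pipeline may declare several input plugins side by side. The file path and ports below are hypothetical placeholders, not values from this article:

input {
  file {                      # tail a local log file
    path => ["/var/log/messages"]
  }
  beats {                     # accept events from Filebeat shippers
    port => 5044
  }
  syslog {                    # listen for syslog traffic
    port => 514
  }
}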


3.2 filter

Filters apply processing to an event before it is emitted through an output. The flagship filter plugin is grok.
grok: parses arbitrary text into structured data. It is currently the tool of choice in Logstash for turning unstructured log data into structured, queryable data.
[root@node1 ~]# rpm -ql logstash | grep "patterns$"    # where grok's predefined patterns are stored
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/mcollective-patterns
[root@node1 ~]# 
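
Each patterns file is a plain-text list of named regular expressions, one definition per line (a pattern NAME followed by its regex). Assuming the paths above, a quick way to see what a predefined pattern expands to is to grep the file; the exact definitions may differ between patterns-core versions:

[root@node1 ~]# grep -E "^(IP|WORD|NUMBER) " /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns
NUMBER (?:%{BASE10NUM})
WORD \b\w+\b
IP (?:%{IPV6}|%{IPV4})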

3.3 output: writes the filtered data to the databases and stores of your choice.

3.4 Summary

inputs: required; they generate events. Common plugins: file, syslog, redis, beats (e.g. Filebeat).
filters: optional; they modify and transform events. Common plugins: grok, mutate, drop, clone, geoip.
outputs: required; they ship events elsewhere. Common plugins: elasticsearch, file, graphite, statsd.
A sketch that combines several of these plugins follows below.
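
To make the summary concrete, here is a hedged sketch of a pipeline that wires several of the plugins named above together. The beats port, Elasticsearch host, and index name are illustrative placeholders, not values from this article:

input {
  beats { port => 5044 }                      # receive events from Filebeat shippers
}
filter {
  grok {                                      # structure the raw log line
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip { source => "clientip" }              # enrich with client geolocation
  mutate { remove_field => [ "message" ] }    # drop the now-redundant raw line
}
output {
  elasticsearch {
    hosts => ["http://192.168.126.129:9200"]  # hypothetical ES node
    index => "weblogs-%{+YYYY.MM.dd}"         # daily index
  }
}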



4. Installing Logstash (from the RPM package)
Four nodes are prepared for the experiments:

IP              Node
192.168.126.128 node1
192.168.126.129 node2
192.168.126.130 node3
192.168.126.131 node4
[root@node1 ~]# rpm -ivh logstash-7.9.1.rpm
warning: logstash-7.9.1.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:logstash-1:7.9.1-1               ################################# [100%]
Using provided startup.options file: /etc/logstash/startup.options
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/pleaserun-0.0.31/lib/pleaserun/platform/base.rb:112: warning: constant ::Fixnum is deprecated
Successfully created system startup script for Logstash
[root@node1 ~]# vim /etc/profile.d/logstash.sh      # add the Logstash bin directory to the executable search path
export PATH=$PATH:/usr/share/logstash/bin
[root@node1 ~]# source /etc/profile.d/logstash.sh
[root@node1 ~]# java -version    # Logstash is written in JRuby, so it needs a Java runtime
openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)
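
The RPM also registered Logstash as a system service ("Successfully created system startup script" above). Assuming a standard systemd host, it can be managed the usual way; this is a sketch, not output captured from these nodes:

[root@node1 ~]# systemctl start logstash      # run Logstash in the background
[root@node1 ~]# systemctl enable logstash     # start it automatically at boot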

5. The Logstash Workflow

input {
    # where to read the data from
}
filter {
    # parse and structure the data, e.g. with grok patterns
}
output {
    # where to store the processed results
}

Example 1: read from standard input and write to standard output.

[root@node1 ~]# cd /etc/logstash/conf.d/    # Logstash pipeline configs live in this directory by default
[root@node1 conf.d]# ls
[root@node1 conf.d]# vim shil.conf
input {
  stdin {                   # standard input
  }
}
output {
  stdout {                  # standard output
    codec => rubydebug      # rubydebug codec: pretty-printed Ruby-hash format
  }
}
[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf --config.debug    # --config.debug checks the configuration for errors
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2020-10-13 13:11:00.251 [main] runner - --config.debug was specified, but log.level was not set to 'debug'! No config info will be logged.
[INFO ] 2020-10-13 13:11:00.261 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}
[INFO ] 2020-10-13 13:11:00.319 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[INFO ] 2020-10-13 13:11:00.340 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[WARN ] 2020-10-13 13:11:00.803 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 13:11:00.845 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"593d27c7-7f01-4bbc-a68c-e60c555d2f73", :path=>"/usr/share/logstash/data/uuid"}
[INFO ] 2020-10-13 13:11:02.670 [Converge PipelineAction::Create] reflections - Reflections took 44 ms to scan 1 urls, producing 22 keys and 45 values
[INFO ] 2020-10-13 13:11:03.627 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/shil.conf"], :thread=>"#"}
[INFO ] 2020-10-13 13:11:04.503 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.87}
[INFO ] 2020-10-13 13:11:04.567 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 13:11:04.682 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 13:11:04.935 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
^C[WARN ] 2020-10-13 13:16:23.243 [SIGINT handler] runner - SIGINT received. Shutting down.
[INFO ] 2020-10-13 13:16:24.389 [Converge PipelineAction::Stop] javapipeline - Pipeline terminated {"pipeline.id"=>"main"}
^C[FATAL] 2020-10-13 13:16:24.429 [SIGINT handler] runner - SIGINT received. Terminating immediately..
[ERROR] 2020-10-13 13:16:24.490 [LogStash::Runner] Logstash - org.jruby.exceptions.ThreadKill

The configuration validates, so we run the pipeline again without --config.debug and type a line on stdin:

[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 13:16:44.536 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", ...}
[WARN ] 2020-10-13 13:16:44.920 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 13:16:47.900 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 13:16:48.227 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
hello world                                   <- we type some input
{
          "host" => "node1",                  # host that produced the event
       "message" => "hello world",            # the message we typed
      "@version" => "1",                      # event schema version
    "@timestamp" => 2020-10-13T06:08:07.476Z  # time the event was received (UTC)
}
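
For a quick smoke test like this one, the same pipeline can also be passed inline with the -e flag instead of a config file; a minimal sketch:

[root@node1 conf.d]# logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'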

Example 2: use grok to parse log lines read from stdin and print the structured result.

2.1 We define our own grok pattern to filter the input.
Syntax:
       %{SYNTAX:SEMANTIC}
               SYNTAX: the name of a predefined pattern;
               SEMANTIC: your own identifier for the matched text.
For example, %{IP:clientip} matches an IP address and stores it in a field named clientip.
[root@node1 conf.d]# vim groksimple.conf
input {
  stdin {
  }
}
filter {
  grok {
    match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/groksimple.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 14:29:41.936 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", ...}
[WARN ] 2020-10-13 14:29:42.412 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 14:29:45.820 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2020-10-13 14:29:46.098 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
1.1.1.1 get /index.html 30 0.23               <- we type a sample log line
{
    "@timestamp" => 2020-10-13T06:30:11.973Z,
          "host" => "node1",
      "@version" => "1",
       "request" => "/index.html",            # %{URIPATHPARAM:request}
       "message" => "1.1.1.1 get /index.html 30 0.23",
      "duration" => "0.23",                   # %{NUMBER:duration}
      "clientip" => "1.1.1.1",                # %{IP:clientip}
        "method" => "get",                    # %{WORD:method}
         "bytes" => "30"                      # %{NUMBER:bytes}
}
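
Note that every grok capture is a string, including bytes and duration above. If downstream storage needs real numbers (for example, for range queries in Elasticsearch), a mutate filter can cast them. This is an illustrative addition, not part of the original pipeline:

filter {
  mutate {
    convert => {
      "bytes"    => "integer"   # "30"   -> 30
      "duration" => "float"     # "0.23" -> 0.23
    }
  }
}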

Example 3: filter logs produced by a web server and print them to standard output.

For example, Apache access logs: grok ships with a predefined pattern for exactly this log structure.
[root@node1 conf.d]# vim httpdsimple.conf
input {
   file {                                      # read events from a file
     path => ["/var/log/httpd/access_log"]     # path to the file
     type => "apachelog"                       # type label for the events
     start_position => "beginning"             # start reading from the beginning of the file
   }
}
filter {
   grok {                                      # parse into a structured event
     match => { "message" => "%{COMBINEDAPACHELOG}" }   # predefined pattern for Apache combined logs
   }
}
output {
   stdout {
     codec => rubydebug
   }
}
[root@node4 conf.d]# logstash -f /etc/logstash/conf.d/httpd.conf --path.data=/tmp    # the same pipeline, saved as httpd.conf on node4
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2020-10-13 17:57:53.743 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", ...}
[INFO ] 2020-10-13 17:57:53.803 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/tmp/queue"}
[INFO ] 2020-10-13 17:57:53.819 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/tmp/dead_letter_queue"}
[WARN ] 2020-10-13 17:57:54.315 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2020-10-13 17:57:54.352 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"99ab593e-1436-49c0-874a-e815644cb316", :path=>"/tmp/uuid"}
[INFO ] 2020-10-13 17:58:00.166 [[main]-pipeline-manager] file - No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/tmp/plugins/inputs/file/.sincedb_15940cad53dd1d99808eeaecd6f6ad3f", :path=>["/var/log/httpd/access_log"]}
[INFO ] 2020-10-13 17:58:00.198 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2020-10-13 17:58:00.316 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2020-10-13 17:58:00.356 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9603}

Now browse to 10.5.100.183 directly, and the new access_log entries appear as events:

{
    "@timestamp" => 2020-10-13T10:01:02.347Z,
       "message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET / HTTP/1.1\" 304 - \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
      "@version" => "1",
          "tags" => [
        [0] "_grokparsefailure"
    ],
          "path" => "/var/log/httpd/access_log",
          "type" => "apachelog",
          "host" => "node4"
}
{
    "@timestamp" => 2020-10-13T10:01:02.407Z,
       "message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET /favicon.ico HTTP/1.1\" 404 209 \"http://10.5.100.183/\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
      "@version" => "1",
          "tags" => [
        [0] "_grokparsefailure"
    ],
          "path" => "/var/log/httpd/access_log",
          "type" => "apachelog",
          "host" => "node4"
}
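
Both events carry the "_grokparsefailure" tag: these access_log lines did not fully match %{COMBINEDAPACHELOG} (note the empty client-address field at the start of each message), so grok left the event unparsed and only tagged it. A common way to keep such events out of downstream storage is a conditional on that tag; this is a sketch, not part of the original article:

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  if "_grokparsefailure" in [tags] {
    drop { }        # discard events grok could not parse
  }
}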

We have now walked through the complete Logstash workflow, from input through filter to output. The next article shows how to implement:
read httpd logs -> store in Redis -> read from Redis -> store in Elasticsearch.
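
As a preview of the shipper side of that pipeline, the file input from Example 3 would be paired with a redis output; the host and key below are hypothetical placeholders:

output {
  redis {
    host => "192.168.126.129"   # hypothetical Redis node
    data_type => "list"         # push events onto a Redis list
    key => "httpd-logs"         # list key the indexer will read from
  }
}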
