Logstash grok (code snippet)

终点即起点     2023-05-09


Nginx matching example

Nginx log format
'$remote_user [$time_local]  $http_x_Forwarded_for $remote_addr  $request $status $upstream_status'
                       '$http_x_forwarded_for'
                       '$upstream_addr '
                       'ups_resp_time: $upstream_response_time '
                       'request_time: $request_time';
Nginx log sample
- [09/May/2023:15:01:31 +0800]  11.20.1.30 38.34.246.127  GET / HTTP/1.1 200 -11.20.1.30- ups_resp_time: - request_time: 0.000
Grok match
filter {
    grok {
        match => {
          "message" => "%{DATA:remote_user} \[%{HTTPDATE:log_times}\]  %{IPV4:http_x_Forwarded_for} %{IPV4:remote_addr}  %{WORD:request_method} %{DATA:uri} HTTP/%{NUMBER:http_version} %{NUMBER:response_code} %{DATA:upstream_status}%{IPV4:http_x_forwarded_for}%{DATA:upstream_addr} ups_resp_time: %{DATA:ups_resp_time} request_time: %{NUMBER:request_time}"
        }
    }
}
Matched data

{
    "http_x_Forwarded_for" => "11.20.1.30",
                    "host" => "elk3",
                 "message" => "- [09/May/2023:15:01:31 +0800]  11.20.1.30 38.34.246.127  GET / HTTP/1.1 200 -11.20.1.30- ups_resp_time: - request_time: 0.000",
          "request_method" => "GET",
         "upstream_status" => "-",
           "ups_resp_time" => "-",
            "request_time" => "0.000",
             "remote_user" => "-",
               "log_times" => "09/May/2023:15:01:31 +0800",
           "upstream_addr" => "-",
                "@version" => "1",
              "@timestamp" => 2023-05-09T08:12:35.912Z,
            "http_version" => "1.1",
             "remote_addr" => "38.34.246.127",
    "http_x_forwarded_for" => "11.20.1.30",
                     "uri" => "/",
           "response_code" => "200"
}
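To reproduce the output above, the grok block can be dropped into a minimal test pipeline. The sketch below is an assumption, not part of the original post: it reads a hypothetical /var/log/nginx/access.log (adjust the path to your environment) and prints each parsed event with the rubydebug codec, which is the format of the matched data shown above.

input {
  file {
    # Hypothetical path; point this at your actual Nginx access log.
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    # Same pattern as above, built from the DATA/HTTPDATE/IPV4/WORD/NUMBER built-ins.
    match => {
      "message" => "%{DATA:remote_user} \[%{HTTPDATE:log_times}\]  %{IPV4:http_x_Forwarded_for} %{IPV4:remote_addr}  %{WORD:request_method} %{DATA:uri} HTTP/%{NUMBER:http_version} %{NUMBER:response_code} %{DATA:upstream_status}%{IPV4:http_x_forwarded_for}%{DATA:upstream_addr} ups_resp_time: %{DATA:ups_resp_time} request_time: %{NUMBER:request_time}"
    }
  }
}

output {
  # Print parsed events to the console for verification.
  stdout { codec => rubydebug }
}

Save it as, say, nginx-grok.conf (any file name works) and run it with bin/logstash -f nginx-grok.conf.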

Grok usage format

%{SYNTAX:SEMANTIC}
%{name of a predefined pattern:custom field name}
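For example, the sample from the official grok filter documentation pairs built-in pattern names (SYNTAX) with arbitrary field names (SEMANTIC). A minimal sketch, assuming an input line such as 55.3.244.1 GET /index.html 15824 0.043:

filter {
  grok {
    # IP, WORD, URIPATHPARAM and NUMBER are built-in patterns;
    # client, method, request, bytes and duration are field names chosen for this example.
    match => {
      "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
    }
  }
}

The parsed event then carries client, method, request, bytes and duration as separate fields.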

Built-in regular expressions

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}
INT (?:[+-]?(?:[0-9]+))
BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
NUMBER (?:%{BASE10NUM})
BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
BASE16FLOAT \b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b

POSINT \b(?:[1-9][0-9]*)\b
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
NOTSPACE \S+
SPACE \s*
DATA .*?
GREEDYDATA .*
QUOTEDSTRING (?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}
# URN, allowing use of RFC 2141 section 2.3 reserved characters
URN urn:[0-9A-Za-z][0-9A-Za-z-]{0,31}:(?:%[0-9a-fA-F]{2}|[0-9A-Za-z()+,.:=@;$_!*'/?#-])+

# Networking
MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
CISCOMAC (?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})
WINDOWSMAC (?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})
COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
IPV6 ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
IPV4 (?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9])
IP (?:%{IPV6}|%{IPV4})
HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
IPORHOST (?:%{IP}|%{HOSTNAME})
HOSTPORT %{IPORHOST}:%{POSINT}

# paths
PATH (?:%{UNIXPATH}|%{WINPATH})
UNIXPATH (/([\w_%!$@:.,+~-]+|\\.)*)+
TTY (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
WINPATH (?>[A-Za-z]+:|\\)(?:\\[^\\?*]*)+
URIPROTO [A-Za-z]([A-Za-z0-9+\-.]+)+
URIHOST %{IPORHOST}(?::%{POSINT:port})?
# uripath comes loosely from RFC1738, but mostly from what Firefox
# doesn't turn into %XX
URIPATH (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%&_\-]*)+
#URIPARAM \?(?:[A-Za-z0-9]+(?:=(?:[^&]*))?(?:&(?:[A-Za-z0-9]+(?:=(?:[^&]*))?)?)*)?
URIPARAM \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]<>]*
URIPATHPARAM %{URIPATH}(?:%{URIPARAM})?
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

# Months: January, Feb, 3, 03, 12, December
MONTH \b(?:[Jj]an(?:uary|uar)?|[Ff]eb(?:ruary|ruar)?|[Mm](?:a|ä)?r(?:ch|z)?|[Aa]pr(?:il)?|[Mm]a(?:y|i)?|[Jj]un(?:e|i)?|[Jj]ul(?:y)?|[Aa]ug(?:ust)?|[Ss]ep(?:tember)?|[Oo](?:c|k)?t(?:ober)?|[Nn]ov(?:ember)?|[Dd]e(?:c|z)(?:ember)?)\b
MONTHNUM (?:0?[1-9]|1[0-2])
MONTHNUM2 (?:0[1-9]|1[0-2])
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])

# Days: Monday, Tue, Thu, etc...
DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)

# Years?
YEAR (?>\d\d){1,2}
HOUR (?:2[0123]|[01]?[0-9])
MINUTE (?:[0-5][0-9])
# '60' is a leap second in most time standards and thus is valid.
SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
# datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)
DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
DATE %{DATE_US}|%{DATE_EU}
DATESTAMP %{DATE}[- ]%{TIME}
TZ (?:[APMCE][SD]T|UTC)
DATESTAMP_RFC822 %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
DATESTAMP_RFC2822 %{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} %{ISO8601_TIMEZONE}
DATESTAMP_OTHER %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}
DATESTAMP_EVENTLOG %{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}

# Syslog Dates: Month Day HH:MM:SS
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
PROG [\x21-\x5a\x5c\x5e-\x7e]+
SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
SYSLOGHOST %{IPORHOST}
SYSLOGFACILITY <%{NONNEGINT:facility}.%{NONNEGINT:priority}>
HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}

# Shortcuts
QS %{QUOTEDSTRING}

# Log formats
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:

# Log Levels
LOGLEVEL ([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)
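These entries compose: higher-level patterns such as SYSLOGBASE or HTTPDATE are built from the more primitive ones above them. As a rough illustration of combining built-ins yourself (the log line and field names below are made up for this sketch, not taken from the original post):

filter {
  grok {
    # Hypothetical input line:
    #   2023-05-09 16:02:11,532 ERROR payment-service - connection refused
    # TIMESTAMP_ISO8601, LOGLEVEL, NOTSPACE and GREEDYDATA are built-ins from the list above;
    # log_time, level, service and msg are arbitrary field names.
    match => {
      "message" => "%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:level} %{NOTSPACE:service} - %{GREEDYDATA:msg}"
    }
  }
}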

 
