Contents
  • Goals
  • Configuration
    • Showing resource names in alert content
    • Silencing specific nodes and workloads
  • Final result

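    Goals

    Previous article:

    Alert notifications for IoT edge clusters based on Kubernetes Events

    Alert recovery notifications: evaluated and found not feasible.

    Reason: an alert and its recovery are separate, unrelated events. The alert is a Warning event and the recovery is a Normal event, so enabling recovery notifications would mean sending every Normal event, and the volume would be overwhelming. In addition, without considerable experience and patience it is impossible to tell which Normal event corresponds to the recovery of a given alert.

    • Repeated alerting while a problem remains unresolved: available by default, no extra configuration needed.
    • Show resource names, such as node and Pod names, in the alert content.

    Specific nodes and workloads can be silenced, and the silencing can be adjusted dynamically.

    For example, node worker-1 in cluster 001 undergoes planned maintenance: monitoring is paused during the maintenance and resumed once it is finished.

    Configuration

    Showing resource names in alert content

    Some typical events: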

    apiVersion: v1
    count: 101557
    eventTime: null
    firstTimestamp: "2022-04-08T03:50:47Z"
    involvedObject:
      apiVersion: v1
      fieldPath: spec.containers{prometheus}
      kind: Pod
      name: prometheus-rancher-monitoring-prometheus-0
      namespace: cattle-monitoring-system
    kind: Event
    lastTimestamp: "2022-04-14T11:39:19Z"
    message: 'Readiness probe failed: Get "http://10.42.0.87:9090/-/ready": context deadline
      exceeded (Client.Timeout exceeded while awaiting headers)'
    metadata:
      creationTimestamp: "2022-04-08T03:51:17Z"
      name: prometheus-rancher-monitoring-prometheus-0.16e3cf53f0793344
      namespace: cattle-monitoring-system
    reason: Unhealthy
    reportingComponent: ""
    reportingInstance: ""
    source:
      component: kubelet
      host: master-1
    type: Warning
    
    apiVersion: v1
    count: 116
    eventTime: null
    firstTimestamp: "2022-04-13T02:43:26Z"
    involvedObject:
      apiVersion: v1
      fieldPath: spec.containers{grafana}
      kind: Pod
      name: rancher-monitoring-grafana-57777cc795-2b2x5
      namespace: cattle-monitoring-system
    kind: Event
    lastTimestamp: "2022-04-14T11:18:56Z"
    message: 'Readiness probe failed: Get "http://10.42.0.90:3000/api/health": context
      deadline exceeded (Client.Timeout exceeded while awaiting headers)'
    metadata:
      creationTimestamp: "2022-04-14T11:18:57Z"
      name: rancher-monitoring-grafana-57777cc795-2b2x5.16e5548dd2523a13
      namespace: cattle-monitoring-system
    reason: Unhealthy
    reportingComponent: ""
    reportingInstance: ""
    source:
      component: kubelet
      host: master-1
    type: Warning
    
    apiVersion: v1
    count: 20958
    eventTime: null
    firstTimestamp: "2022-04-11T10:34:51Z"
    involvedObject:
      apiVersion: v1
      fieldPath: spec.containers{lb-port-1883}
      kind: Pod
      name: svclb-emqx-dt22t
      namespace: emqx
    kind: Event
    lastTimestamp: "2022-04-14T11:39:48Z"
    message: Back-off restarting failed container
    metadata:
      creationTimestamp: "2022-04-11T10:34:51Z"
      name: svclb-emqx-dt22t.16e4d11e2b9efd27
      namespace: emqx
    reason: BackOff
    reportingComponent: ""
    reportingInstance: ""
    source:
      component: kubelet
      host: worker-1
    type: Warning
    
    apiVersion: v1
    count: 21069
    eventTime: null
    firstTimestamp: "2022-04-11T10:34:48Z"
    involvedObject:
      apiVersion: v1
      fieldPath: spec.containers{lb-port-80}
      kind: Pod
      name: svclb-traefik-r5p8t
      namespace: kube-system
    kind: Event
    lastTimestamp: "2022-04-14T11:44:59Z"
    message: Back-off restarting failed container
    metadata:
      creationTimestamp: "2022-04-11T10:34:48Z"
      name: svclb-traefik-r5p8t.16e4d11daf0b79ce
      namespace: kube-system
    reason: BackOff
    reportingComponent: ""
    reportingInstance: ""
    source:
      component: kubelet
      host: worker-1
    type: Warning
    
    {
      "metadata": {
        "name": "event-exporter-79544df9f7-xj4t5.16e5c540dc32614f",
        "namespace": "monitoring",
        "uid": "baf2f642-2383-4e22-87e0-456b6c3eaf4e",
        "resourceVersion": "14043444",
        "creationTimestamp": "2022-04-14T13:08:40Z"
      },
      "reason": "Pulled",
      "message": "Container image \"ghcr.io/opsgenie/kubernetes-event-exporter:v0.11\" already present on machine",
      "source": {
        "component": "kubelet",
        "host": "worker-2"
      },
      "firstTimestamp": "2022-04-14T13:08:40Z",
      "lastTimestamp": "2022-04-14T13:08:40Z",
      "count": 1,
      "type": "Normal",
      "eventTime": null,
      "reportingComponent": "",
      "reportingInstance": "",
      "involvedObject": {
        "kind": "Pod",
        "namespace": "monitoring",
        "name": "event-exporter-79544df9f7-xj4t5",
        "uid": "b77d3e13-fa9e-484b-8a5a-d1afc9edec75",
        "apiVersion": "v1",
        "resourceVersion": "14043435",
        "fieldPath": "spec.containers{event-exporter}",
        "labels": {
          "app": "event-exporter",
          "pod-template-hash": "79544df9f7",
          "version": "v1"
        }
      }
    }
    

    More fields can be added to the alert message, including:

    • Node: {{ .Source.Host }}
    • Pod: {{ .InvolvedObject.Name }}
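The placeholders above use Go template syntax, so their behavior can be tried out locally with the standard text/template package. The following is a minimal sketch, not the exporter's own code: the Event, Source, and InvolvedObject structs and the renderAlert helper are hypothetical and mirror only the fields used here, with sample values taken from the BackOff event shown earlier.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Source and InvolvedObject are hypothetical types that mirror only the
// event fields referenced by the two placeholders.
type Source struct {
	Component string
	Host      string
}

type InvolvedObject struct {
	Kind      string
	Namespace string
	Name      string
}

type Event struct {
	Reason         string
	Type           string
	Source         Source
	InvolvedObject InvolvedObject
}

// renderAlert parses a Go template and executes it against one event.
func renderAlert(layout string, ev Event) (string, error) {
	tmpl, err := template.New("alert").Parse(layout)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, ev); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	ev := Event{
		Reason:         "BackOff",
		Type:           "Warning",
		Source:         Source{Component: "kubelet", Host: "worker-1"},
		InvolvedObject: InvolvedObject{Kind: "Pod", Namespace: "emqx", Name: "svclb-emqx-dt22t"},
	}
	out, err := renderAlert("Node: {{ .Source.Host }}, Pod: {{ .InvolvedObject.Name }}", ev)
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Println(out) // prints "Node: worker-1, Pod: svclb-emqx-dt22t"
}
```

The leading dot matters: in text/template, a placeholder written as {{ Source.Host }} is read as a call to a function named Source and fails at parse time.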

    Putting this together, the modified event-exporter-cfg YAML is as follows:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: event-exporter-cfg
      namespace: monitoring
      resourceVersion: '5779968'
    data:
      config.yaml: |
        logLevel: error
        logFormat: json
        route:
          routes:
            - match:
                - receiver: "dump"      
            - drop:
                - type: "Normal"
              match:
                - receiver: "feishu"                     
        receivers:
          - name: "dump"
            stdout: {}
          - name: "feishu"
            webhook:
              endpoint: "https://open.feishu.cn/open-apis/bot/v2/hook/..."
              headers:
                Content-Type: application/json
              layout:
                msg_type: interactive
                card:
                  config:
                    wide_screen_mode: true
                    enable_forward: true
                  header:
                    title:
                      tag: plain_text
                      content: xxx test K3S cluster alerts
                    template: red
                  elements:
                    - tag: div
                      text: 
                        tag: lark_md
                        content: "**EventID:**  {{ .UID }}\n**EventNamespace:**  {{ .InvolvedObject.Namespace }}\n**EventName:**  {{ .InvolvedObject.Name }}\n**EventType:**  {{ .Type }}\n**EventKind:**  {{ .InvolvedObject.Kind }}\n**EventReason:**  {{ .Reason }}\n**EventTime:**  {{ .LastTimestamp }}\n**EventMessage:**  {{ .Message }}\n**EventComponent:**  {{ .Source.Component }}\n**EventHost:**  {{ .Source.Host }}\n**EventLabels:**  {{ toJson .InvolvedObject.Labels}}\n**EventAnnotations:**  {{ toJson .InvolvedObject.Annotations}}"
    

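    Silencing specific nodes and workloads

    For example, node worker-1 in cluster 001 undergoes planned maintenance: monitoring is paused during the maintenance and resumed once it is finished.

    Continue modifying the event-exporter-cfg YAML as follows: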

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: event-exporter-cfg
      namespace: monitoring
    data:
      config.yaml: |
        logLevel: error
        logFormat: json
        route:
          routes:
            - match:
                - receiver: "dump"      
            - drop:
                - type: "Normal"
                - source:
                    host: "worker-1"
                - namespace: "cattle-monitoring-system"
                - name: "*emqx*"
                - kind: "Pod|Deployment|ReplicaSet"
                - labels:
                    version: "dev"
              match:
                - receiver: "feishu"                     
        receivers:
          - name: "dump"
            stdout: {}
          - name: "feishu"
            webhook:
              endpoint: "https://open.feishu.cn/open-apis/bot/v2/hook/..."
              headers:
                Content-Type: application/json
              layout:
                msg_type: interactive
                card:
                  config:
                    wide_screen_mode: true
                    enable_forward: true
                  header:
                    title:
                      tag: plain_text
                      content: xxx test K3S cluster alerts
                    template: red
                  elements:
                    - tag: div
                      text: 
                        tag: lark_md
                        content: "**EventID:**  {{ .UID }}\n**EventNamespace:**  {{ .InvolvedObject.Namespace }}\n**EventName:**  {{ .InvolvedObject.Name }}\n**EventType:**  {{ .Type }}\n**EventKind:**  {{ .InvolvedObject.Kind }}\n**EventReason:**  {{ .Reason }}\n**EventTime:**  {{ .LastTimestamp }}\n**EventMessage:**  {{ .Message }}\n**EventComponent:**  {{ .Source.Component }}\n**EventHost:**  {{ .Source.Host }}\n**EventLabels:**  {{ toJson .InvolvedObject.Labels}}\n**EventAnnotations:**  {{ toJson .InvolvedObject.Annotations}}"
    

    The default drop rule is - type: "Normal", meaning that Normal events do not trigger alerts.

    The following rules are now added:

                - source:
                    host: "worker-1"
                - namespace: "cattle-monitoring-system"
                - name: "*emqx*"
                - kind: "Pod|Deployment|ReplicaSet"
                - labels:
                    version: "dev"
    
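    • ... host: "worker-1": no alerts for node worker-1;
    • ... namespace: "cattle-monitoring-system": no alerts for the namespace cattle-monitoring-system;
    • ... name: "*emqx*": no alerts for objects whose name (usually the Pod name) contains emqx;
    • kind: "Pod|Deployment|ReplicaSet": no alerts for Pods, Deployments, or ReplicaSets (that is, alerts related to applications and components are ignored);
    • ... version: "dev": no alerts for objects carrying the label version: "dev" (this can be used to silence alerts for a specific application).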
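The way these entries combine can be modeled in a few lines of Go. This is a simplified sketch under stated assumptions, not the exporter's implementation: the Event and Rule types and the shouldDrop helper are hypothetical, each pattern is treated as an unanchored regular expression, an event is dropped as soon as any one entry matches, and all fields set inside a single entry must match together. The third sample event is a made-up Warning variant of the event-exporter Pod event shown earlier.

```go
package main

import (
	"fmt"
	"regexp"
)

// Event holds only the fields that the drop rules refer to (hypothetical type).
type Event struct {
	Type      string
	Host      string
	Namespace string
	Name      string
	Kind      string
	Labels    map[string]string
}

// Rule is one entry of a drop list; an empty field places no constraint.
type Rule struct {
	Type      string
	Host      string
	Namespace string
	Name      string
	Kind      string
	Labels    map[string]string
}

// fieldMatches treats the pattern as an unanchored regular expression
// (an assumption of this sketch); an invalid pattern never matches.
func fieldMatches(pattern, value string) bool {
	if pattern == "" {
		return true
	}
	ok, err := regexp.MatchString(pattern, value)
	return err == nil && ok
}

// matches reports whether every field set on the rule matches the event.
func (r Rule) matches(ev Event) bool {
	for key, pattern := range r.Labels {
		value, present := ev.Labels[key]
		if !present || !fieldMatches(pattern, value) {
			return false
		}
	}
	return fieldMatches(r.Type, ev.Type) &&
		fieldMatches(r.Host, ev.Host) &&
		fieldMatches(r.Namespace, ev.Namespace) &&
		fieldMatches(r.Name, ev.Name) &&
		fieldMatches(r.Kind, ev.Kind)
}

// shouldDrop returns true as soon as any single rule matches the event.
func shouldDrop(drop []Rule, ev Event) bool {
	for _, r := range drop {
		if r.matches(ev) {
			return true
		}
	}
	return false
}

func main() {
	drop := []Rule{
		{Type: "Normal"},
		{Host: "worker-1"},
		{Namespace: "cattle-monitoring-system"},
		{Labels: map[string]string{"version": "dev"}},
	}
	events := []Event{
		{Type: "Warning", Host: "worker-1", Namespace: "emqx", Name: "svclb-emqx-dt22t", Kind: "Pod"},
		{Type: "Warning", Host: "master-1", Namespace: "cattle-monitoring-system", Name: "prometheus-rancher-monitoring-prometheus-0", Kind: "Pod"},
		{Type: "Warning", Host: "worker-2", Namespace: "monitoring", Name: "event-exporter-79544df9f7-xj4t5", Kind: "Pod", Labels: map[string]string{"version": "v1"}},
	}
	for _, ev := range events {
		fmt.Printf("%s on %s: dropped=%v\n", ev.Name, ev.Host, shouldDrop(drop, ev))
	}
}
```

In this model the first two events are dropped (by the host rule and the namespace rule respectively) and the third is kept. Because the patterns are unanchored here, "worker-1" would also match a node named worker-10; anchoring the pattern as "^worker-1$" avoids that in the sketch.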

    Final result

    As shown in the figures:

    [Figures (two images): IoT edge cluster Kubernetes Events alert notification, further configuration]
