
A Simple Design and Implementation of Front-End Error Monitoring

c.  c.  2023-03-18  192


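In a previous post, "Thoughts and Design on React Error Handling and Logging" (《React 错误处理和日志记录的思考与设计》), I described how to design an error monitoring system for a React front end.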

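This post explains how to design a simple front-end error monitoring feature for the business scenario described below, building on "Thoughts and Design on React Error Handling and Logging".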

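The business scenario is simple: we want our developers to learn about front-end errors that users hit while using the application, so that problems can be found and located early. For now, developers are notified by email. We do not require every front-end error to be reported in full and in real time. When the front end is broken it may throw a large number of errors, which would produce a large volume of reports and, in turn, a flood of emails. What we care about is which errors occurred, not how many times they occurred. We therefore rate-limit both the monitoring reports and the email notifications, so that the back end never receives a sudden burst of monitoring requests.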

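Finally, note that this design targets a specific business scenario. It is not suitable for medium or large systems or for a dedicated front-end monitoring system, and copying it directly into production is not recommended. If you are interested in how large companies design front-end monitoring and instrumentation, see the links at the bottom of this post; I will not go into that here.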

Front-End Instrumentation

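"Thoughts and Design on React Error Handling and Logging" describes several ways to capture front-end exceptions. Here we mainly rely on event listeners on the window object, registered with window.addEventListener.
We really only care about two events: window.addEventListener('error', ....) and window.addEventListener('unhandledrejection',...).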

Many readers ask at this point: why not use the global window.onerror handler? What is the difference between window.addEventListener('error') and window.onerror?

MDN recommends the addEventListener() approach:

Note: The addEventListener() method is the recommended way to register an event listener. The benefits are as follows:

It allows adding more than one handler for an event. This is particularly useful for libraries, JavaScript modules, or any other kind of code that needs to work well with other libraries or extensions.
In contrast to using an onXYZ property, it gives you finer-grained control of the phase when the listener is activated (capturing vs. bubbling).
It works on any event target, not just HTML or SVG elements.

Reference: https://developer.mozilla.org/zh-CN/docs/Web/API/EventTarget/addEventListener

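window.onerror and window.addEventListener('error', ....) do basically the same job: both can capture JS exceptions globally. There is, however, a class of errors called resource loading errors, raised when the code references a static resource (an image, JS file, CSS file, and so on) that does not exist. For example: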

const loadCss = () => {
  const link = document.createElement('link');
  link.type = 'text/css';
  link.rel = 'stylesheet';
  link.href = 'https://baidu.com/15.css'; // this stylesheet does not exist
  document.getElementsByTagName('head')[0].append(link);
};

render() {
  return <div>
    <img src='./bbb.png'/>
    <button onClick={loadCss}>Load stylesheet</button>
  </div>;
}

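Neither baidu.com/15.css nor bbb.png exists, so when the browser tries to load them it raises a resource-not-found error. By default, however, neither of the two global listeners on window described above sees this kind of error.

That is because a resource loading error is fired only on the element itself and does not bubble up to window, so a listener on window in the bubbling phase never catches it. What can we do?

If you are familiar with DOM events, the answer follows: an event that cannot be observed in the bubbling phase can still be observed in the capture phase.

The fix is to pass a third argument to window.addEventListener: simply true, which makes the listener run in the capture phase, so it can observe resource loading errors.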

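// Global listener in the capture phase
window.addEventListener(
  'error',
  (error) => {
    if (error.target !== window) {
      // Resource loading error: the target is the element that failed to load
      console.log(error.target.tagName, error.target.src || error.target.href);
    }
    handleError(error);
  },
  true,
);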

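This makes it easy to catch image loading failures, which is why window.addEventListener is the better choice. Remember to set the third argument to true so that the listener runs in the capture phase; it then captures both JS exceptions and resource loading errors globally.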
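To make the distinction concrete, here is a minimal sketch of how a capture-phase handler might tell a resource loading error from a script error. The names isResourceErrorEvent and describeFailedResource and the ErrorLikeEvent and ElementLike shapes are illustrative assumptions, not part of the original code; the shapes mirror only the few fields of the browser's error event that the check needs, so the snippet can run outside a browser.

```typescript
// Minimal structural view of the fields read from an 'error' event.
// (Hypothetical types for illustration; in a browser these would be Event and Element.)
interface ErrorLikeEvent {
  target: unknown;
  message?: string;
}

interface ElementLike {
  tagName: string;
  src?: string;
  href?: string;
}

// A resource loading error is dispatched on the element that failed to load,
// so its target is an element rather than the global object.
function isResourceErrorEvent(event: ErrorLikeEvent, globalObject: unknown): boolean {
  const target = event.target as Partial<ElementLike> | null;
  return target != null && event.target !== globalObject && typeof target.tagName === "string";
}

// Builds a short description such as "IMG ./bbb.png" for logging.
function describeFailedResource(event: ErrorLikeEvent): string {
  const el = event.target as ElementLike;
  return `${el.tagName} ${el.src ?? el.href ?? ""}`.trim();
}

// Stand-ins for window and for the two kinds of events.
const fakeWindow = {};
const imgError: ErrorLikeEvent = { target: { tagName: "IMG", src: "./bbb.png" } };
const scriptError: ErrorLikeEvent = { target: fakeWindow, message: "x is not defined" };

console.log(isResourceErrorEvent(imgError, fakeWindow));    // true
console.log(isResourceErrorEvent(scriptError, fakeWindow)); // false
console.log(describeFailedResource(imgError));              // IMG ./bbb.png
```

In a browser, the same check would be called from the listener registered with window.addEventListener('error', handler, true), passing window as globalObject.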

Next is window.addEventListener('unhandledrejection',...). MDN describes the unhandledrejection event as follows:

The unhandledrejection event is sent to the global scope of a script when a JavaScript Promise that has no rejection handler is rejected; typically, this is the window, but may also be a Worker.

This is useful for debugging and for providing fallback error handling for unexpected situations.

Reference: https://developer.mozilla.org/en-US/docs/Web/API/Window/unhandledrejection_event

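window.addEventListener('error', ....) cannot capture Promise exceptions, whether the code uses Promise.then() or async/await. That is why we also listen globally for the unhandledrejection event, which captures unhandled Promise rejections. The event fires when a Promise is rejected and no catch handler is attached, so it picks up Promise errors that occur unexpectedly at runtime, which is very useful for error monitoring.

Our back-end API requests also use Promises, so instead of capturing these errors in an error interceptor such as the one in umi-request, we can listen for unhandledrejection and capture them there, provided the rejection has not already been handled by a catch elsewhere.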
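As a rough illustration of the listener described above, the sketch below registers an unhandledrejection handler and forwards the rejection reason to a reporting callback. The names watchUnhandledRejections, ListenerTarget, RejectionLikeEvent, and FakeTarget are assumptions made for this example; FakeTarget only stands in for window so the code can run without a browser.

```typescript
// Structural stand-ins so the sketch runs without a DOM.
interface RejectionLikeEvent {
  reason: unknown; // corresponds to PromiseRejectionEvent.reason in a browser
}

interface ListenerTarget {
  addEventListener(type: string, listener: (event: RejectionLikeEvent) => void): void;
}

// Registers a handler that forwards the rejection reason to `report`.
// `report` is a hypothetical callback standing in for the upload logic.
function watchUnhandledRejections(target: ListenerTarget, report: (reason: unknown) => void): void {
  target.addEventListener("unhandledrejection", (event) => report(event.reason));
}

// Tiny fake target that records listeners and lets us dispatch events by hand.
class FakeTarget implements ListenerTarget {
  private listeners = new Map<string, Array<(event: RejectionLikeEvent) => void>>();

  addEventListener(type: string, listener: (event: RejectionLikeEvent) => void): void {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  dispatch(type: string, event: RejectionLikeEvent): void {
    (this.listeners.get(type) ?? []).forEach((listener) => listener(event));
  }
}

const reportedReasons: unknown[] = [];
const fakeTarget = new FakeTarget();
watchUnhandledRejections(fakeTarget, (reason) => {
  reportedReasons.push(reason);
});

// Simulate the browser firing the event for a rejected, uncaught Promise.
fakeTarget.dispatch("unhandledrejection", { reason: new Error("request failed") });
console.log(reportedReasons.length); // 1
```

In a browser the target would be window, and the event object would be a PromiseRejectionEvent.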

Error Types

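As described in the instrumentation section, we register two event listeners: window.addEventListener('error', ...., true) and window.addEventListener('unhandledrejection',...). Between them, these two listeners can produce four types of errors.

window.addEventListener('error', ...., true) produces two types. One is an error in the code, which we call ScriptError. The other, made visible by passing true as the third argument, is an error caused by a static resource, which we call ResourceError.

Errors from window.addEventListener('unhandledrejection',...) also fall into two types. One is the back-end API request error mentioned earlier, which we call ResponseError. The other covers all remaining Promise exceptions, which we call PromiseError.

That gives four error types in total: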

ScriptError
ResourceError
ResponseError
PromiseError
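The mapping from the two listeners to the four types can be sketched as a small classifier. This is only an illustration under assumptions: classifyError and isResponseErrorReason are hypothetical names, and the rule "a rejection reason that carries a request field is an API error" is a guess that depends on the HTTP client in use.

```typescript
type MonitorErrorType = "ScriptError" | "ResourceError" | "ResponseError" | "PromiseError";

// Assumption: a failed API call rejects with an object that carries a `request`
// field. Adjust this check to whatever your HTTP client actually rejects with.
function isResponseErrorReason(reason: unknown): boolean {
  return typeof reason === "object" && reason !== null && "request" in reason;
}

// Maps the event type plus a little context to one of the four error types.
function classifyError(
  eventType: "error" | "unhandledrejection",
  context: { targetIsWindow?: boolean; reason?: unknown },
): MonitorErrorType {
  if (eventType === "error") {
    // Script errors are dispatched on window; resource errors on the failing element.
    return context.targetIsWindow ? "ScriptError" : "ResourceError";
  }
  return isResponseErrorReason(context.reason) ? "ResponseError" : "PromiseError";
}

console.log(classifyError("error", { targetIsWindow: true }));                               // ScriptError
console.log(classifyError("error", { targetIsWindow: false }));                              // ResourceError
console.log(classifyError("unhandledrejection", { reason: { request: { url: "/api" } } })); // ResponseError
console.log(classifyError("unhandledrejection", { reason: new Error("boom") }));            // PromiseError
```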

Data Format for Error Reports

Different error types may need different report formats, so we define the following.

The common report format is:

export type ErrorMonitorInfo = {
  domain: string,
  referrer: string,
  openURL: string,
  pageTitle: string,
  language: string,
  userAgent: string,
  currentUserName: string,
  errorType: string, // type of the DOM event that was caught
  errorDateTimeWithGMT8: string,
  error: ErrorContent,
  type: string // monitoring category: one of the four error types
}

ErrorContent holds one of the four error types:

// Only the field that matches the error type is populated, so every field is optional
export type ErrorContent = {
  resourceError?: ResourceError,
  promiseError?: PromiseError,
  responseError?: ResponseError,
  scriptError?: ScriptError
}

Each error type has its own data format:

export type ResourceError = {
  resourceErrorDOM: string
}

export type PromiseError = {
  message: string
}

export type ResponseError = {
  message: string,
  data: string,
  request: string,
  errorStack: string
}

export type ScriptError = {
  filename: string,
  message: string,
  errorStack: string,
}
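To show how a consumer might read this format, here is a small sketch that builds a sample report and produces a one-line summary from it. The types are a trimmed subset repeated so the snippet is self-contained; the summarize helper, the sample values, and the "ScriptError" string used for type are all assumptions made for illustration.

```typescript
// Trimmed copies of the types above: only the fields this sketch uses.
type ScriptError = { filename: string; message: string; errorStack: string };
type ResourceError = { resourceErrorDOM: string };
type ErrorContent = { scriptError?: ScriptError; resourceError?: ResourceError };

type ErrorMonitorInfo = {
  openURL: string;
  currentUserName: string;
  errorType: string; // DOM event type
  type: string;      // monitoring category
  error: ErrorContent;
};

// Hypothetical sample report for a script error.
const sampleReport: ErrorMonitorInfo = {
  openURL: "https://example.com/list",
  currentUserName: "alice",
  errorType: "error",
  type: "ScriptError",
  error: {
    scriptError: {
      filename: "https://example.com/static/app.js",
      message: "x is not defined",
      errorStack: "",
    },
  },
};

// Picks whichever error field is populated and renders a one-line summary.
function summarize(info: ErrorMonitorInfo): string {
  if (info.error.scriptError) {
    return `[${info.type}] ${info.error.scriptError.message} (${info.error.scriptError.filename})`;
  }
  if (info.error.resourceError) {
    return `[${info.type}] failed to load ${info.error.resourceError.resourceErrorDOM}`;
  }
  return `[${info.type}] unknown error`;
}

console.log(summarize(sampleReport));
// [ScriptError] x is not defined (https://example.com/static/app.js)
```

A summary like this could serve as an email subject line, since the scenario only needs to tell developers which error occurred.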

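Debouncing Error Reports

"Thoughts and Design on React Error Handling and Logging" raised several open problems. One is that a single user action can trigger many duplicate errors and therefore many duplicate requests. For example, when rendering table data, a bug in a column's render method can trigger the same error many times during one render, and every occurrence is caught by our error listeners. So we need to rate-limit front-end error reporting.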

One option is to apply Promise concurrency control to the reporting API and cap the number of concurrent requests.
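The concurrency-limiting option is not the one used in this post, but for comparison it could be sketched roughly as follows. ConcurrencyLimiter and its members are hypothetical names for this example; the tasks stand in for calls to the reporting API.

```typescript
// Runs at most `limit` tasks at once; extra tasks wait in a FIFO queue.
class ConcurrencyLimiter {
  private running = 0;
  private readonly queue: Array<() => void> = [];

  constructor(private readonly limit: number) {}

  get activeCount(): number {
    return this.running;
  }

  get pendingCount(): number {
    return this.queue.length;
  }

  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.running++;
        // Free the slot and start the next queued task, if any.
        const done = () => {
          this.running--;
          const next = this.queue.shift();
          if (next) next();
        };
        // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
        Promise.resolve()
          .then(task)
          .then(
            (value) => { done(); resolve(value); },
            (err) => { done(); reject(err); },
          );
      };
      if (this.running < this.limit) start();
      else this.queue.push(start);
    });
  }
}

// Five fake uploads with a limit of two: two start immediately, three wait.
const uploadLimiter = new ConcurrencyLimiter(2);
for (let i = 0; i < 5; i++) {
  uploadLimiter.run(() => Promise.resolve(i));
}
console.log(uploadLimiter.activeCount, uploadLimiter.pendingCount); // 2 3
```

Note that a limiter like this only slows reports down; every queued report is still sent eventually, which is one reason debouncing fits the "which errors, not how many" goal better.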

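We use a simpler but effective approach instead: debouncing in JS, a technique that shows up in many features. What is debouncing? It merges a dense burst of events, such as key presses, into a single event. Take search as an example: for performance reasons we cannot send a search request for every character the user types. Instead we wait until the user stops typing, say for 500 ms with no further input, and then search for the current string. That is debouncing. A related technique is throttling, which I will not cover here; see the following articles: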

Debouncing in JS

Implementing debounce and throttle in JS

Source code walkthrough series: lodash's debounce and throttle

JavaScript closures

So we use the debounce method from lodash to debounce the reports.
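To illustrate what debouncing does to a burst of identical errors, here is a minimal trailing-edge debounce written from scratch. It is not lodash's implementation; simpleDebounce, TimerApi, and ManualTimers are hypothetical names, and the timer is injectable only so that the demo can run synchronously instead of waiting a real second.

```typescript
type TimerApi = {
  set: (callback: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
};

const realTimers: TimerApi = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Trailing-edge debounce: every call restarts the timer, and only the
// arguments of the last call in a burst reach `fn`.
function simpleDebounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  timers: TimerApi = realTimers,
): (...args: A) => void {
  let handle: unknown = undefined;
  return (...args: A) => {
    if (handle !== undefined) timers.clear(handle);
    handle = timers.set(() => {
      handle = undefined;
      fn(...args);
    }, waitMs);
  };
}

// Manual timer for the demo: holds one pending callback until flush() is called.
class ManualTimers {
  private pending: (() => void) | null = null;
  set = (callback: () => void): unknown => {
    this.pending = callback;
    return callback;
  };
  clear = (): void => {
    this.pending = null;
  };
  flush(): void {
    const callback = this.pending;
    this.pending = null;
    if (callback) callback();
  }
}

const manualTimers = new ManualTimers();
const sentMessages: string[] = [];
const sendReport = simpleDebounce((message: string) => {
  sentMessages.push(message);
}, 1000, manualTimers);

// Three identical errors arrive in a burst.
sendReport("render failed");
sendReport("render failed");
sendReport("render failed");
manualTimers.flush(); // simulate the 1000 ms quiet period elapsing
console.log(sentMessages); // [ 'render failed' ]
```

One consequence worth keeping in mind: with a trailing-edge debounce, if different errors arrive within the same window, only the last one is passed on.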

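Caching Reported Errors to Limit Reporting Frequency

Why cache reported errors? Again, the goal is to avoid reporting too many errors of the same kind. Debouncing only handles a burst of errors that hit the listeners within a short window. If the user stays on the page for a while and then runs into the same error again, there is no need to report it and notify anyone a second time. So we cache the reported data keyed by error type, and when the same error shows up again before the cache entry expires, we skip the report.

So we provide a cache utility class: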

// @/util/ExpiresCacheUtils
import ExpiresCache from "@/util/ExpiresCache";

class ExpiresCacheUtils {

  private static cacheMap = new Map<string, ExpiresCache>();

  // Returns true when the key is missing or its entry has outlived its timeout
  static isExpires(key: string) {
    const data = ExpiresCacheUtils.cacheMap.get(key);
    if (data == null) {
      return true;
    }
    const currentTime = (new Date()).getTime();
    const elapsedSeconds = (currentTime - data.cacheTime) / 1000;
    if (Math.abs(elapsedSeconds) > data.timeout) {
      ExpiresCacheUtils.cacheMap.delete(key);
      return true;
    }
    return false;
  }

  static has(key: string) {
    return !ExpiresCacheUtils.isExpires(key);
  }

  static delete(key: string) {
    return ExpiresCacheUtils.cacheMap.delete(key);
  }

  static get(key: string) {
    const isExpires = ExpiresCacheUtils.isExpires(key);
    return isExpires ? null : ExpiresCacheUtils.cacheMap.get(key)?.data;
  }

  // timeout is in seconds; the default is 20 minutes
  static set(key: string, data: any, timeout: number = 20 * 60) {
    if (key && data) {
      const expiresCache = new ExpiresCache(key, data, timeout);
      ExpiresCacheUtils.cacheMap.set(key, expiresCache);
    }
  }

}

export default ExpiresCacheUtils;

// @/util/ExpiresCache
class ExpiresCache {
  // Fields are read by ExpiresCacheUtils, so they cannot be private
  readonly key: string;
  readonly data: any;
  readonly timeout: number;   // seconds
  readonly cacheTime: number; // creation time in epoch milliseconds

  constructor(key: string, data: any, timeout: number) {
    this.key = key;
    this.data = data;
    this.timeout = timeout;
    this.cacheTime = (new Date()).getTime();
  }
}

export default ExpiresCache;
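The effect of the cache on reporting can be illustrated with a small self-contained sketch. TtlCache is a simplified, hypothetical stand-in for the utility class (with an injectable clock so expiry can be shown without waiting), and shouldReport compares errors by their JSON form, which is an assumption made here for brevity.

```typescript
// Simplified stand-in for the cache utility, with an injectable clock.
class TtlCache<V> {
  private entries = new Map<string, { value: V; storedAt: number; ttlSeconds: number }>();

  constructor(private now: () => number = () => Date.now()) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if ((this.now() - entry.storedAt) / 1000 > entry.ttlSeconds) {
      this.entries.delete(key); // expired: drop it and behave as if it was never stored
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, storedAt: this.now(), ttlSeconds });
  }
}

// Returns true when the error should be reported, false when it is a repeat.
function shouldReport(cache: TtlCache<string>, key: string, error: object, ttlSeconds: number): boolean {
  const serialized = JSON.stringify(error); // assumption: JSON equality is good enough here
  if (cache.get(key) === serialized) return false;
  cache.set(key, serialized, ttlSeconds);
  return true;
}

let fakeNowMs = 0;
const dedupCache = new TtlCache<string>(() => fakeNowMs);
const cacheKey = "ScriptError-alice-https://example.com/list";
const sampleError = { message: "x is not defined" };

console.log(shouldReport(dedupCache, cacheKey, sampleError, 60)); // true: first occurrence
fakeNowMs = 30000;
console.log(shouldReport(dedupCache, cacheKey, sampleError, 60)); // false: repeat within 60 s
fakeNowMs = 100000;
console.log(shouldReport(dedupCache, cacheKey, sampleError, 60)); // true: cache entry expired
```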

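Customizable Error Reporting Configuration

We want the front-end configuration to control which error types are monitored, which errors are filtered out of reporting, and which pages are not monitored at all. So we provide the following configuration: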

export type ErrorMonitorConfiguration = {
  ignoreScriptErrors?: RegExp[],
  ignoreDetectAllErrorForOpenPageUrls?: RegExp[],
  ignoreErrorResponseCode?: number[],
  ignoreErrorResponseUrls?: RegExp[],
  enableResourceErrorDetect?: boolean,
  enablePromiseErrorDetect?: boolean,
  enableResponseErrorDetect?: boolean,
  enableScriptErrorDetect?: boolean,
  enable?: boolean,
  triggerReportErrorIntervalMillisecond?: number,
  cacheErrorIntervalSecond?: number,
  debounceOption?: any
}

export default {
  ignoreDetectAllErrorForOpenPageUrls: [
    /\/xxxx\/xxxx-search/i, // Ignore all errors on pages whose URL matches this regular expression
  ],
  ignoreScriptErrors: [
    /ResizeObserver loop limit exceeded/i, // Ignore ScriptErrors whose message contains "ResizeObserver loop limit exceeded" (regular expression)
  ],
  ignoreErrorResponseCode: [ // Ignore ResponseErrors whose API response status code is 401 or 400
    401, 400
  ],
  ignoreErrorResponseUrls: [ // Ignore ResponseErrors whose request URL matches these regular expressions; the reporting API's own path should be listed here
    /\/xxxx\/xxxx-front-end-monitor/i,
  ],
  enableResourceErrorDetect: true,  // Enable static resource error monitoring
  enablePromiseErrorDetect: true,  // Enable Promise error monitoring
  enableResponseErrorDetect: true, // Enable API response error monitoring
  enableScriptErrorDetect: true, // Enable script error monitoring
  triggerReportErrorIntervalMillisecond: 1000,  // Debounce wait used to limit the reporting rate (1 s)
  cacheErrorIntervalSecond: 60,  // How long reported data stays cached, used to limit the reporting rate (60 s)
  enable: true // Whether front-end monitoring is enabled
} as ErrorMonitorConfiguration
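As an illustration of how the ignore lists might be applied, the sketch below checks a page URL, a script error message, and a response status code against a configuration. The helper names (matchesAny, shouldIgnorePage, shouldIgnoreScriptError, shouldIgnoreResponse) and the trimmed IgnoreConfig type are assumptions for this example; only the page URL check mirrors logic that appears in the monitoring code in this post.

```typescript
// Trimmed view of the configuration: only the ignore lists used below.
type IgnoreConfig = {
  ignoreDetectAllErrorForOpenPageUrls?: RegExp[];
  ignoreScriptErrors?: RegExp[];
  ignoreErrorResponseCode?: number[];
};

// True when any pattern in the (possibly missing) list matches the text.
function matchesAny(patterns: RegExp[] | undefined, text: string): boolean {
  return (patterns ?? []).some((pattern) => pattern.test(text));
}

function shouldIgnorePage(config: IgnoreConfig, openURL: string): boolean {
  return matchesAny(config.ignoreDetectAllErrorForOpenPageUrls, openURL);
}

function shouldIgnoreScriptError(config: IgnoreConfig, message: string): boolean {
  return matchesAny(config.ignoreScriptErrors, message);
}

function shouldIgnoreResponse(config: IgnoreConfig, statusCode: number): boolean {
  return (config.ignoreErrorResponseCode ?? []).indexOf(statusCode) !== -1;
}

const demoConfig: IgnoreConfig = {
  ignoreDetectAllErrorForOpenPageUrls: [/\/xxxx\/xxxx-search/i],
  ignoreScriptErrors: [/ResizeObserver loop limit exceeded/i],
  ignoreErrorResponseCode: [401, 400],
};

console.log(shouldIgnorePage(demoConfig, "https://example.com/xxxx/xxxx-search?q=1"));  // true
console.log(shouldIgnoreScriptError(demoConfig, "ResizeObserver loop limit exceeded")); // true
console.log(shouldIgnoreResponse(demoConfig, 500));                                     // false
```

The patterns should not use the g flag: RegExp.prototype.test keeps state in lastIndex for global regular expressions, which would make repeated checks against the same pattern unreliable.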

Front-End Error Monitoring Code

import DateUtil from "@/util/DateUtil";
import ErrorMonitorConfig from "@/config/ErrorMonitorConfig";
import { debounce, isEqual } from "lodash";
import ExpiresCacheUtils from "@/util/ExpiresCacheUtils";
import MonitorErrorTypeConstant from "@/constants/constants";
import type { ErrorContent, ErrorMonitorInfo } from "@/model/monitor";
import sendErrorMonitor from "@/services/monitor";


class ErrorMonitorUtil {

  private static readonly _timeZone = "Asia/Shanghai";
  private static readonly _reportingInterval = ErrorMonitorConfig.triggerReportErrorIntervalMillisecond || 1000;
  private static readonly _cacheInterval = ErrorMonitorConfig.cacheErrorIntervalSecond || 0;
  private static readonly _options = ErrorMonitorConfig.debounceOption || {};

  // Builds the reporting function: applies the enable switch, the page ignore list and the cache before sending
  private static exportSendErrorMonitorInfo = () => {
    return (userInfo: any, error: any, callback: any) => {
      try {
        if (!ErrorMonitorConfig.enable) {
          return;
        }
        const info: ErrorMonitorInfo = callback(userInfo, error);
        const ignore = (ErrorMonitorConfig.ignoreDetectAllErrorForOpenPageUrls || []).some(item => item.test(info?.openURL));
        if (ignore) {
          return;
        }
        const key = `${info.type}-${info.currentUserName}-${info.openURL}`;
        const cache = ExpiresCacheUtils.get(key);
        // Skip the report when an identical error is already cached for this key
        if (cache && isEqual(cache, info.error)) {
          return;
        }
        ExpiresCacheUtils.set(key, info.error, ErrorMonitorUtil._cacheInterval);
        sendErrorMonitor(info).catch((e: any) => {
          console.log("send error monitor with error: ", e);
        });
      } catch (e) {
        console.log("handle error monitor with error: ", e);
      }
    };
  };

  private static constructScriptErrorMonitorInfo = (userInfo: any, error: any): ErrorMonitorInfo => {
    const info = ErrorMonitorUtil.constructCommonErrorMonitorInfo(userInfo, error?.type, MonitorErrorTypeConstant.SCRIPT_ERROR);
    info.error.scriptError = {
      filename: error?.filename || '',
      message: error?.message || '',
      errorStack: (error?.error || {}).stack || ''
    };
    return info;
  };

  private static constructResourceErrorMonitorInfo = (userInfo: any, error: any): ErrorMonitorInfo => {
    const info = ErrorMonitorUtil.constructCommonErrorMonitorInfo(userInfo, error?.type, MonitorErrorTypeConstant.RESOURCE_ERROR);
    info.error.resourceError = {
      resourceErrorDOM: error?.target !== window ? (error?.target?.outerHTML || '') : ''
    };
    return info;
  };

  private static constructPromiseErrorMonitorInfo = (userInfo: any, error: any): ErrorMonitorInfo => {
    const info = ErrorMonitorUtil.constructCommonErrorMonitorInfo(userInfo, error?.type, MonitorErrorTypeConstant.PROMISE_ERROR);
    info.error.promiseError = {
      message: JSON.stringify(error)
    };
    return info;
  };

  private static constructResponseErrorMonitorInfo = (userInfo: any, error: any): ErrorMonitorInfo => {
    const info = ErrorMonitorUtil.constructCommonErrorMonitorInfo(userInfo, error?.type, MonitorErrorTypeConstant.HTTP_ERROR);
    info.error.responseError = {
      message: error?.message || '',
      data: JSON.stringify(error?.data) || '',
      request: JSON.stringify(error?.request) || '',
      errorStack: error?.stack || ''
    };
    return info;
  };

  // Fills in the fields shared by all four error types
  private static constructCommonErrorMonitorInfo = (userInfo: any, errorType: string, type: string): ErrorMonitorInfo => {
    return {
      domain: document?.domain || '',
      openURL: document?.URL || '',
      pageTitle: document?.title || '',
      referrer: document?.referrer || '',
      language: navigator?.language || '',
      userAgent: navigator?.userAgent || '',
      currentUserName: userInfo?.userName || '',
      errorType: errorType || '',
      errorDateTimeWithGMT8: DateUtil.getCurrentTimeByTimeZone(ErrorMonitorUtil._timeZone),
      type: type,
      error: {} as ErrorContent
    } as ErrorMonitorInfo;
  };

  public static exportErrorHandleListener = (userInfo: any) => {
    const sendScriptErrorFun = debounce(ErrorMonitorUtil.exportSendErrorMonitorInfo(), ErrorMonitorUtil._reportingInterval, ErrorMonitorUtil._options);
    const sendResourceErrorFun = debounce(ErrorMonitorUtil.exportSendErrorMonitorInfo