Severity standardization in Code4rena

As with any activity that depends on the assessment of risk, it’s crucial to implement a standardization process. This ensures impartiality and consistency in assessments. Let’s take a look at our approach to severity standardization at Code4rena.

What is severity standardization?

Code4rena’s severity rubric ensures that Wardens, Judges, and Sponsors have a shared set of reference points for how contest submissions are judged. The process we’ve put in place gives Judges criteria to follow when assessing the severity, validity, and quality of findings so that they can accurately rate the performance of Wardens in the context of a specific contest.

At the time of writing, the Code4rena risk assessment matrix is as follows, where assets refer to funds, NFTs, data, authorization, and any information intended to be private or confidential:

  • High: Assets can be stolen/lost/compromised directly (or indirectly if there is a valid attack path that does not have hand-wavy hypotheticals).
  • Medium: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.
  • QA (Quality Assurance): Includes both Non-critical (code style, clarity, syntax, versioning, off-chain monitoring (events, etc.)) and Low risk (e.g. assets are not at risk: state handling, function incorrect as to spec, issues with comments). Excludes Gas optimizations, which are submitted and judged separately.

For findings that don’t fit neatly into the matrix above, trust is placed in the hands of the Judge reviewing the finding to come to their own independent conclusion in alignment with our criteria. Doing this ensures impartiality throughout the judging process. In cases like this, explaining and rationalizing the potential impact of a finding is essential to having it judged fairly. The burden of proof lies with the Warden submitting the finding, and increases alongside the potential value of the submission (rarity, severity).

Why did Code4rena implement a severity standardization process?

As mentioned earlier in this article, it’s crucial to have a severity standardization process in place to ensure consistency and impartiality. Before this existed, Judges were tasked with doing their best to make a fair judgement on a contest’s award allocation. At times, this made them question the tooling, the process, and whether or not they were correctly interpreting the spirit of the law.

Over the past year, Judges and the C4 team have collaborated in ongoing open discussions, creating an evolving meta. This meta is unique to C4 and enables both the organizations being audited and the auditors themselves to be part of a platform that is self-reflective and constantly iterating on its processes for their collective benefit.

The rules established as part of the severity standardization process act as a starting point, and these open discussions act as a growing set of case law examples. You can view the open forum where these discussions are held here.

Has severity standardization improved the quality of audit reports?

In short, yes. Being held to the same standards across every contest, regardless of scope, has systematically increased the assessment quality of findings across the board. Now, there can be confidence in the assigned severity of findings, knowing that a High is a High based on consistent reasoning, and so on. Combining Code4rena’s crowd-sourced approach, which enables as many experts as possible to contribute their unique knowledge, with standardized assessment gives organizations the best of both worlds: breadth and depth.

In addition to severity standardization, a pre-sort stage has also been added to the audit process. Those involved in pre-sorting review and organize all submissions for the contest with an eye to lightening and clarifying the Sponsor team’s workload. All unique High and Medium risk findings are de-duplicated and grouped for efficient review, and the highest quality submissions within each set of duplicates are highlighted to ensure the Sponsor is able to review the best articulation of the vulnerability and recommendations for mitigation. This reduces the burden on Sponsors by filtering out spam submissions from their contest and highlighting what’s really impactful. Time is saved, focus can be sharpened, and effort can be placed where it is most effective.

As a result of these developments, Wardens are rewarded more equitably, and Sponsors are empowered to address legitimate concerns according to their severity. We’ve seen a shift towards our style of severity standardization in the industry, signalling a general consensus that this is a move in the right direction.

As mentioned previously, discussions about severity standardization are conducted in the C4 repo.

Got questions? Reach out to the C4 team in our Discord.
