Please click the dropdown above to display the final leaderboards. All leaderboard metrics are percentages, and higher is better. The Combined Score metric determines the rankings in the Trojan Detection Track, and the Combined Score (Manual Evaluation) metric determines the rankings in the Red Teaming Track. To view the validation phase leaderboards, please see the CodaLab pages.
Note: The Test Phase leaderboards on the Red Teaming Track CodaLab pages show rankings determined by the automated Combined Score metric. That automated metric was used to select the top ten teams for manual evaluation, which in turn determined the final rankings. The official final rankings are shown on this page and are determined by the Combined Score (Manual Evaluation) metric.
The winning participants and teams in each track are shown below (team names are in parentheses).
Trojan Detection Track - Base Model Subtrack
Trojan Detection Track - Large Model Subtrack
Red Teaming Track - Base Model Subtrack
Red Teaming Track - Large Model Subtrack
The first-place teams in each track gave talks at the competition workshop describing their methods. A recording of the workshop, including these talks, will be available soon.
💸 Most Compute-Efficient
⬛ Best Black-Box Method