Prizes

There is a $30,000 prize pool split across prizes and special awards. Monetary prizes will be awarded to the top three teams on the leaderboard for each track, according to the values below. Final leaderboard ordering will be determined on the test-phase data using the primary metric for each track, with secondary metrics used for tie-breaking as specified on the Tracks page. Each team that wins first place in a track will also be invited to co-author a publication summarizing the competition results and to give a short talk at the competition workshop at NeurIPS 2023 (registration provided). Special awards will also be given for top submissions satisfying certain criteria.

Eligibility

To be eligible for prizes, winning teams are required to share their methods, code, and models with the organizers, as well as the names and affiliations of each team member.

Geographical Restrictions

We are unable to transfer prize money into China, India, Russia, Somalia, Iran, Cuba, Sudan, Syria, or North Korea. This is due either to U.S. rules limiting the transfer of money into these countries or to rules in these countries that make it difficult for U.S. nonprofits to send money into them (the latter is why India is on the list).

Important Details

  • Individuals living in the above countries are still allowed to participate in the competition and may receive non-monetary awards for winning submissions (co-authorship and an invitation to present at the workshop). We simply cannot transfer money into these countries.
  • If individuals with citizenship in these countries live and have bank accounts outside of these countries, then we can transfer money into those bank accounts.
  • If a team consists of individuals living in the above countries and individuals living in other countries, then we will still be able to transfer prize money to the team members who live and have bank accounts outside of the above countries. Team members may then distribute the prize money among themselves.

Trojan Detection Track

Large Model Subtrack

  • 🥇 1st place: $5,000
  • 🥈 2nd place: $2,500
  • 🥉 3rd place: $1,000

Base Model Subtrack

  • 🥇 1st place: $3,000
  • 🥈 2nd place: $1,000
  • 🥉 3rd place: $500

Red Teaming Track

Large Model Subtrack

  • 🥇 1st place: $5,000
  • 🥈 2nd place: $2,500
  • 🥉 3rd place: $1,000

Base Model Subtrack

  • 🥇 1st place: $3,000
  • 🥈 2nd place: $1,000
  • 🥉 3rd place: $500

Awards

💸 Most compute-efficient: $500

For the team in the final top ten whose method requires the least compute to generate a submission for the test phase*

*Awarded for each of the four subtracks. Requires sharing code and models with the organizers.

Best black-box method: $500

For the best-performing test-phase submission that requires only black-box access to the LLM in question*

*Awarded for each of the four subtracks. Requires sharing code and models with the organizers. Submissions can be considered for this award if the method used to generate the submission treats models as APIs that return logits or labels.
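As a rough illustration of what counts as black-box access (this interface is hypothetical, not part of the competition tooling), a qualifying method would interact with a model only through a query wrapper like the following sketch, which assumes a Hugging Face-style causal LM and tokenizer:

```python
# Hypothetical sketch, not official competition tooling. It assumes a
# Hugging Face-style causal LM and tokenizer; all names are illustrative.
import torch

class BlackBoxLLM:
    """Exposes a model through its outputs only: weights, gradients, and
    intermediate activations stay hidden from the attack method."""

    def __init__(self, model, tokenizer):
        self._model = model          # internals are off-limits to the method
        self._tokenizer = tokenizer

    def query_logits(self, prompt: str) -> torch.Tensor:
        """Return next-token logits for `prompt` (a permitted interface)."""
        inputs = self._tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            outputs = self._model(**inputs)
        return outputs.logits[0, -1, :]

    def query_label(self, prompt: str) -> str:
        """Return only the most likely next token (an even more
        restricted, label-only interface)."""
        token_id = int(self.query_logits(prompt).argmax())
        return self._tokenizer.decode([token_id])
```

Under this reading, a method that inspects the model's weights, gradients, or intermediate activations would not qualify as black-box.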