Risks vs Incidents: Cause Mismatch
How two AI-risk corpora describe the causes of harm, per subdomain — the risks catalogued from academic and policy sources versus the incidents logged in real-world reports. The chart surfaces where the two place different emphasis on who caused a harm and with what intent. Percentages use coded-only denominators, excluding “Not coded” records.
Systematic shift, risks → incidents
median gap across 15 reliable subdomains
Discrimination & Toxicity
1.2 Toxic content
76 coded risks · 90 coded incidents · 66% coded
1.3 Unequal performance
17 coded risks · 34 coded incidents
1.1 Discrimination
82 coded risks · 118 coded incidents
Privacy & Security
2.2 AI security vulnerabilities
111 coded risks · 21 coded incidents
2.1 Loss of privacy
77 coded risks · 88 coded incidents
Misinformation
3.1 False information
53 coded risks · 187 coded incidents
Malicious Actors & Misuse
4.1 Disinformation & influence
82 coded risks · 135 coded incidents
4.2 AI weapons & cyberattacks
80 coded risks · 13 coded incidents
4.3 AI fraud & scams
77 coded risks · 394 coded incidents
Human-Computer Interaction
5.1 Overreliance & unsafe use
60 coded risks · 36 coded incidents
Socioeconomic & Environmental
6.1 Power centralization
51 coded risks · 6 coded incidents
6.2 Inequality & unemployment
54 coded risks · 6 coded incidents
6.3 Devaluation of human creativity
31 coded risks · 5 coded incidents
AI System Safety, Failures & Limitations
7.3 Capability & robustness
123 coded risks · 302 coded incidents
7.4 Transparency & interpretability
41 coded risks · 5 coded incidents
Key Takeaways
1. Subdomain 1.2 (Toxic content) has the largest reliable mismatch (Unintentional +52.6pp in intent).
2. Incidents lean more toward "AI system" than the risk literature does in 80% of reliable subdomains (median gap: +28.8pp).
3. Incidents lean more toward "Intentional" than the risk literature does in 87% of reliable subdomains (median gap: +14.4pp).
4. The 9 subdomains with fewer than 5 incidents are confidence-weighted. Percentages use coded-only denominators (excluding "Not coded" records).
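As a rough illustration of how figures like these could be derived, the sketch below computes a coded-only share, a risks → incidents gap in percentage points, and a median gap across subdomains. The function names and all sample labels and gap values are hypothetical and invented for illustration; only the "Not coded" exclusion and the idea of a per-subdomain gap come from the text above, and the sketch assumes the gap is measured as incidents minus risks.

```python
from statistics import median

NOT_CODED = "Not coded"  # records with this label are left out of denominators


def coded_share(labels, category):
    """Percentage of `category` among coded records only (None if nothing is coded)."""
    coded = [label for label in labels if label != NOT_CODED]
    if not coded:
        return None
    return 100.0 * sum(1 for label in coded if label == category) / len(coded)


def gap_pp(risk_labels, incident_labels, category):
    """Incident share minus risk share, in percentage points (assumed direction)."""
    risk_share = coded_share(risk_labels, category)
    incident_share = coded_share(incident_labels, category)
    if risk_share is None or incident_share is None:
        return None
    return incident_share - risk_share


# Hypothetical intent labels for one subdomain (not real data).
risks = ["Intentional"] * 2 + ["Unintentional"] * 2 + [NOT_CODED] * 4
incidents = ["Unintentional"] * 3 + ["Intentional"] + [NOT_CODED]

risk_share = coded_share(risks, "Unintentional")      # 2 of 4 coded records
subdomain_gap = gap_pp(risks, incidents, "Unintentional")

# Hypothetical per-subdomain gaps, summarized by their median.
median_gap = median([25.0, -10.0, 5.0])

print(f"risk share: {risk_share:.1f}%")
print(f"gap: {subdomain_gap:+.1f}pp")
print(f"median gap: {median_gap:+.1f}pp")
```

Dropping "Not coded" records before dividing means the shares within a corpus sum to 100% over the coded categories, so a gap reflects a difference in emphasis rather than a difference in how many records were coded.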
Why the gaps aren’t “prediction errors.” The two corpora are sampled by opposite filters: the risk literature by research attention, the incident record by what gets publicly reported and submitted. They are also coded by different teams against different source material. A risk is an abstract claim while an incident is a specific event, so their distributions are not strictly commensurable. Timing is omitted because an incident is post-deployment by definition, which leaves nothing meaningful to compare. Treat this as an exploratory map of where research concern and the documented incident record diverge, not as a measure of forecast accuracy.