Web3 Has No Safe AI. DMind AI Just Quantified the Gap — and KDD 2026 Made It Official.
PR Newswire
SINGAPORE, May 31, 2026
One of the first peer-reviewed Web3 AI benchmarks tests 31 top models, including GPT-5, Claude, and Gemini, across 3,543 expert questions. The verdict: no system is ready for the field’s highest-stakes tasks.
SINGAPORE, May 31, 2026 /PRNewswire/ — Medical AI has MedQA. Financial AI has FinBen. Legal AI has LegalBench. Web3 — one of the most adversarial, financially consequential software environments in existence — had nothing. Today, that changes.
DMind AI, in collaboration with researchers from Zhejiang University and Nanyang Technological University (NTU), announces that its research paper “DMind Benchmark: Toward a Holistic Assessment of LLM Capabilities across the Web3 Domain” has been accepted at KDD 2026, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, widely regarded as one of the world’s most prestigious venues for AI and data science research. The paper will be presented in Jeju, Korea, August 9–13, 2026.
The Verdict: 31 Models Tested. None Ready for Web3.
DMind Benchmark evaluated 31 of the world’s leading AI systems — including GPT-5, Claude, Gemini, DeepSeek, and Qwen. The results are a clear warning for any organization deploying AI in Web3 today:
- Safety-critical domains are where AI fails most. Performance collapses in security vulnerability detection and token economics reasoning — exactly where AI failure translates into irreversible financial loss.
- No model is production-ready. Even top-performing systems reveal capability gaps unacceptable in a real-world Web3 audit or governance context.
- Reasoning cannot be faked. Adversarial fine-tuning on the full benchmark yielded gains of less than one point — confirming genuine multi-step reasoning cannot be replaced by memorization.
- A practical path forward exists. Pareto efficiency analysis identifies which models offer the best performance-per-cost ratio for organizations integrating AI into Web3 workflows today.
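The release does not describe how the Pareto efficiency analysis was carried out. As a minimal sketch of the general idea, assuming each model is summarized by one cost figure and one benchmark score, the Python below keeps only the models that no other model beats on both cost and score. The model names, costs, and scores are hypothetical placeholders for illustration, not DMind Benchmark results, and `ModelResult` and `pareto_frontier` are names introduced here.

```python
from typing import NamedTuple


class ModelResult(NamedTuple):
    name: str
    cost: float   # hypothetical cost per 1M tokens, in USD
    score: float  # hypothetical benchmark score, 0-100


def pareto_frontier(results):
    """Return the models that are not dominated by any other model.

    A model is dominated when another model is at least as cheap and
    scores at least as high, and is strictly better on one of the two.
    Exact duplicates (same cost and score) collapse to a single entry.
    """
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on equal cost, highest score first.
    for r in sorted(results, key=lambda r: (r.cost, -r.score)):
        # Keep a model only if it beats every cheaper (or equally cheap) model.
        if r.score > best_score:
            frontier.append(r)
            best_score = r.score
    return frontier


# Hypothetical placeholder data, not figures from the paper.
results = [
    ModelResult("model_a", 0.5, 62.0),
    ModelResult("model_b", 2.0, 71.0),
    ModelResult("model_c", 3.0, 68.0),   # costs more than model_b, scores lower
    ModelResult("model_d", 10.0, 78.0),
    ModelResult("model_e", 12.0, 77.0),  # costs more than model_d, scores lower
]

frontier = pareto_frontier(results)
print([r.name for r in frontier])  # ['model_a', 'model_b', 'model_d']
```

Under these assumptions, an organization would choose among the frontier models according to its budget, since every model off the frontier is matched or beaten on both cost and score by at least one model on it.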
“The verdict is clear: today’s AI models are not yet safe for unsupervised deployment in Web3’s most critical workflows. DMind Benchmark is the diagnostic tool the industry has been missing — and now, for the first time, we can measure the gap and close it.”
— DMind AI Research Team
Why This Matters: Billions at Stake in an Unforgiving Environment
Web3 is not like other software domains. Smart contracts are immutable once deployed. DeFi protocols manage billions of dollars in real assets. A single vulnerability can result, and repeatedly has resulted, in catastrophic, irreversible financial loss. Deploying unreliable AI in this environment is not a theoretical risk: it is measured in capital destroyed, protocols collapsed, and user trust shattered.
Yet until now, the AI industry had no credible way to answer a fundamental question: can current large language models actually be trusted in Web3 workflows?
“Web3 is an adversarial, high-stakes environment where a small reasoning error can translate into an exploitable contract or a failed protocol. We built DMind Benchmark because the field needed a rigorous, domain-grounded standard — not just a general knowledge test.”
— DMind AI Research Team
About DMind Benchmark: Built for the Real Web3 World
DMind Benchmark comprises 3,543 expert-curated questions spanning nine core Web3 domains, including Smart Contracts, DeFi, Security Vulnerabilities, Token Economics, and DAOs. Built by five domain specialists, each with over eight years of frontline blockchain experience, it draws from a provenance-tracked corpus of 6.1 GB of data across 39 authoritative sources.
Its contamination-aware design ensures models cannot cheat by memorizing answers. Adversarial fine-tuning experiments confirm that only genuine domain reasoning — not rote recall — produces high scores.
Academic Validation and Proven Traction
KDD 2026 acceptance elevates DMind Benchmark into a formally recognized scientific standard: the definitive reference point for any organization evaluating, developing, or deploying AI in Web3. Following its open-source release on Hugging Face in April 2025, the benchmark held the #1 trending position on Hugging Face for nearly a full week and had accumulated over 9,650 downloads by January 2026.
“KDD acceptance gives this work a level of academic validation that the Web3 AI field has been missing. As one of the first peer-reviewed Web3 AI benchmarks accepted at a top AI and data science venue, DMind Benchmark helps move the conversation beyond hype toward measurable capability, safety, and trust. It establishes a rigorous foundation for evaluating whether AI systems are truly ready for high-stakes decentralized environments.”
— Prof. Feida Zhu, Associate Dean of Partnership & Engagement, School of Computing and Information Systems, Singapore Management University
The dataset and full evaluation toolkit are publicly available: https://huggingface.co/datasets/DMindAI/DMind_Benchmark
Research Spotlight: Meet a Key Author
Enhao Huang is a 2022-intake undergraduate in Information Security at Zhejiang University and a direct-entry doctoral candidate at the National Key Laboratory of Blockchain and Data Security. His research focuses on the security of large language models and intelligent agents.
A researcher of exceptional early-career achievement, Huang has:
- Led a project funded by the National Natural Science Foundation of China Youth Student Special Program
- Had 10 papers published or accepted at top venues including KDD, WWW, S&P, and ICLR
- Served as an invited reviewer for NeurIPS, ACL, ICML, and other leading conferences
- Been named primary inventor on 8 granted or published invention patents
His contributions to the DMind Benchmark reflect the collaboration’s commitment to grounding AI safety research in world-class academic rigor.
Bridging Research and Reality: DMind AI and Minara
The same conviction behind DMind Benchmark — that Web3 deserves AI held to the highest standards — drives the strategic partnership between DMind AI and Minara, an AI assistant purpose-built for Web3 users.
General-purpose AI assistants lack the domain depth to reliably audit smart contracts, navigate DeFi protocol mechanics, or assess governance proposals. As DMind’s research makes clear, the consequences are not just suboptimal outputs — they are genuine security risks.
Together, DMind AI and Minara are working to translate rigorous academic findings into real-world tools that Web3 developers, security auditors, DeFi traders, protocol teams, and everyday users can rely on today. Where the benchmark defines the standard, the partnership works to meet it — and continuously raise the bar.
About DMind AI
DMind AI is a Singapore-based artificial intelligence company dedicated to building safe, reliable, and domain-specialized AI for the Web3 ecosystem. At the intersection of large language models, blockchain technology, and cryptoeconomic reasoning, DMind AI’s mission is to make AI trustworthy enough for the highest-stakes decentralized environments in the world.
Media Contact
DMind AI | Singapore
Website: https://dmind.ai
The DMind Benchmark paper is co-authored by researchers from DMind AI, Zhejiang University, and Nanyang Technological University. Full author list and paper details will be published in the KDD 2026 proceedings.
SOURCE DMind AI

