HKUST-WeBank Joint Laboratory Seminar on DeepSeek: Is DeepSeek R1 Reportedly Easier to Jailbreak? Meet SelfDefend, a Defense Framework Accepted at a Top Conference

3:00pm - 4:00pm
VooV Meeting
Event Format
Speakers / Performers:
Prof. Shuai WANG
Department of Computer Science and Engineering
Language
Mandarin
Recommended For
Faculty and staff
PG students
UG students
More Information

HKUST-WeBank Joint Laboratory Sharing Session: Is DeepSeek R1 Reportedly Easier to Jailbreak? Meet SelfDefend, a Defense Framework Accepted at a Top Conference

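While the world marvels at the breakthrough performance of large models such as DeepSeek, their safety defenses hide a crisis: tests by the University of Pennsylvania show that DeepSeek R1's defenses failed completely against 50 types of malicious attack prompts, with a jailbreak success rate of 100%. How can AI learn to protect itself while remaining powerful? To address this challenge, this talk presents SELFDEFEND, a general defense framework that for the first time brings the shadow stack concept from traditional security into the attack-and-defense arena of large models. Its core innovation is a dual-model cooperation mechanism: a shadow LLM acts as a real-time security checkpoint that runs in parallel with the target model and intercepts harmful queries through dynamic intent analysis, providing a millisecond-level safety circuit breaker on the question-and-answer path.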
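The dual-model idea described above can be illustrated with a minimal sketch. This is not the SelfDefend implementation: the function names `target_model`, `shadow_flags_harm`, and `guarded_answer` are hypothetical, the target model is a stub, and a toy keyword match stands in for the shadow LLM's intent analysis. It only shows the control flow of running a checker in parallel with the target and releasing the answer when the checker raises no alarm.

```python
from concurrent.futures import ThreadPoolExecutor

REFUSAL = "Sorry, I cannot help with that request."


def target_model(query: str) -> str:
    # Hypothetical stand-in for the target LLM; a real system would call a model here.
    return f"[answer to: {query}]"


def shadow_flags_harm(query: str) -> bool:
    # Hypothetical stand-in for the shadow LLM's intent check.
    # A toy keyword match replaces model-based analysis (assumption for illustration).
    blocked_phrases = ("build a bomb", "steal credentials")
    lowered = query.lower()
    return any(phrase in lowered for phrase in blocked_phrases)


def guarded_answer(query: str) -> str:
    # Submit both tasks at once so the safety check runs alongside generation.
    with ThreadPoolExecutor(max_workers=2) as pool:
        answer_future = pool.submit(target_model, query)
        harm_future = pool.submit(shadow_flags_harm, query)
        # The answer is returned only if the shadow check does not flag the query.
        if harm_future.result():
            return REFUSAL
        return answer_future.result()
```

For example, `guarded_answer("What is a shadow stack?")` returns the stub answer, while a query containing one of the blocked phrases returns the refusal text. In a real deployment the keyword check would be replaced by a call to a second model.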

 

Register for the online seminar

Organizer
Office of Knowledge Transfer
Registration
Contact

For any enquiries, please contact Mr. Ryan Dong (ryandong@Webank.com) or Mr. Joshua Leung (joshua.leung@ust.hk).
