Here comes the fourth release of my newsletter. This time I have included a lot of content related to the DEFCON AI Village (I have tagged content that comes from there) – a bit late, but better later than never. Anyway, enjoy reading.

Also, if you find this newsletter useful, I’d be grateful if you’d share it with your tech circles, thanks in advance!

Any feedback on this newsletter is welcome – you can mail me or post a comment in this article.

AI Security

Model Confusion – Weaponizing ML models for red teams and bounty hunters [AI Village]

This is an excellent read about ML supply chain security by Adrian Wood. One of the most insightful resources on the ML supply chain that I’ve seen. Totally worth reading!

Link: https://5stars217.github.io/2023-08-08-red-teaming-with-ml-models/

Assessing the Vulnerabilities of the Open-Source Artificial Intelligence (AI) Landscape: A Large-Scale Analysis of the Hugging Face Platform [AI Village]

Researchers have performed automated analysis of 110 000 models from Hugging Face and have found almost 6 million vulnerabilities in the code.

Links: slides: https://aivillage.org/assets/AIVDC31/DSAIL%20DEFCON%20AI%20Village.pdf paper: https://www.researchgate.net/publication/372761501_Assessing_the_Vulnerabilities_of_the_Open-Source_Artificial_Intelligence_AI_Landscape_A_Large-Scale_Analysis_of_the_Hugging_Face_Platform

Podcast on MLSecOps [60 min]

Ian Swanson (CEO of Protect AI) & Emilio Escobar (CISO of Datadog) are talking about ML & AI Security, MLSecOps, Supply Chain Security and LLMs:

Link: https://shomik.substack.com/p/17-ian-swanson-ceo-of-protect-ai

Regulations

LLM Legal Risk Management, and Use Case Development Strategies to Minimize Risk [AI Village]

Well, I am not a lawyer. But I do know a few lawyers who read this newsletter, so maybe you will find these slides on the legal aspects of LLM risk management interesting 🙂

Link: https://aivillage.org/assets/AIVDC31/Defcon%20Presentation_2.pdf

Canadian Guardrails for Generative AI

Canadians have created a document with a set of guardrails for developers and operators of Generative AI systems.

Link: https://ised-isde.canada.ca/site/ised/en/consultation-development-canadian-code-practice-generative-artificial-intelligence-systems/canadian-guardrails-generative-ai-code-practice

LLM Security

LLMSecurity.net – A Database of LLM-security Related Resources

Website by Leon Derczynski (LI: https://www.linkedin.com/in/leon-derczynski/ ) that catalogs various papers, articles and news regarding

Link: https://llmsecurity.net/

LLMs Hacker’s Handbook

This thing was on the Internet for a while, but for some reason I’ve never seen it. LLM Hacker’s Handbook with some useful techniques of Prompt Injection and proposed defenses.

Link: https://doublespeak.chat/#/handbook

AI/LLM as a tool for cybersecurity

ChatGPT for security teams [AI Village]

Some ChatGPT tips & tricks (including jailbreaks) from GTKlondike (https://twitter.com/GTKlondike/)

Link: https://github.com/NetsecExplained/chatgpt-your-red-team-ally

Bonus: https://twitter.com/GTKlondike/status/1697087125840376216

AI in general

Initially, this newsletter was meant to be exclusively related to security, but in the last two weeks I’ve stumbled upon a few decent resources on LLMs and AI and I want to share them with you!

This post by Stephen Wolfram on how does ChatGPT (and LLMs in general) work:
https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

Update of GPT 3.5 – fine-tuning is now available through OpenAI API:

https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates

This post by Chip Huyen on how does RLHF work: https://huyenchip.com/2023/05/02/rlhf.html and this one from Huggingface: https://huggingface.co/blog/rlhf

Some loose links

In this section you’ll find some links to recent AI security and LLM security papers that I didn’t manage to read. If you still want to read more on AI topics, try these articles.

“Does Physical Adversarial Example Really Matter to Autonomous Driving? Towards System-Level Effect of Adversarial Object Evasion Attack”

Link: https://arxiv.org/pdf/2308.11894.pdf

“RatGPT: Turning online LLMs into Proxies for Malware Attacks”

Link: https://arxiv.org/pdf/2308.09183.pdf

“PENTESTGPT: An LLM-empowered Automatic Penetration Testing Tool”

Link: https://arxiv.org/pdf/2308.06782.pdf

“DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection”

Link: https://arxiv.org/pdf/2308.06932.pdf

“Devising and Detecting Phishing: large language models vs. Smaller Human Models”

Link: https://arxiv.org/pdf/2308.12287.pdf