Welcome to the 7th release of the Real Threats of Artificial Intelligence Newsletter.
Below you’ll find some interesting links – if you are an offensive security practitioner, take a look at the Kaggle/AI Village DEFCON Capture The Flag competition, where you can test your AI hacking skills (it’s still running for the next 2 weeks). I’d also recommend the talk “AI’s Underbelly: The Zero-Day Goldmine” by Dan McInerney from ProtectAI. This talk inspired me to write this post: https://hackstery.com/2023/10/13/no-one-is-prefect-is-your-mlops-infrastructure-leaking-secrets/
I’ve also started cataloging AI Security / AI Red Teaming job offers – check the “Jobs” section if you are considering stepping into the AI Security industry.
If you find this newsletter useful, I’d be grateful if you’d share it with your tech circles, thanks in advance! What is more, if you are a blogger, researcher or founder in the area of AI Security/AI Safety/MLSecOps etc. feel free to send me your work and I will repost it in this newsletter 🙂
Source: Bing Image Creator
LLM Security
New release of OWASP Top 10 for LLM
A new version of the OWASP Top 10 for LLM has been released, with more examples, improved readability and more. They also added this diagram that highlights how the vulnerabilities intersect with the application flow:
Link: website of the project: https://owasp.org/www-project-top-10-for-large-language-model-applications/
Steve Wilson’s post on LinkedIn: https://www.linkedin.com/pulse/new-release-owasp-top-10-llm-apps-steve-wilson
17 chars LLM jailbreak by @AIPanic
This guy is a wizard of prompts. Usually, “Do Anything Now” prompts are long and complicated. @AIPanic proves that just a few chars are enough to get the model to return harmful content.
Killer Replika chatbot
In 2021, a man broke into Windsor Castle with a crossbow. Later, he told the police that the Replika chatbot had told him to assassinate the Queen of England. Recently, he was sentenced.
Link: https://www.theregister.com/2023/10/06/ai_chatbot_kill_queen/
AI-based coding assistants may leak API keys
GitHub Copilot and Amazon CodeWhisperer can be coaxed to emit hardcoded credentials that these AI models captured during training, though not all that often.
Link: https://www.theregister.com/2023/09/19/github_copilot_amazon_api/
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
The authors demonstrate an automated method of generating semantically meaningful jailbreaks.
Link: https://arxiv.org/abs/2310.04451
Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
Link: https://arxiv.org/abs/2310.06387
GPT-4 is too smart to be safe: stealthy chat with LLMs via cipher
This promising paper (currently under review) presents an approach for jailbreaking LLMs through the use of ciphers, e.g. the Caesar cipher.
Link: https://openreview.net/pdf?id=MbfAK4s61A
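For readers who haven't met the Caesar cipher mentioned above, here is a minimal Python sketch of the cipher itself. It only illustrates the encoding and is not the paper's method or prompts; the function name `caesar_shift` and the shift of 3 are my own illustrative choices.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave other characters unchanged."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            # Rotate within a-z, wrapping around with modulo 26.
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            # Rotate within A-Z, wrapping around with modulo 26.
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Digits, spaces and punctuation pass through untouched.
            result.append(ch)
    return "".join(result)


# Hypothetical example: encode with shift 3, decode with the opposite shift.
encoded = caesar_shift("Hello, world!", 3)
decoded = caesar_shift(encoded, -3)
print(encoded)
print(decoded)
```

Decoding is just the same function with the negated shift, which is why such simple ciphers are easy for both people and models to reverse.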
Chatbot hallucinations are poisoning the web search (possible paywall)
A short story on how chatbot hallucinations poisoned the GPT-powered Bing Chat.
Link: https://www.wired.com/story/fast-forward-chatbot-hallucinations-are-poisoning-web-search/
4chan users manipulate AI tools to unleash torrent of racist images
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models (by Microsoft Research)
The paper is from July, but it was reposted on the Microsoft website a few days ago. Its taxonomy of LLM-related risks can be a good starting point for threat modeling LLMs:
Links: https://techcrunch.com/2023/10/17/microsoft-affiliated-research-finds-flaws-in-gtp-4/,
https://www.microsoft.com/en-us/research/blog/decodingtrust-a-comprehensive-assessment-of-trustworthiness-in-gpt-models/,
https://github.com/AI-secure/adversarial-glue
AI Security
AI Security Has Serious Terminology Issues
What is the difference between AI Security, AI Safety, AI Red Teaming and AI Application Security? In this blog post, Joseph Thacker proposes boundaries for each of these terms to make them more precise.
Link: https://josephthacker.com/ai/2023/10/16/ai-security-terminology-issues.html
AI Village CTF
Better late than never – this CTF ends on 9th of November – you can still give it a try and check your AI hacking skills!
Link: https://www.kaggle.com/competitions/ai-village-capture-the-flag-defcon31/
AI’s Underbelly: The Zero-Day Goldmine
Inspiring talk on MLOps/AIOps tools security by Dan McInerney:
Link: https://www.youtube.com/watch?v=e3ybnXjtpIc
Six steps for AI security
Post by Nvidia.
Source: Nvidia
Link: https://blogs.nvidia.com/blog/2023/09/25/ai-security-steps/
AI/LLM as a tool for cybersecurity
Compliance.sh
This AI-supported tool makes it easier to get compliant with ISO 27001, SOC 2 Type II, HIPAA, GDPR and more:
Link: https://compliance.sh/
Check for AI
This is a pretty convenient tool for detection of AI-generated text:
Link: https://www.checkfor.ai/
AI safety
To be honest, I usually concentrate more on AI Security and only occasionally follow what’s going on in the world of AI Safety. These resources look super cool – just check those designs!
Map of AI Existential Safety
This map collects a whole set of resources related to AI Safety:
Link: https://aisafety.world/
Neuronpedia
In this game, you help crowdsource explanations for the neurons inside neural networks:
Link: https://www.neuronpedia.org/
Frontier Model Forum will fund AI safety research
Frontier Model Forum announced that it’ll pledge $10 million toward a new fund to advance research on tools for “testing and evaluating the most capable AI models.”
Link: https://techcrunch.com/2023/10/25/ai-titans-throw-a-tiny-bone-to-ai-safety-researchers
Jobs
Senior Security Engineer – GenAI @ Amazon
Link: https://www.amazon.jobs/en/jobs/2444074/senior-security-engineer-genai-amazon-stores
Offensive Security Engineer – AI Red Team @ Microsoft
Link: https://jobs.careers.microsoft.com/us/en/job/1633942/Offensive-Security-Engineer-II–AI-Red-Team
Senior Security Researcher (AI Security) @ Microsoft
Link: https://jobs.careers.microsoft.com/us/en/job/1583887/Senior-Security-Researcher-���-AI-Security
AI Security Lead @ Bytedance
Link: https://jobs.bytedance.com/en/position/7270039018820536632/detail
AI Security Lead @ TikTok
Link: https://careers.tiktok.com/position/7232214286985103671/detail
Senior ML Security Engineer @ Snowflake
Link: https://careers.snowflake.com/us/en/job/SNCOUS6944604002EXTERNALENUS/Senior-ML-Security-Engineer
Software Dev Engineer II, AI Security @ Amazon
Link: https://www.amazon.jobs/en/jobs/2462392/software-dev-engineer-ii-ai-security
Technical Program Manager, Security @ Anthropic
Link: https://jobs.lever.co/Anthropic/580d8f10-24c6-46a7-9d44-0116e95e568b
Other AI-related things
Killer drones used in Ukraine
If these reports are true, the first war drones that work without human supervision are being deployed on the battlefield in Ukraine against Russian forces.
Advent of Code prohibits the usage of LLMs
Link: https://adventofcode.com/about#ai_leaderboard
If you want more papers and articles
In-Context Unlearning: Language Models as Few Shot Unlearners, Pawelczyk et al.
Link: https://arxiv.org/pdf/2310.07579.pdf
Composite Backdoor Attacks Against Large Language Models, Huang et al.
Link: https://arxiv.org/pdf/2310.07676.pdf
Low-Resource Languages Jailbreak GPT-4, Yong et al.