In this short blog post I will show how I have found a way to “attack” Large Language Model with the YouTube video – this attack is called “indirect prompt injection”.
Recently I’ve found LeMUR by AssemblyAI – someone posted it on Twitter and I’ve decided that it may be an interesting target to test for Prompt Injections.
When talking about prompt injections, we distinguish two types – first type is direct prompt injection, in which PI payload is placed in the application by the attacker and the second type is indirect prompt injection, in which the PI payload is carried using third party medium – image, content of the website that is scrapped by the model or audio file.
First of all, I’ve started with generic Prompt Injection that is known from “traditional” LLMs – I just told the model to ignore all of the previous instructions and follow my instruction:
After it turned out that the model follows my instructions, I’ve decided that it would be interesting to check if it will follow instructions directly from the video. I’ve recorded a test video with Prompt Injection payloads:
Unfortunately, I still have had to send instructions explicitly in the form that I’ve controlled:
When I’ve numbered the paragraphs, it turned out that I am able to control the processing of the transcript from the video/transcript level (in this case, the paragraph 4 redirected to paragraph 2 with the prompt injection payload in it, what caused the model to reply simply with “Lol”):
That was the vid:
I tricked the Summary feature to say what I wanted with the same vid:
Instead of summarizing the text, the model just says “Lol”. This specific bug may be used by individuals that don’t want their content to be processed by the automated LLM-based solutions – I don’t judge if it’s a bug, or a feature, neither do I say that LeMUR is insecure (because it’s rather secure) – I just wanted to showcase this interesting case of indirect prompt injection.
If you want to know more about LLM and AI security, subscribe my newsletter: https://hackstery.com/newsletter/