Large Language Models (LLMs) have rapidly become essential tools in various sectors, from content generation to decision support. However, their increasing utility and complexity bring forth a slew of security and ethical concerns. The intrinsic nature of LLMs, built upon vast data sources, means they can inadvertently disclose sensitive details or execute unintended actions based on ambiguous prompts. As organizations increasingly adopt LLMs into their workflows, understanding these risks and deploying measures to mitigate them is paramount. This overview delves into the primary risks associated with LLMs and offers insights into safeguarding their use
LLM01: Prompt Injections
Vulnerability Overview:
Prompt Injection Vulnerability is an exploit where attackers manipulate a Large Language Model (LLM) to execute unintended actions, either directly through “jailbreaking” or indirectly via external malicious inputs.
Potential Risks:
Manipulated LLMs can mimic harmful personas, leak data, misuse plugins, and trick users without triggering safety alerts.
Examples of Vulnerability:
- Ignoring the application’s original prompt, revealing confidential data.
- Summarising a website with malicious injections, leading the LLM to request sensitive data.
Prevention Strategies:
- Set strict privileges for LLM backend access.
- Include human approval for certain LLM functions.
- Separate and label untrusted content to reduce its influence.
Example Attack Scenarios:
- Injecting prompts into a chatbot for unauthorised access.
- Malicious injection causing an LLM plugin to delete user emails.
LLM02: Insecure Output Handling
Vulnerability Overview:
Insecure Output Handling is a scenario where a system or application fails to scrutinise outputs generated by a Large Language Model (LLM).
Potential Risks:
This vulnerability can lead to issues like Cross-Site Scripting (XSS) and privilege escalation.
Common Examples of Vulnerability:
- Passing LLM output directly into system functions, risking remote code execution.
- LLM-generated JavaScript or Markdown being interpreted by browsers, causing XSS.
Prevention Strategies:
- View the model as a typical user, implementing stringent input validation.
- Follow OWASP ASVS guidelines for input validation and sanitisation.
- Encode LLM outputs back to users to prevent unintended code execution.
Example Attack Scenarios:
- LLM plugin used for chatbot responses influencing a system command for unauthorised access.
- Website summariser tool, powered by an LLM, manipulated to extract and send sensitive data.