Artificial Intelligence Security

This post discusses some aspects of artificial intelligence regarding security.

AI Chatbot Security Concerns

Summary of AI Chatbot Security Concerns

Upload sensitive information
Malicious use of the application
Misinformation

Upload Sensitive Information

Users or employees may upload sensitive information to the website.

Malicious use of the Application

This aspect is not exclusive to ChatGPT. In fact, any tool that could be used be used for malicious intents (e.g., an e-mail account) presents a risk.

The main concern is that a tool with so much potential as ChatGPT means both potential benefits and potential misuse.

https://hbr.org/2023/04/the-new-risks-chatgpt-poses-to-cybersecurity

Misinformation

Users could be misinformated by chatbots, in many aspects: technical, political, ethical, etc. This could be deliberate or because of errors in the chatbot.

Take into account that this risk exists on any other tool, like media, newspapers, social networks, books, etc.

AI Attacks

AI Attacks:

Prompt injection
Jailbreak
Meta prompt leakage

Prompt injection (PI) is skipping customization.

Cross-domain prompt injection

Jailbreak is skipping the purpose of he model.

Meta prompt leakage means that the information about prompt is extracted by the user.

Extraction prompt injection attack (XPIA) implies modifying an external source that is used by the model to exploit the model in an unexpected way.

AI Security Frameworks

Microsoft Responsible AI

Google Responsible AI

Google Responsible AI official website

AI Security Controls

Control layers in GenAI:

Model
Safety System
Application
Positioning

Guardrails, also known as prompt seal or IA firewall is a security measure for IA prompts to filter the output given to users.

NeMo is an open source guardrail developed by NVidia.

A fundamental rights impact assessment (FRIA) is a systematic examination of potential risks that an AI system might pose to fundamental rights.

AI Risk

The system cards are published in documents by publishers like OpenAI..

The Artificial Intelligence Risk Management Framework (AI RMF) is published by NIST.

NIST AI RMF official website

MIT AI risk repository

MIT AI risk repository official website

MITRE ATLAS Matrix by MITRE.

MITRE ATLAS Matrix official website

AI Compliance

You can read this post about artificial intelligence in Spain.