
Understanding YAML Deserialization Vulnerabilities and Mitigation


Chapter 1: Introduction to YAML Deserialization

Deserialization attacks are increasingly prevalent in programming languages like Java, Python, and Ruby. These vulnerabilities arise when data streams are deserialized without adequate validation, potentially allowing the execution of remote code. In this article, we will explore a specific deserialization technique within the context of YAML.

Before we delve into YAML deserialization, let's clarify the concepts of serialization and deserialization.

Section 1.1: What is Serialization?

Consider an online game where your character has various attributes such as username, avatar, clothing, rank, and weapons. How are these attributes communicated and stored on the server? The answer lies in serialization.

Defining Serialization

Serialization is the process of converting an in-memory object into a byte stream or another flat representation. That flat representation can be transmitted over networks or saved in files, databases, and more.

Deserialization, the counterpart to serialization, is the process of transforming these byte streams back into their original object form.
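As a rough illustration of this round trip, the sketch below serializes a game character to bytes and restores it. It uses Python's standard json module purely for demonstration, and the field values are made up, based on the attributes listed earlier.

```python
import json

# Hypothetical character record; the values are invented for illustration.
character = {
    "username": "player_one",
    "avatar": "knight",
    "rank": 7,
    "weapons": ["sword", "bow"],
}

# Serialization: object -> flat byte stream that can be sent or stored.
payload = json.dumps(character).encode("utf-8")

# Deserialization: byte stream -> an equivalent object.
restored = json.loads(payload.decode("utf-8"))

print(type(payload).__name__)
print(restored == character)
```

The same idea applies to YAML; only the text format of the flat representation changes.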

Potential Vulnerabilities

The vulnerability arises when a web server accepts a serialized value without validating it before deserializing. If users can manipulate the serialized value, it could lead to unexpected behaviors during deserialization.

Section 1.2: What is YAML?

YAML originally stood for "Yet Another Markup Language" and is now officially the recursive acronym "YAML Ain't Markup Language." Wikipedia defines it as "a human-readable data-serialization language." It is commonly used for configuration files and wherever data needs to be stored or transmitted. YAML uses Python-style indentation for nesting and also offers a compact flow syntax with [] for lists and {} for maps.

Unlike a programming language, YAML only describes data, and its syntax is flexible: the same structure can be written in indented block style or in the compact flow style.

Example of YAML Serialization

Consider the following un-serialized data:

{
    'name': 'Manish',
    'age': 12,
    'skills': ['programming', 'soft skills']
}

Upon serialization, it would appear as:

name: Manish
age: 12
skills:
  - programming
  - soft skills
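The serialization step above can be reproduced with PyYAML. This is a minimal sketch that assumes PyYAML 5.1 or later is installed (the sort_keys argument is not available in older releases). PyYAML's default output does not indent list items under their key, which is equivalent YAML.

```python
import yaml  # PyYAML, a third-party package

data = {
    "name": "Manish",
    "age": 12,
    "skills": ["programming", "soft skills"],
}

# sort_keys=False keeps the keys in insertion order instead of sorting them.
serialized = yaml.safe_dump(data, sort_keys=False)
print(serialized)

# Loading the text back yields an equal dictionary.
restored = yaml.safe_load(serialized)
print(restored == data)
```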

Understanding Vulnerabilities in YAML

While this example is not inherently vulnerable, issues arise when a loader does more than rebuild plain data: YAML tags can instruct the library to construct arbitrary objects or call functions during deserialization. Libraries like PyYAML and ruamel.yaml are commonly used but can be prone to deserialization attacks if insecure methods are employed.

Serialization Methods

In PyYAML, the serialization methods include:

  • dump()
  • dump_all()
  • safe_dump()
  • safe_dump_all()

To deserialize, PyYAML provides methods such as:

  • load()
  • load_all()
  • full_load()
  • full_load_all()
  • safe_load()
  • safe_load_all()
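The practical difference between these loaders shows up when the input carries a Python-specific tag. The sketch below, assuming PyYAML is installed, feeds safe_load() both plain data and a tagged document. The tag refers to the harmless built-in len so that nothing dangerous is involved.

```python
import yaml

plain = "name: Manish\nage: 12\n"
tagged = "!!python/object/apply:builtins.len\n- abc\n"

# Plain data loads fine with the safe loader.
parsed = yaml.safe_load(plain)
print(parsed)

# A Python-specific tag is refused by the safe loader.
try:
    yaml.safe_load(tagged)
    rejected = False
except yaml.constructor.ConstructorError as exc:
    rejected = True
    print("rejected:", exc.problem)
```

Because the safe loader only knows standard YAML tags, it raises an error instead of calling anything.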

Chapter 2: Creating Payloads

To illustrate the creation of payloads, we utilize the __reduce__() method, which functions with both PyYAML and ruamel.yaml, assuming the backend operates on a Unix-based system.

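import subprocess

import yaml


class Payload(object):
    # __reduce__ tells the serializer how to rebuild this object:
    # call subprocess.Popen with the argument 'ls'.
    def __reduce__(self):
        return (subprocess.Popen, ('ls',))


# yaml.dump() serializes the object; it does not run the command.
serialized_data = yaml.dump(Payload())
print(serialized_data)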

The serialized output will be:

!!python/object/apply:subprocess.Popen
- ls

Payload Explanation

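In this example, we create a payload using the __reduce__() method and the subprocess module. __reduce__() returns a callable together with its arguments, and PyYAML records them under the python/object/apply tag. Here the callable is subprocess.Popen, which spawns a new process running ls to list the files in the current directory.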

When this payload reaches a vulnerable web application, deserializing it also executes the command:

# "data" holds the attacker-supplied YAML; this call is unsafe on PyYAML < 5.1
deserialized_data = yaml.load(data)

The command executes successfully, revealing the contents of the current directory.
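To observe the same mechanism without running a shell command, the sketch below swaps subprocess.Popen for the harmless built-in len. The class name BenignPayload is made up for this illustration, and the code assumes PyYAML 5.1 or later, where unsafe_load() is available.

```python
import yaml


class BenignPayload:
    # __reduce__ returns (callable, args); a harmless built-in
    # stands in for subprocess.Popen here.
    def __reduce__(self):
        return (len, ("abc",))


serialized = yaml.dump(BenignPayload())
print(serialized)

# unsafe_load() calls len("abc") while constructing the object, so the
# result is the return value of that call, not a BenignPayload instance.
result = yaml.unsafe_load(serialized)
print(result)
```

The loader never sees the original class; it only sees a tag naming a callable and a list of arguments, which is why an attacker can name any importable function.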

Remediation Strategies

This vulnerability was assigned CVE-2017-18342 and was addressed in PyYAML version 5.1, where load() stopped using the unsafe loader by default. Make sure your PyYAML version is 5.1 or later (6.0 at the time of writing), and prefer safe_load() for any untrusted input.
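One way to check the installed version from code is sketched below. The helper version_tuple is a hypothetical name introduced here, and its parsing is deliberately simple: it keeps only the leading digits of each dot-separated part, so pre-release suffixes are ignored.

```python
import re

import yaml


def version_tuple(version):
    """Turn a version string such as '6.0.1' into a tuple like (6, 0, 1)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


installed = version_tuple(yaml.__version__)
print(yaml.__version__, "is 5.1 or later:", installed >= (5, 1))
```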

Whether the payload above still works depends on the PyYAML version in use. Unsafe loading remains possible on newer versions, but only when the code explicitly opts into it.

Note: To reproduce the attack with a plain load(data) call, use PyYAML version < 5.1. On newer versions, load() is expected to fail with the above payload unless you pass an unsafe loader explicitly, as in load(data, Loader=yaml.Loader), or call unsafe_load(data), which takes no Loader argument.

Conclusion

Deserialization vulnerabilities can lead to severe repercussions, including remote code execution or privilege escalation. Therefore, it is crucial to sanitize and validate any incoming data before deserializing it.
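A minimal sketch of that advice, assuming PyYAML is installed: parse with safe_load() and then check the result against the shape the application expects. The load_profile function and the EXPECTED_TYPES schema are hypothetical names used only for this illustration.

```python
import yaml

# Hypothetical schema: the keys the application expects and their types.
EXPECTED_TYPES = {"name": str, "age": int, "skills": list}


def load_profile(text):
    """Parse untrusted YAML safely, then validate its structure."""
    data = yaml.safe_load(text)  # refuses Python-specific tags
    if not isinstance(data, dict):
        raise ValueError("expected a mapping at the top level")
    unexpected = set(data) - set(EXPECTED_TYPES)
    if unexpected:
        raise ValueError(f"unexpected keys: {unexpected!r}")
    for key, expected in EXPECTED_TYPES.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"{key!r} must be of type {expected.__name__}")
    return data


profile = load_profile("name: Manish\nage: 12\nskills: [programming]\n")
print(profile)
```

A real application would likely use a stricter schema, but the order matters: choose a safe loader first, then validate what it returns.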
