attheoaks.com

Understanding YAML Deserialization Vulnerabilities and Mitigation

Chapter 1: Introduction to YAML Deserialization

Deserialization attacks are increasingly prevalent in programming languages like Java, Python, and Ruby. These vulnerabilities arise when data streams are deserialized without adequate validation, potentially allowing the execution of remote code. In this article, we will explore a specific deserialization technique within the context of YAML.

Before we delve into YAML deserialization, let's clarify the concepts of serialization and deserialization.

Section 1.1: What is Serialization?

Consider an online game where your character has various attributes such as username, avatar, clothing, rank, and weapons. How are these attributes communicated and stored on the server? The answer lies in serialization.

Defining Serialization

Serialization is the process of converting an object into a byte stream or a flat structure. This byte stream, often referred to as a simplified version of the object, can be transmitted over networks or saved in files, databases, and more.

Deserialization, the counterpart to serialization, is the process of transforming these byte streams back into their original object form.
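
As a minimal sketch of this round trip, the game character described above can be serialized to bytes and restored with Python's standard json module. The field names and values are illustrative assumptions, not taken from any real game.

```python
import json

# Hypothetical character attributes, used only for illustration.
character = {
    "username": "player_one",
    "rank": 7,
    "weapons": ["sword", "bow"],
}

# Serialization: object -> flat byte stream that can be stored or sent.
serialized = json.dumps(character).encode("utf-8")

# Deserialization: byte stream -> equivalent object.
restored = json.loads(serialized.decode("utf-8"))

print(serialized)
print(restored == character)
```

JSON can only express plain data such as strings, numbers, lists, and maps, which is why this round trip carries no code-execution risk by itself.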

Potential Vulnerabilities

The vulnerability arises when a web server accepts a serialized value without validating it before deserializing. If users can manipulate the serialized value, it could lead to unexpected behaviors during deserialization.

Section 1.2: What is YAML?

YAML, a recursive acronym for "YAML Ain't Markup Language" (originally "Yet Another Markup Language"), is defined by Wikipedia as "a human-readable data-serialization language." It is commonly used for configuration files and for storing or transmitting data. YAML uses Python-style indentation for nesting and also offers a compact flow style that uses [] for lists and {} for maps.

YAML is a data format rather than a programming language, and its syntax is flexible: the same data can be written in indented block style or in compact flow style.

Example of YAML Serialization

Consider the following un-serialized data:

{
    'name': 'Manish',
    'age': 12,
    'skills': ['programming', 'soft skills']
}

Upon serialization, it would appear as:

name: Manish
age: 12
skills:
  - programming
  - soft skills
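
The same conversion can be sketched with PyYAML, assuming the library is installed (version 5.1 or later for the sort_keys argument). This is an illustration of the round trip, not the only way to produce the output above.

```python
import yaml  # PyYAML

data = {
    "name": "Manish",
    "age": 12,
    "skills": ["programming", "soft skills"],
}

# sort_keys=False keeps the insertion order of the dictionary keys.
serialized = yaml.safe_dump(data, sort_keys=False)
print(serialized)

# safe_load() only builds plain types such as dict, list, str, and int.
restored = yaml.safe_load(serialized)
print(restored == data)
```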

Understanding Vulnerabilities in YAML

While this example is not inherently vulnerable, issues arise when the loader is allowed to construct arbitrary Python objects, and therefore call arbitrary functions, from tags embedded in the input. Libraries like PyYAML and ruamel.yaml are commonly used but can be prone to deserialization attacks if insecure methods are employed.

Serialization Methods

Common serialization methods in PyYAML include:

  • dump()
  • dump_all()
  • safe_dump()
  • safe_dump_all()

To deserialize, PyYAML provides methods such as:

  • load()
  • load_all()
  • full_load()
  • full_load_all()
  • safe_load()
  • safe_load_all()
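
The difference between the plain and safe_ variants can be seen on the serialization side. The following is a small sketch, assuming PyYAML is installed; the Point class is hypothetical and exists only for illustration.

```python
import yaml  # PyYAML


class Point:
    """Hypothetical custom class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# dump() can represent arbitrary Python objects using python-specific tags.
tagged = yaml.dump(Point(1, 2))
print(tagged)

# safe_dump() refuses anything beyond standard YAML types.
try:
    yaml.safe_dump(Point(1, 2))
    safe_dump_rejected = False
except yaml.YAMLError:
    # PyYAML raises a RepresenterError, a subclass of YAMLError.
    safe_dump_rejected = True
print(safe_dump_rejected)
```

The python-specific tag written by dump() is exactly the kind of input that an unsafe loader later acts on.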

Chapter 2: Creating Payloads

To illustrate payload creation, we use the __reduce__() method, which works with both PyYAML and ruamel.yaml. The example runs ls, so it assumes the backend is a Unix-based system.

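import yaml
import subprocess

class Payload(object):
    # __reduce__ tells the serializer how to rebuild this object:
    # call subprocess.Popen with the argument 'ls'.
    def __reduce__(self):
        return (subprocess.Popen, ('ls',))

# yaml.dump() serializes the object, so this holds the serialized payload.
serialized_data = yaml.dump(Payload())
print(serialized_data)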

The serialized output will be:

!!python/object/apply:subprocess.Popen
- ls

Payload Explanation

In this example, we create a payload using the __reduce__() method and the subprocess module to spawn a new process. __reduce__() returns a callable and its arguments; here the callable is the Popen class, which starts the command ls to list the files in the current directory.
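
The mechanism can be shown without running any command. The sketch below uses a hypothetical Harmless class and calls the pieces by hand; it is a conceptual illustration of what an unsafe loader effectively does, not PyYAML's actual implementation.

```python
class Harmless:
    """Hypothetical class whose __reduce__ names a harmless callable."""

    def __reduce__(self):
        # Same shape as the payload above: (callable, argument tuple).
        return (str.upper, ("hello",))


# What a serializer records: the callable and its arguments.
func, args = Harmless().__reduce__()

# What an unsafe loader effectively does when it rebuilds the object.
result = func(*args)
print(result)  # HELLO
```

Replacing str.upper with subprocess.Popen is all it takes to turn this from a curiosity into command execution.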

When this payload is passed to a vulnerable web application, the unsafe loader does not merely deserialize it; it calls subprocess.Popen and executes the command.

deserialized_data = yaml.load(data)

The command executes successfully, revealing the contents of the current directory.

Remediation Strategies

This vulnerability was assigned CVE-2017-18342 and was addressed in PyYAML version 5.1. Make sure you are running PyYAML 5.1 or later (6.0 was the current version at the time of writing), and use safe_load() for any untrusted input.

Whether the payload above still executes depends on the PyYAML version and on the loader the application selects. Patched versions still ship unsafe loaders, but the code must opt in to them explicitly.

Note: To reproduce the behavior with a bare load(data) call, use PyYAML version < 5.1. On recent versions such as 6.0, the above payload will fail unless you pass Loader=yaml.Loader (or Loader=yaml.UnsafeLoader) to load(), or call yaml.unsafe_load(data), which takes no Loader argument.
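
For untrusted input, safe_load() is the safer choice. As a small sketch, assuming PyYAML is installed, the payload from Chapter 2 is refused before any process is started:

```python
import yaml  # PyYAML

payload = "!!python/object/apply:subprocess.Popen\n- ls\n"

try:
    yaml.safe_load(payload)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load() has no constructor for python-specific tags,
    # so the payload is refused and no process is started.
    rejected = True
    print(type(exc).__name__)

print(rejected)
```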

Conclusion

Deserialization vulnerabilities can lead to severe repercussions, including remote code execution or privilege escalation. Therefore, it is crucial to sanitize and validate any incoming data before deserializing it.
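
Validation can also continue after a safe load, by checking that the resulting structure has the expected shape. The sketch below is one possible approach; validate_profile and its expected fields are hypothetical and reuse the example data from Section 1.2.

```python
def validate_profile(data):
    """Hypothetical validator: accept only the expected keys and types."""
    expected = {"name": str, "age": int, "skills": list}
    if not isinstance(data, dict) or set(data) != set(expected):
        return False
    if not all(isinstance(data[key], kind) for key, kind in expected.items()):
        return False
    return all(isinstance(skill, str) for skill in data["skills"])


good = {"name": "Manish", "age": 12, "skills": ["programming"]}
bad = {"name": "Manish", "age": "twelve", "skills": ["programming"]}
print(validate_profile(good), validate_profile(bad))
```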

