YAML and Jinja in Ansible: Gotchas and what you need to know

Published 2025-04-17

Draft Alert

This is only mostly-finished. Expect some rewrites.

These technologies all seem simple on the surface, and easy enough to learn through examples. But once we start using them in more complex ways, it's important to make sure our understanding is deep and formal.

Please note: much of the information here might seem redundant or obvious, but this post is meant for a wide audience, and seeks to emphasize some info that is usually taken for granted or widely misunderstood! So please don't feel condescended to if you've been linked here.

Some info also might be oversimplified-- this document is meant to help you learn Enough To Get By, not to be an expert.

What is YAML?

YAML is a:

Standardized
Serialization Language.

Let's review what I mean by that.

It's Standardized

There is a formal specification for it. It's not just some slapdash assortment of various libraries for working with something that looks YAML-ish.¹

But you don't have to read the whole thing! The wondrous Learn X in Y minutes where X = YAML is good enough!

There is also the wonderful YAML Multiline Strings site, which will be important later.

It's a Serialization Language

It is a way to encode almost any data structure you can think of, but we're basically using it as an easier-to-write-and-read JSON².

Essentially, you can model any data structure using a combination of these data types which YAML supports (not exhaustive):

booleans (true, false)
numbers (1, 2.0)
strings ("hello world")
arrays / lists
mappings / objects
null
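
As a rough sketch, here is a single document that uses each of the types above once. The key names are made up for illustration.

```yaml
# Hypothetical keys, one per data type listed above
enabled: true             # boolean
retries: 3                # number (integer)
timeout_seconds: 2.0      # number (float)
motd: "hello world"       # string
packages:                 # array / list
  - git
  - curl
owner:                    # mapping / object
  name: Foo
  age: 100
nickname: null            # null
```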

Many programming languages support more than just the above-- they might have formal classes / structs, etc.

YAML allows you to say, e.g. "this mapping is actually a structure called $BLANK" through a feature called "tags". You probably won't see this often, but it'll be important later!

Here's a Python example, followed by YAML that tags a mapping as that class:

class Person:
    name: str
    age: int

maintainer: !person
  name: Yaml McYamlface
  age: 21 # yaml 1.0 was January 2004!

YAML rules to know about

The Three Kinds of Strings

One of YAML's major strengths over JSON is that you only need to quote strings if they include certain special characters, or if they would otherwise be read as another type (such as true or 42). The values of the following mapping are all the same:

plain-style: Hello world!
single-quotes: 'Hello world!'
double-quotes: "Hello world!"
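
For contrast, here is a sketch of a few values where the quotes do matter. The keys are hypothetical, and the comments assume a YAML 1.1 parser such as PyYAML, which is what Ansible uses.

```yaml
# Hypothetical keys; each comment says what the unquoted form would do
version: "1.10"         # unquoted 1.10 is the float 1.1
country: "no"           # unquoted no is the boolean false in YAML 1.1
note: "a: b"            # unquoted ": " would start a nested mapping (an error here)
label: "tabs # spaces"  # unquoted " #" would start a comment
escaped: "col1\tcol2"   # double quotes process escapes: a real tab
verbatim: 'col1\tcol2'  # single quotes do not: a backslash, then t
```

The last two lines are the practical difference between the two quoting styles: reach for single quotes when you want backslashes left alone.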

Protip

YAML Multiline Info will be helpful for this next part.

What if you have a really really long string, that would make the document hundreds of characters wide? You can make a string go over multiple lines, but then you have to do the classic "escape the newline itself" bit, which kinda looks ugly.

multiple-lines: "This string goes \
  over multiple lines."

If we want the markdown-like behavior of "a newline is a space, and two newlines is a newline", YAML has us covered:

text-block: >
  This will be a
  single line of text, but...

  This is a new line.
# but it keeps the trailing newline, so text-block
# ends with 'new line.\n'

# we can "chomp / strip" the end with `-`:
no-trailing: >-
  Look ma, no newline!
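
The folded style above has a sibling, the literal style, which uses `|` instead of `>` and keeps every line break as written. A minimal sketch with hypothetical keys:

```yaml
# Literal style: every line break inside the block is kept
script: |
  echo "first line"
  echo "second line"
# value is 'echo "first line"\necho "second line"\n'

# Literal style with the final newline stripped
script-no-trailing: |-
  echo "only line"
# value is 'echo "only line"'
```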

Inline Mappings and Lists

They exist! These are valid:

inline-mapping: {name: Foo, age: 100}
inline-list: [1, 2, 3]

Keep this in mind for later.
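
To make the equivalence concrete, here are the same two values written in the usual block style. The `block-` keys are hypothetical names for this sketch.

```yaml
# Inline (flow) style, as above
inline-mapping: {name: Foo, age: 100}
inline-list: [1, 2, 3]

# Block style; parses to the same values
block-mapping:
  name: Foo
  age: 100
block-list:
  - 1
  - 2
  - 3
```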

What is Jinja?

Jinja is a templating language. It also has a handy Learn X in Y Minutes where X = Jinja page.

The short version of what you need to know is that expressions inside {{ }} get evaluated. If you have a variable food set to tacos, then running Let's eat {{ food }} tonight through jinja will get you Let's eat tacos tonight.

How does Ansible use YAML and Jinja?

Ansible resources (playbooks, tasks files, vars files) are written in the YAML format.
They can reference variables using Jinja.
Here's a simple version of what that looks like:

- name: An example playbook
  hosts: localhost
  vars:
    whomst: world
  tasks:
    - name: Say hello
      ansible.builtin.debug:
        msg: Hello {{ whomst }}!

Essentially, this happens in two phases:

Ansible parses the YAML to get the structure of the playbook.
When it runs tasks, it will evaluate each argument, potentially evaluating jinja expressions.

Remember!

These are separate steps.
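
One way to see the separation for yourself is the sketch below. The variable name is deliberately undefined and everything in the play is hypothetical. The file is valid YAML, so the first phase succeeds, and because the task is skipped, its Jinja expression should never be evaluated.

```yaml
- name: Two phases, sketched
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Skipped, so its arguments are never templated
      ansible.builtin.debug:
        # To the YAML parser this is only a string
        msg: "{{ this_variable_is_not_defined }}"
      when: false
```

A YAML syntax error anywhere in the file, by contrast, stops the playbook from loading before any task runs.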

A skirmish of syntaxes

What if we want to template the greeting too?

- name: An example playbook
  hosts: localhost
  vars:
    greeting: Hola
    whomst: world
  tasks:
    - name: Say hello
      ansible.builtin.debug:
        msg: {{ greeting }} {{ whomst }}!

That last line is an issue. How can YAML parse this bit?

msg: {{ greeting }} {{ whomst }}!
     ^ well there's your problem.

YAML sees a { and gets ready for an inline mapping, and what follows doesn't make sense as one. You might ask: why doesn't it just look ahead one character to see it's a Jinja expression? The answer is that YAML doesn't know anything about Jinja! That's Ansible-specific.

What if ansible added special support for that case? Well, then it wouldn't be YAML anymore. Any and all tooling that expects YAML would have to be changed. We already said we're using something standardized, not "something that happens to look like YAML".

Ansible is nice and will tell you what the error is:

The offending line appears to be:

      ansible.builtin.debug:
        msg: {{ greeting }} {{ whomst }}!
                            ^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:

    with_items:
      - {{ foo }}

Should be written as:

    with_items:
      - "{{ foo }}"

Now the YAML parser knows it's a string, and Ansible will still happily evaluate the jinja expression when calling the debug module.
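
Applied to the greeting playbook from earlier, the fix could look like the sketch below. The second task is an alternative spelling added for illustration: a block scalar is also unambiguously a string, so it needs no quotes.

```yaml
- name: An example playbook
  hosts: localhost
  vars:
    greeting: Hola
    whomst: world
  tasks:
    - name: Say hello
      ansible.builtin.debug:
        # Quoted, so YAML reads a string and leaves the braces for Jinja
        msg: "{{ greeting }} {{ whomst }}!"

    - name: Say hello with a block scalar
      ansible.builtin.debug:
        # Folded style with the trailing newline stripped
        msg: >-
          {{ greeting }} {{ whomst }}!
```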

When and how does Jinja get evaluated?

Ansible will evaluate Jinja lazily and recursively. Take a look at these two vars files and the playbook that loads them:

# vars1.yaml
static1: static1 value
ref2: Referencing {{ static2 }}

# vars2.yaml
static2: static2 value
ref1: Referencing {{ static1 }}

- name: An example playbook
  hosts: localhost
  vars_files:
    - vars1.yaml
    - vars2.yaml
  tasks:
    - ansible.builtin.debug:
        msg: "{{ ref1 }}"
    - ansible.builtin.debug:
        msg: "{{ ref2 }}"

Not only do you have variables referencing other variables in other files, but variables referencing variables get used as variables! ref1 -> static1 -> static1 value works, in other words.
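
The "lazily" part can be sketched like this, assuming the usual variable precedence where set_fact wins over play vars. All names are hypothetical. The variable message is not rendered once and stored; it is rendered each time it is used, so the two debug tasks should print different text.

```yaml
- name: Lazy evaluation, sketched
  hosts: localhost
  gather_facts: false
  vars:
    target: staging
    message: "Deploying to {{ target }}"
  tasks:
    - name: Rendered while target is still staging
      ansible.builtin.debug:
        msg: "{{ message }}"

    - name: Override target for the rest of the play
      ansible.builtin.set_fact:
        target: production

    - name: Rendered again, now with target set to production
      ansible.builtin.debug:
        msg: "{{ message }}"
```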

Where the intersection gets messy

Where "messy" means hard to think about at first.

A review of two modules

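Ansible's ansible.builtin.template module can render arbitrary files containing Jinja expressions and put the output into a file. It's commonly used for config files. Here is a template, config.j2, and a playbook that renders it: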

{{ user }} = very cool

- name: An example playbook
  hosts: all
  vars:
    user: You
  tasks:
    - name: Template some config file
      ansible.builtin.template:
        src: config.j2
        dest: /etc/user-coolness
        owner: root
        group: wheel
        mode: 'u=rw,g=rw,o=r'

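With ansible.builtin.include_vars, you can include vars files dynamically, instead of just using vars at the playbook level. Here are two vars files, one per environment, and a playbook that picks one: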

env_name: Development

env_name: Production

- name: An example playbook
  hosts: "{{ deploy_env }}"
  tasks:
    - name: Include variables for environment
      ansible.builtin.include_vars:
        file: "{{ deploy_env }}.yaml"
    - name: Print environment name
      ansible.builtin.debug:
        msg: "Deploying into {{ env_name }}"

Now we're ready to tackle the confusing bit.

When is YAML not YAML?

Answer: when you don't treat it as such. A YAML file is text, and can just be treated as text. ansible.builtin.template doesn't care what syntax its input file uses. Essentially, all it looks for is {{ or {% (and their matching closers), and then it handles whatever is in between. It doesn't care if the rest happens to be valid YAML or Shakespeare; it just processes the Jinja and spits the result out.

Remember how we said this wasn't valid yaml?

howdy: {{ greeting }} {{ whomst }}!

template doesn't care!

- name: Template a file, don't care what format
  hosts: localhost
  vars:
    greeting: Hola
    whomst: world
  tasks:
    - name: Template
      ansible.builtin.template:
        src: template.yaml.j2
        dest: out.yaml

Running this will output a completely valid yaml file:

howdy: Hola world!

Please don't do this if you can avoid it.

Hot Take Alert

The legacy of Unix has given the tech world the belief that "slamming bits of text together is fine".

This is most painfully felt when templating whitespace-sensitive languages.

As cool as Helm is, when templating yaml, the template is evaluated before the YAML is parsed. You have to control the whitespace yourself to make sure any templated maps come out correctly. And if you mess it up, you might even get incorrect data instead of an error! Sure hope that PersistentVolumeClaim wasn't important.

If the structure of your data is important, then you should be using tools that are aware of that structure.
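
For the small example above, a structure-aware version might look like the following sketch. It builds the data as an ordinary variable and lets the to_nice_yaml filter do the serializing, so indentation and quoting are the serializer's problem rather than yours. The variable name output is hypothetical.

```yaml
- name: Write YAML from data, not from text
  hosts: localhost
  gather_facts: false
  vars:
    greeting: Hola
    whomst: world
    # The structure lives in a variable; Jinja only fills in a value
    output:
      howdy: "{{ greeting }} {{ whomst }}!"
  tasks:
    - name: Serialize the mapping to a file
      ansible.builtin.copy:
        content: "{{ output | to_nice_yaml }}"
        dest: out.yaml
```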

Who templates the templatemen?

Okay, but what if you want to template a vars file? Maybe you have an automated process that creates a vars file per resource in some other system. A perfectly cromulent use case, maybe.

This is a contrived example, but let's pretend it's a structure like vars/users/username.yaml. It'll contain fields for GECOS data and the like. But we want it to reference certain standard groups, no matter the user.

Here's what a finished, rendered user record would look like:

full_name: Kevin M Granger
employee_id: 12345
office_phone: "1-555-867-5309"
groups: '{{ standard_groups + other_groups }}'

We might use a template like this, saved as vars/users.skel.yaml:

full_name: "{{ full_name }}"
employee_id: "{{ employee_id }}"
office_phone: "{{ office_phone }}"
groups: !unsafe "{{ standard_groups + other_groups }}"

!unsafe is a tag that tells ansible that the value should not be evaluated by jinja. You'll see what that results in below. This happens at the Ansible level. It doesn't change how YAML parses it, but the tag is passed from the YAML parser to Ansible itself.

Consume it like so:

- name: Onboard a user
  hosts: localhost
  vars_prompt:
    - name: username
      prompt: Username?
      private: false
    - name: full_name
      prompt: Full name?
      private: false
    - name: employee_id
      prompt: Employee ID?
      private: false
    - name: office_phone
      prompt: Office phone number?
      private: false
  tasks:
    - name: Fill out user profile
      ansible.builtin.include_vars:
        file: vars/users.skel.yaml
        name: new_user

    - name: Save user profile
      ansible.builtin.copy:
        content: "{{ new_user | to_nice_yaml(sort_keys=false) }}"
        dest: vars/users/{{ username }}.yaml

Then inputting the expected values will give you:

full_name: Kevin M Granger
employee_id: '12345'
office_phone: '1-555-867-5309'
groups: '{{ standard_groups + other_groups }}'

As you can see-- waitaminute.

employee_id: '12345'

That's a string, not a number. Jinja used to only return strings from its templates. It eventually gained support for keeping native types in a template's output, but that's a backwards-incompatible change, so Ansible leaves it disabled by default. You can turn it on with the jinja2_native setting in ansible.cfg or through the ANSIBLE_JINJA2_NATIVE environment variable.

Anyways, do you see what happened to groups? !unsafe told Ansible that the data was plain ol' YAML data, nothing more and nothing less.

That also means when it gets spat back out into a file, it's serialized as the string it is-- meaning if it's read in again, it'll be evaluated as Jinja!

In this case, that's exactly what we want. But keep that in mind.
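
If you want to watch !unsafe in isolation, a small sketch like this one puts a normal string and an unsafe string side by side. The variable names are hypothetical. The first task should print the rendered greeting, and the second should print the braces exactly as written.

```yaml
- name: Compare a templated string with an unsafe one
  hosts: localhost
  gather_facts: false
  vars:
    whomst: world
    templated: "Hello {{ whomst }}!"
    literal: !unsafe "Hello {{ whomst }}!"
  tasks:
    - name: Jinja inside the value is evaluated
      ansible.builtin.debug:
        msg: "{{ templated }}"

    - name: Jinja inside the value is left alone
      ansible.builtin.debug:
        msg: "{{ literal }}"
```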

Lessons Learned

You should learn YAML; don't just guess how it works.
Ansible parses YAML, and then evaluates Jinja when called for.
I'm sure there was something else I was going for here, but I'm tired. TODO.

¹ Buggy implementations aside 🙂

² It's technically not a superset of JSON, as many claim. Still, it's unlikely you'll find that out the hard way.