Dec 17, 2024

From Zero to Automation: How I Used ChatGPT to Create My First Ansible Playbook

2024-12-17 22:36 +0100

Introduction

I recently decided to automate the startup and shutdown of my lab environments—both standard and nested labs. While the idea sounded simple, it quickly turned into an interesting challenge. Having never written an Ansible Playbook before, I turned to ChatGPT for help.

Why ChatGPT?

Let’s be honest: starting with Ansible can feel overwhelming, especially if you’re new to it. My last experience with something remotely similar was years ago, working with PowerShell scripts or even earlier with .NET 3 (yes, I’m “that old”).

The task itself seemed straightforward at first:

Write a playbook to power VMs on and off in a controlled manner.
Integrate both my standard lab and nested lab (e.g., my VCF setup with its own vCenter).

However, the challenge revealed itself quickly:

Controlling VMs via my main vCenter is relatively easy.
But what about nested labs where each nested setup has its own vCenter?

This is where ChatGPT became a game changer.

The Approach

Starting from Zero

I described my setup and goals to ChatGPT:

Automate VM startup/shutdown.
Handle dependencies like nested vCenters that control their own VMs.

ChatGPT provided a clear starting point, explaining how to structure an Ansible playbook. Step by step, it introduced me to tasks, loops, and the required VMware modules.

Iterating Through Challenges

The major challenge was managing nested environments:

Powering on the parent vCenter first.
Waiting until it’s responsive.
Then triggering the startup sequence for the nested VMs managed by that vCenter.

Through multiple iterations, ChatGPT helped refine the logic.

Not Always Smooth Sailing

To be honest, ChatGPT’s suggestions weren’t always perfect. More than once, I found myself in a dead end. I had to point out repeatedly that the same solution, presented for the third time, simply didn’t work. This is the reality of working with AI: it doesn’t replace expertise, but it certainly accelerates the process.

While ChatGPT couldn’t solve everything on its own, it significantly simplified finding the right solution. Instead of starting from scratch or digging through documentation for hours, I could focus on testing and refining the playbook.

Current Progress: What I Achieved in Two Evenings

After a couple of evenings, with a few hours of experimenting and iterating with ChatGPT, I managed to create four modular Ansible playbooks. These playbooks are designed to handle two key scenarios for starting and stopping VMs:

Two Playbooks for Environments with vCenter

These playbooks are for my standard (non-nested) lab environments, where I can rely on vCenter to manage the VMs. With vCenter in place, controlling VMs is relatively straightforward, as vCenter provides a central interface to handle power states.

Two Playbooks for Environments without vCenter

These playbooks handle environments where no vCenter is available, such as nested labs or standalone ESXi hosts. In nested labs, the challenge arises because VMs and their dependencies are controlled individually, without the convenience of a central management interface.

By separating the logic into modular playbooks, I ensured flexibility and reusability across my different lab setups. Whether I’m dealing with my regular homelab VMs or complex nested environments like my VCF setup, I can now efficiently start and stop VMs with a single command.

Inventory Files: The Backbone of the Setup

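To make the playbooks flexible and reusable, I created a YAML file for each lab. I call them inventory files here, although the playbooks load them as plain variable files with include_vars rather than as an Ansible inventory. Out of habit, I named them something like vcfvm_vars.yml or vcfesx_vars.yml. These files act as the variable storage for each lab environment.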

There are two types of inventory files:

For Nested VMs:

Includes variables specific to nested lab setups, such as nested vCenter credentials, VM names, and their dependencies.

For Non-Nested VMs:

Stores details for standard VMs managed directly via the main vCenter.

Nested VCF Example: Controlled Boot and Shutdown

In my VCF setup, which is fully nested, the playbook must follow a strict sequence:

Startup:

Start the nested ESXi hosts first.
Wait for their availability.
Then start the nested management VMs, such as NSX Manager, SDDC Manager, and vCenter.

Shutdown:

Stop the management VMs first.
Once the management layer is powered down, shut down the nested ESXi hosts.

This controlled sequence ensures the nested environment behaves predictably.

Inventory File for ESXi Hosts

# vcfesx_vars.yml
vcenter_hostname: "vcsa.lab.home"
vcenter_username: "administrator@vsphere.local"
vcenter_password: "your_pw"
vcenter_datacenter: "Homelab"
validate_certs: false

vm_names:
  - "sfo01-m01-esx01"
  - "sfo01-m01-esx02"
  - "sfo01-m01-esx03"

Inventory File for Nested VMs

# vcfvm_vars.yml
validate_certs: false
esxi_hosts:
  - "sfo01-m01-esx01.lab.home"
  - "sfo01-m01-esx02.lab.home"
  - "sfo01-m01-esx03.lab.home"
esxi_username: "root"
esxi_password: "your_pw"
esxi_datacenter: "sfo-m01-dc01"
vm_names:
  - "vcfvcsa"
  - "vcfnsx01a"
  - "vcf01"
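
Both files store their passwords in clear text. One way to avoid that is to let the vars file look the value up at run time. The sketch below assumes the password has been exported in an environment variable before the playbook runs; the name VCENTER_PASSWORD is hypothetical.

```yaml
# vcfesx_vars.yml (sketch): same keys as above, password read from the environment
vcenter_hostname: "vcsa.lab.home"
vcenter_username: "administrator@vsphere.local"
# VCENTER_PASSWORD is a hypothetical variable name; export it before running ansible-playbook.
vcenter_password: "{{ lookup('ansible.builtin.env', 'VCENTER_PASSWORD') }}"
vcenter_datacenter: "Homelab"
validate_certs: false
```

Alternatively, the whole file can be encrypted with ansible-vault encrypt. include_vars decrypts vaulted files transparently as long as the vault password is supplied when the playbook runs.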

Power-On Playbook for Non-Nested VMs

---
- name: Start specific VMs in vCenter
  hosts: localhost
  gather_facts: no
  collections:
    - community.vmware

  tasks:
    - name: Load variables from file
      include_vars: "{{ vars_file }}"
    - name: Connect to vCenter and start VMs
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item }}"
        state: powered-on
      loop: "{{ vm_names }}"
      register: power_state_result

    - name: Display power state result
      debug:
        msg: "VM {{ item.item }} was started successfully."
      when: item.instance.hw_power_status == "poweredOn"
      loop: "{{ power_state_result.results }}"
      loop_control:
        label: "{{ item.item }}"

include_vars: Loads the variable file passed in as vars_file, such as vcfesx_vars.yml, which makes the playbook modular and reusable.
community.vmware.vmware_guest_powerstate: Controls the power state of VMs in a vCenter-managed environment.
The state: powered-on option ensures VMs are powered on.
register: power_state_result: Captures the result of the task execution for each VM, including its power state.
debug with when: Checks the power state of each VM and displays a success message if the VM was successfully powered on.
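
Because the playbook depends on vars_file being supplied from outside (by a master playbook or with -e on the command line), a small guard task can make a missing value obvious. This is a minimal sketch, assuming it is placed as the first task, before include_vars; the wording of the failure message is only a suggestion.

```yaml
    - name: Fail early if no vars file was passed in
      ansible.builtin.assert:
        that:
          - vars_file is defined
          - vars_file | length > 0
        fail_msg: "No vars_file given, e.g. run with -e vars_file=vcfesx_vars.yml"
```

With the guard in place, a standalone run would look like ansible-playbook poweron_vcsa.yml -e vars_file=vcfesx_vars.yml, and a run without the extra variable stops with the message above instead of a templating error.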

Power-On Playbook for Nested VMs on Multiple ESXi Hosts

---
- name: Power on multiple VMs on multiple ESXi hosts
  hosts: localhost
  gather_facts: no
  collections:
    - community.vmware

  tasks:
    - name: Load variables from file
      include_vars: "{{ vars_file }}"

    - name: Get VM power status for each VM on each ESXi host
      community.vmware.vmware_guest_info:
        hostname: "{{ item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.1 }}"
      with_nested:
        - "{{ esxi_hosts }}"
        - "{{ vm_names }}"
      register: vm_info_results
      ignore_errors: true

    - name: Filter VMs that are poweredOff
      set_fact:
        # Drop failed lookups first (VM not on that host), then keep only powered-off VMs.
        powered_off_vms: "{{ vm_info_results.results | selectattr('failed', 'equalto', false)
                           | selectattr('instance.hw_power_status', 'equalto', 'poweredOff') | list }}"

    - name: Power on VMs if they are poweredOff
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ item.item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item.1 }}"
        state: powered-on
      loop: "{{ powered_off_vms }}"
      loop_control:
        label: "Host: {{ item.item.0 }} | VM: {{ item.item.1 }}"
      register: poweron_results

    - name: Wait for VMs to be powered on
      community.vmware.vmware_guest_info:
        hostname: "{{ item.item.item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item.item.1 }}"
      loop: "{{ poweron_results.results }}"
      loop_control:
        label: "Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}"
      register: vm_status
      until: vm_status.instance.hw_power_status == "poweredOn"
      retries: 20
      delay: 15
      when: item.failed == false

    - name: Display power on result
      debug:
        msg: "VM {{ item.item.item.1 }} on Host {{ item.item.item.0 }} has been successfully powered on."
      loop: "{{ poweron_results.results }}"
      loop_control:
        label: "Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}"

vmware_guest_info: Retrieves the power state of each VM on each ESXi host.
set_fact: Keeps only those VMs that are powered off.
vmware_guest_powerstate: Powers on each VM that is in a “poweredOff” state.
vmware_guest_info with until, retries, and delay: Polls each VM until it reports poweredOn before proceeding.
debug: Displays a confirmation message for each VM that was powered on.
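
The triple item.item.item in the wait and debug tasks comes from looping over registered loop results twice. As a rough sketch, one element of poweron_results.results looks like this. It is abridged to the keys the playbook uses, real results carry many more fields, and the host and VM names are simply taken from the vars file above.

```yaml
# One element of poweron_results.results (abridged, illustrative only)
- changed: true
  failed: false
  item:                       # the powered_off_vms entry this iteration ran on
    failed: false
    instance:
      hw_power_status: poweredOff
    item:                     # the original [host, vm] pair from with_nested
      - sfo01-m01-esx01.lab.home
      - vcfvcsa
```

Inside a loop over poweron_results.results, item.item.item.0 is therefore the ESXi host and item.item.item.1 is the VM name.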

Master Playbook

The master playbook orchestrates the two Power-On playbooks in the correct order. Between them, a 60-second pause gives the nested ESXi hosts time to initialize.

Why a pause? Without an active feedback mechanism to confirm that the ESXi servers are ready, this static wait acts as a temporary workaround and will be replaced later.

---
- name: Power on Nested ESXi Hosts
  import_playbook: poweron_vcsa.yml
  vars:
    vars_file: "vcfesx_vars.yml"
  # Executes the playbook to power on the nested ESXi hosts.
  # Variables specific to ESXi servers are loaded from "vcfesx_vars.yml".

- name: Wait for 60 seconds before powering on nested VMs
  hosts: localhost
  gather_facts: no
  tasks:
    - name: Pause for 60 seconds
      pause:
        seconds: 60
      # A static wait time to ensure ESXi hosts are ready.
      # This will be improved in the future with dynamic checks.

- name: Power on Nested Management VMs
  import_playbook: poweron_esx.yml
  vars:
    vars_file: "vcfvm_vars.yml"
  # Executes the playbook to power on nested VMs like NSX Manager, SDDC Manager, and vCenter.
  # Variables specific to management VMs are loaded from "vcfvm_vars.yml".
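
As a possible replacement for the static pause, the middle play could poll each nested ESXi host until its HTTPS port accepts connections. This is only a sketch. It assumes the host names in esxi_hosts (from vcfvm_vars.yml) resolve from the Ansible server, and the timing values are guesses. An open port 443 also does not prove that every ESXi service has finished starting.

```yaml
- name: Wait for nested ESXi hosts to answer on HTTPS
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Load the ESXi host names of the nested lab
      ansible.builtin.include_vars:
        file: vcfvm_vars.yml

    - name: Wait until TCP port 443 is reachable on each host
      ansible.builtin.wait_for:
        host: "{{ item }}"
        port: 443
        delay: 10       # seconds before the first check
        sleep: 5        # seconds between checks
        timeout: 600    # give up after 10 minutes per host
      loop: "{{ esxi_hosts }}"
```

If a host never comes up, the task fails after the timeout and the master playbook stops before it tries to start the management VMs.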

Starting the VMs via the Ansible Master Playbook

Starting my VCF nested lab has never been easier. With the Ansible Master Playbook, it’s as simple as running a single command on my Ansible server:

ansible-playbook mp_poweron_vcf.yml

Within approximately 5-10 minutes (depending on the overall load on my lab), the entire VCF environment is up and ready to use—without any further manual intervention.

The beauty of this setup lies in its flexibility:

New labs can be easily added by simply creating a new inventory file and a customized master playbook. The core logic remains untouched, making it a scalable and modular solution for automating additional environments. This approach not only saves time but also ensures consistency when starting up complex nested labs like my VCF setup.

Ansible Log (screenshot of the Ansible output)

The log output of my Ansible playbook contains “failed” messages during the task “Get VM power status for each VM on each ESXi host”. These failures occur because each ESXi host is queried for specific VMs (like vcf01) that may not exist on that particular host. This is both normal and expected behavior.

Why? Due to DRS (Distributed Resource Scheduler), I can never be certain which nested ESXi host a particular VM was last running on. By iterating through all ESXi hosts, the playbook ensures that the power status of every VM is eventually retrieved, regardless of where it was previously located.

Shutdown Playbook: Graceful Power-Off of VMs

The shutdown process follows the same principles as the power-on playbook but in reverse order. Instead of starting VMs, it ensures a graceful shutdown while verifying their power state. I won’t describe every task in detail, but here’s a quick overview:

Logic Similar to Power-On:

VMs are iterated across multiple ESXi hosts.
Only VMs that are currently powered on are gracefully shut down.

Graceful Shutdown with Validation:

VMs are shut down using shutdown-guest to trigger the guest OS shutdown process.
A retry loop with retries: 20 and delay: 15 ensures that the playbook actively checks until the VMs reach the poweredOff state.

Harmless Errors Handled:

As with the power-on playbook, the ignore_errors: true directive handles expected failures gracefully (e.g., querying for VMs on ESXi hosts where they are not located).

Shutdown Nested VMs

- name: Graceful shutdown of multiple VMs on multiple ESXi hosts
  hosts: localhost
  gather_facts: no
  collections:
    - community.vmware

  tasks:
    - name: Load variables from file
      include_vars: "{{ vars_file }}"
      
    - name: Get VM power status for each VM on each ESXi host
      community.vmware.vmware_guest_info:
        hostname: "{{ item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.1 }}"
      with_nested:
        - "{{ esxi_hosts }}"
        - "{{ vm_names }}"
      register: vm_info_results
      ignore_errors: true

    - name: Filter VMs that are poweredOn
      set_fact:
        # Drop failed lookups first (VM not on that host), then keep only powered-on VMs.
        powered_on_vms: "{{ vm_info_results.results | selectattr('failed', 'equalto', false)
                           | selectattr('instance.hw_power_status', 'equalto', 'poweredOn') | list }}"

    - name: Shut down VMs if they are poweredOn
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ item.item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item.1 }}"
        state: shutdown-guest
        force: false
      loop: "{{ powered_on_vms }}"
      loop_control:
        label: "Host: {{ item.item.0 }} | VM: {{ item.item.1 }}"
      register: shutdown_results

    - name: Wait for VMs to be powered off
      community.vmware.vmware_guest_info:
        hostname: "{{ item.item.item.0 }}"
        username: "{{ esxi_username }}"
        password: "{{ esxi_password }}"
        datacenter: "{{ esxi_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item.item.1 }}"
      loop: "{{ shutdown_results.results }}"
      loop_control:
        label: "Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}"
      register: vm_status
      until: vm_status.instance.hw_power_status == "poweredOff"
      retries: 20
      delay: 15
      when: item.failed == false

Shutdown Playbook for Virtual ESXi Servers Using vCenter

This playbook is very similar to the nested VM shutdown playbook, but since I can rely on the vCenter, I don’t need to iterate through all ESXi servers. This simplifies the process and improves efficiency.

---
- name: Graceful shutdown of specific VMs if powered on
  hosts: localhost
  gather_facts: no
  collections:
    - community.vmware


  tasks:
    - name: Load variables from file
      include_vars: "{{ vars_file }}"
    - name: Get VM information
      community.vmware.vmware_guest_info:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        datacenter: "{{ vcenter_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item }}"
      loop: "{{ vm_names }}"
      register: vm_info_results

    - name: Shut down VMs gracefully if powered on
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item }}"
        state: shutdown-guest
        force: false
      when: item.instance.hw_power_status == "poweredOn"
      loop: "{{ vm_info_results.results }}"
      register: shutdown_results
      loop_control:
        label: "{{ item.item }}"

    - name: Wait for VMs to be powered off
      community.vmware.vmware_guest_info:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        datacenter: "{{ vcenter_datacenter }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item }}"
      register: vm_status
      until: vm_status.instance.hw_power_status == "poweredOff"
      retries: 20
      delay: 15
      loop: "{{ vm_info_results.results }}"
      when: item.instance.hw_power_status == "poweredOn"
      loop_control:
        label: "{{ item.item }}"

    - name: Display shutdown result
      debug:
        msg: "VM {{ item.item }} was shut down successfully or was already powered off."
      loop: "{{ vm_info_results.results }}"
      loop_control:
        label: "{{ item.item }}"

Use of vCenter:

The playbook uses vCenter directly to manage the shutdown process, which avoids manually iterating through all ESXi hosts.

Graceful Shutdown:

The shutdown-guest option triggers a clean shutdown of the guest operating system running on the virtual ESXi servers.

Dynamic Verification:

The playbook dynamically filters the powered-on ESXi VMs and waits until their power state is confirmed as poweredOff.

Efficiency:

By leveraging vCenter and a loop with retries, the process is both clean and efficient.
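
A possible simplification would be to merge the shutdown task and the separate wait task. The sketch assumes that the state_change_timeout parameter of vmware_guest_powerstate behaves as its documentation describes: with shutdown-guest, a positive value makes the module wait for the VM to power off instead of returning immediately. The 300-second value is an arbitrary placeholder.

```yaml
    - name: Shut down VMs gracefully and wait for poweredOff
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: "{{ validate_certs }}"
        name: "{{ item.item }}"
        state: shutdown-guest
        state_change_timeout: 300   # seconds to wait per VM; placeholder value
      when: item.instance.hw_power_status == "poweredOn"
      loop: "{{ vm_info_results.results }}"
      loop_control:
        label: "{{ item.item }}"
```

One trade-off: each VM is then shut down and awaited in turn, whereas the two-task version sends all shutdown signals first and waits afterwards.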

Master Shutdown Playbook

To orchestrate the shutdown of the nested VCF lab and its virtual ESXi servers, we’ll create a master playbook similar to the Power-On master playbook. The inventory files remain the same as those used for the Power-On process, ensuring consistency and avoiding duplication.

- name: Poweroff Nested VMs
  import_playbook: shutdown_esx.yml
  vars:
    vars_file: "vcfvm_vars.yml"

- name: Poweroff Nested ESXi
  import_playbook: shutdown_vcsa.yml
  vars:
    vars_file: "vcfesx_vars.yml"

Unlike the Power-On master playbook, the shutdown process does not require a pause or workaround. This is because during the shutdown, we can actively check if the respective VMs have already powered off using a loop. This makes the process cleaner and more efficient.

Conclusion: Is ChatGPT Useful for Ansible?

From my perspective, the answer is both yes and no.

ChatGPT gave me a solid starting point and explained a lot of the foundational concepts, which was extremely helpful as a beginner with Ansible. However, it wasn’t perfect—there were several significant errors in the generated playbooks, and more than once, the AI proposed the same incorrect solution repeatedly.

Despite these challenges, I still found the process enjoyable. With some manual corrections and adjustments, I was able to create playbooks that worked for my specific environment. Within just a few hours, I achieved a usable result—something that would have taken considerably longer without ChatGPT’s assistance.

Ultimately, while ChatGPT cannot replace expertise or thorough testing, it’s a powerful tool to accelerate development and simplify learning, especially when working with automation tools like Ansible.