Libvirt and OpenTofu

February 16, 2025

I’ve been using Terraform and Terragrunt to manage infrastructure on OpenStack and AWS for quite a long time, and I made the transition to OpenTofu (almost painlessly) when HashiCorp changed Terraform’s licensing. Recently I wanted to quickly create some VMs to play with a cloud-init configuration I was working on.

Since my laptop has qemu/kvm available via libvirt, I decided to try a libvirt provider for tofu. In line with other posts on here, I wanted the primary out-of-band management interface to be serial, and since my motivation for this was to play with cloud-init, I decided to start from one of the cloud images for Ubuntu Server. Here are the terraform/tofu declarations I came up with (main.tf).

terraform {
  required_providers {
    libvirt = {
      source = "dmacvicar/libvirt"
    }
  }
}

provider "libvirt" {
  uri = "qemu:///system"
}

resource "libvirt_volume" "os_image_ubuntu" {
  name   = "os_image_ubuntu"
  pool   = "default"
  source = "https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img"
}

resource "libvirt_volume" "disk_ubuntu_resized" {
  name           = "disk"
  base_volume_id = libvirt_volume.os_image_ubuntu.id
  pool           = "default"
  size           = 10737418240
}

resource "libvirt_cloudinit_disk" "cloudinit_ubuntu_resized" {
  name = "cloudinit_ubuntu_resized.iso"
  pool = "default"

  user_data = <<EOF
#cloud-config
disable_root: true
ssh_pwauth: false
users:
  - name: ptty2u
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ${file("id_ubuntu_resized.pub")}

growpart:
  mode: auto
  devices: ['/']
EOF
}

resource "libvirt_domain" "domain_ubuntu_resized" {
  name   = "domain_ubuntu_resized"
  memory = "512"
  vcpu   = 1

  cloudinit = libvirt_cloudinit_disk.cloudinit_ubuntu_resized.id

  network_interface {
    network_name   = "default"
    wait_for_lease = true
  }

  console {
    type        = "pty"
    target_port = "0"
    target_type = "serial"
  }

  console {
    type        = "pty"
    target_port = "1"
    target_type = "virtio"
  }

  disk {
    volume_id = libvirt_volume.disk_ubuntu_resized.id
  }
}

output "ip" {
  value = libvirt_domain.domain_ubuntu_resized.network_interface[0].addresses[0]
}

The config references an ssh public key which should be in the same directory as main.tf and can be generated the usual way (ssh-keygen -t ed25519 -f ./id_ubuntu_resized).

Run tofu init, then tofu plan to see the planned changes. tofu apply then grabs the cloud image, defines all the resources and starts the virtual machine. As soon as the domain is defined, you should be able to virsh console <domain> to connect to the serial line and monitor the boot progress. The cloud-init config will make some minor user tweaks and resize the root partition. When you’re done, tofu destroy will tidy the whole thing up for you.
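
As a rough sketch, the workflow can be wrapped in a few small shell functions. The function names (vm_up, vm_ssh, vm_down) are made up for illustration; the key path, the ptty2u user and the ip output come from main.tf above, and I'm assuming tofu output -raw behaves like its Terraform counterpart.

```shell
#!/bin/sh
# Sketch only: vm_up, vm_ssh and vm_down are illustrative helper names,
# not part of tofu or the libvirt provider.

# Initialise the working directory, show the plan, then apply it.
vm_up() {
  tofu init && tofu plan && tofu apply
}

# Log in as the cloud-init user, using the address from the "ip" output.
vm_ssh() {
  local ip
  ip="$(tofu output -raw ip)" || return 1
  ssh -i ./id_ubuntu_resized "ptty2u@${ip}"
}

# Remove the domain, volumes and cloud-init disk again.
vm_down() {
  tofu destroy
}
```

After sourcing the file, vm_up followed by vm_ssh should land you in a shell on the new VM, assuming the lease has been picked up and the ip output is populated.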

Debugging with Qemu

While getting the above working, I had the opportunity to use qemu directly to debug some issues. I normally rely on libvirt to specify all of the qemu config options, but it is kind of interesting to see the raw qemu equivalents. The cloud-init docs have a nice example which I followed pretty closely.

I started by downloading the cloud image (the qcow2 below) and creating the various cloud-init config files.

The vendor-data file is normally supplied by the vendor (unsurprisingly), and for the nocloud provider it can safely be empty.

$ touch vendor-data

The meta-data file normally contains information about the instance itself, and for our case we can just add a unique identifier and hostname.

$ vi meta-data
instance-id: someid/somehostname

Finally, the user-data defines the majority of our customizations. Our example will disable the root user, disable password logins over ssh, create a ptty2u user and set a random password (the random password will be spit out among the cloud-init messages on the system console, and will be changed on first login). N.B. This isn’t meant to be a practical config, it’s just an exploration of some of the user options.

$ vi user-data
#cloud-config
disable_root: true
ssh_pwauth: false
users:
- name: ptty2u
  sudo: ALL=(ALL) NOPASSWD:ALL
chpasswd:
  users:
  - name: ptty2u
    type: RANDOM
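
If you would rather not edit the three files by hand each time, something like the following should recreate them in one go. This is only a sketch: the seed directory name is my own choice (the steps here just use the current directory), and the file contents are copied from the listings above.

```shell
#!/bin/sh
# Sketch: write the three nocloud seed files into ./seed.
# The directory name "seed" is an assumption, not something cloud-init requires.
seed_dir="seed"
mkdir -p "$seed_dir"

# vendor-data can be empty for this experiment.
: > "$seed_dir/vendor-data"

# meta-data: just an instance identifier.
cat > "$seed_dir/meta-data" <<'EOF'
instance-id: someid/somehostname
EOF

# user-data: the first line must be the #cloud-config header.
cat > "$seed_dir/user-data" <<'EOF'
#cloud-config
disable_root: true
ssh_pwauth: false
users:
- name: ptty2u
  sudo: ALL=(ALL) NOPASSWD:ALL
chpasswd:
  users:
  - name: ptty2u
    type: RANDOM
EOF
```

Keeping the files in their own directory means the web server only exposes the seed files (point --directory at seed instead of .), not the disk image sitting next to them.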

cloud-init expects these three files to be available over the network, which we can achieve using the Python http.server module.

$ python -m http.server --directory .

Finally, here is the qemu command (the command below uses UEFI, but that isn’t a necessary part of what we’re doing).

$ qemu-system-x86_64 \
  -device virtio-net-pci,netdev=vmnic0 -netdev user,id=vmnic0 \
  -machine accel=kvm \
  -m 512 \
  -nographic \
  -hda Fedora-Cloud-Base-Generic-41-1.4.x86_64.qcow2 \
  -smbios type=1,serial=ds='nocloud;s=http://10.0.2.2:8000/' \
  --bios /usr/share/ovmf/x64/OVMF.4m.fd

Most of the options are obvious, but the -smbios line does something interesting. As explained in the cloud-init docs, it uses the system management BIOS to specify that we are using the nocloud provider and that there will be a webserver running on port 8000 of the default QEMU gateway. Cloud-init uses that information to know where to grab the files we created above.
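
To make the structure of that serial string a bit more visible, here is a small sketch that assembles it from its parts. The variable names are mine; 10.0.2.2 is the address at which a guest on QEMU's user-mode network sees the host, and 8000 is the default port of http.server.

```shell
#!/bin/sh
# Sketch: build the -smbios argument piece by piece (variable names are illustrative).
seed_host="10.0.2.2"   # the host, as seen from the guest on QEMU user networking
seed_port="8000"       # default port of python -m http.server
seed_url="http://${seed_host}:${seed_port}/"

# ds=nocloud selects the datasource; s= tells cloud-init where to fetch
# user-data, meta-data and vendor-data from.
smbios_arg="type=1,serial=ds=nocloud;s=${seed_url}"

printf '%s\n' "$smbios_arg"
```

Passing it to qemu as -smbios "$smbios_arg" keeps the semicolon away from the shell, which is the same job the single quotes do in the command above.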

If all went well above, the machine will boot and apply our cloud-init config. When you are done testing, you can kill the VM with Ctrl-a x.

There are a couple of things to keep in mind if you repeat the steps above:

The qcow2 is modified in place, so if you make a mistake and start again, remember that cloud-init will think it has already run.
If you have a schema error somewhere in your cloud-init config, the machine will usually boot as if no config had been specified. In this circumstance, you can often log in with the default credentials of the cloud image and use sudo cloud-init schema --system to track down the problem.
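
One way around the first point is to never boot the downloaded image directly, and instead hand qemu a throwaway copy. A minimal sketch, assuming a plain file copy is acceptable (a qcow2 overlay would avoid the full copy, but this keeps things simple); fresh_copy is a made-up helper name.

```shell
#!/bin/sh
# Sketch: keep the downloaded image pristine and boot a disposable copy.
# fresh_copy is an illustrative helper, not a qemu or cloud-init tool.
fresh_copy() {
  base="$1"   # pristine cloud image, never booted directly
  work="$2"   # disposable image to pass to qemu with -hda
  cp -f -- "$base" "$work"
}
```

For example, fresh_copy Fedora-Cloud-Base-Generic-41-1.4.x86_64.qcow2 work.qcow2 before each run, then -hda work.qcow2 in the qemu command, so cloud-init always sees a first boot.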

Have fun!