Terraform data vs resources

In previous blog posts we have talked about Installing Terraform on Windows, Installing Terraform on Ubuntu, Providers part 1 and Providers part 2. In this blog we will look at differentiating between data sources and resources. Note that we are looking at terraform constructs differently from what is presented in the HashiCorp Terraform Associate Certification and not following the suggested study guide format but diving into what is required to build something from scratch.

The assumptions that we are going to make in this blog are that first we have a terraform binary installed on Windows or Ubuntu and second that our target system is going to be vSphere initially either for development or in house production systems.

Given that we talked about the vSphere provider in an earlier blog posting we won’t go into a discussion on adding and configuring providers other than to say that vSphere is a tricky configuration when it comes to managing multiple servers. If we were managing multiple Amazon or Azure regions or zones we would define one provider for one zone and another provider with a different alias for the second zone. The vSphere provider does not support multiple datacenters or aliases for providers so managing multiple hosts with terraform must be done with one vSphere server controlling multiple hosts or multiple folders and directories managing different vSphere servers. Both of these strategies work well. For this example we will look at multiple servers under one vSphere and a single terraform configuration to manage all of these hosts.

Before we get started on data vs resource definitions it is important to note that the Terraform Associate certification focuses on Cloud Engineering and not necessarily on-premises servers. The concepts are generic enough but for this blog we will cover vSphere as our example target.

Scrolling down to the outline of the exam note that “Create and differentate resource and data configuration” is one of the topics in the Read, generate, and modify configuration section.

If we look at the documentation for the Terraform CLI for Data Sources it notes that data sources allow data to be fetched or computed for use elsewhere in Terraform configuration. Each provider may offer data sources alongside its set of resource types. What this means is that we can declare things that we know exist and should not change. For vSphere we know that there is a datacenter as well as datastore. The datacenter is the root of the vSphere environment. Under the datacenter is a host and a datastore. If we correlate this back to our vSphere console we can pull this data from the user interface.

From our lab environment we log into the vSphere Client and go to the Summary screen of the root vSphere instance.

Note that we have 4 hosts displayed on the left side, 4 hosts listed in the center as well as 23 virtual machines. We do have a number of templates and virtual machines listed under our server 10.0.0.92 with two of the virtual machines currently running.

For our Terraform vSphere provider we need to first get the name of the Datacenter associated with this installation. We can do this by clicking on the Datacenters tab on the right side of the screen.

For this example we see that “Home-lab” is the Datacenter defined for our instance. In Terraform we would define this with a data element rather than a resource.

data "vsphere_datacenter" "datacenter" {
  name = "Home-lab"
}

The key reason that we use data rather than resource is that the terraform command will try to create this datacenter when an apply is run and destroy the datacenter when a destroy is run. We don’t want to destroy our datacenter but want to retain the definition to use as needed to access hosts and virtual machines.

From the Resources definition in the Terraform CLI, a resource describes one or more infrastructure object and running the apply command creates, updates, or destroys the object to force them to match the configuration in the configuration files. If your configuration gets changed then the terraform configuration tries to delete and reconfigure your datacenter. This is not the behavior that we want. What we want is to define everything that is static and should not change with the data declaration and things that are dynamic with the resource declaration. The elements that typically don’t change for vSphere are:

  • Datacenter
  • Host
  • Network
  • Datastore

Things that typically change but are good to use data declarations for are

  • Folders
  • Templates
  • Tags

Things that typically change are need a resource definition are

  • Virtual Machines
  • VMFS Disks
  • Roles
  • Users

To reference a data element you start with the data.<value> and use the type definition followed by the alias name. In the following example we define a host that is associated with a compute cluster that is inside a datacenter. We refer to the datacenter as data.vsphere_datacenter.dc and the cluster as data.vsphere_compute_cluster.c1 where “dc” and “c1” are aliases to identify the data definitions. We use the id parameter in the data construct to uniquely identify the resource.

data "vsphere_datacenter" "dc" {
  name = "TfDatacenter"
}

data "vsphere_compute_cluster" "c1" {
  name          = "DC0_C0"
  datacenter_id = data.vsphere_datacenter.dc.id
}

resource "vsphere_host" "h1" {
  hostname = "10.10.10.1"
  username = "root"
  password = "password"
  license  = "00000-00000-00000-00000i-00000"
  cluster  = data.vsphere_compute_cluster.c1.id
}

In the above example we are talking about the host “10.10.10.1” located in the compute cluster “DC0_C0” that lives in the TfDatacenter. In our Home-lab datacenter we can define our host 10.0.0.92 as a member of the Home-lab datacenter and provision virtual machines onto this host.

We would also define our vsphere_datastore defined by “name” that lives in the data.vsphere_datacenter.dc. To view the datastores from the vSphere Client, click on the Datastores tab or the disk icon to get a list of potential datastores.

Note that some of the datastores are listed as inacessible. These servers are currently turned off and the accessible ones are connected to our primary host and vSphere server which is the only one powered on for demonstration purposes.

In summary, data declarations should be used for static and permanent structures. Resource declarations should be used for things that can be created or considered transient or malleable. For a production datacenter that I managed we defined folders, virtual machine templates, and datastores with the data declaration.

data “vsphere_datacenter” “dc” {
name = “QM Lab”
}

data “vsphere_resource_pool” “pool” {
name = “TintonFalls/Resources”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_host” “host” {
name = “172.19.21.54”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “DS6” {
name = “DS6”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “EQL1-Raid5” {
name = “EQL1-Raid5”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “EQL2-Raid6” {
name = “EQL2-Raid6”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4-maglib1” {
name = “ADV-ESX4 maglib1(7.2K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4-maglib2” {
name = “ADV-ESX4 maglib2(7.2K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4storage1” {
name = “ADV-ESX4 storage1(15K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4storage2” {
name = “ADV-ESX4 storage2(15K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4storage3” {
name = “ADV-ESX4 storage3(15K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4storage4” {
name = “ADV-ESX4 storage4(15K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “ADV-ESX4storage5” {
name = “ADV-ESX4 storage5(15K)”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_datastore” “corenfsshare” {
name = “corenfsshare”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_network” “network” {
name = “VM Network”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “W2k8-sp18-template” {
name = “W2k8-sp18-template”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “Windows2016-template” {
name = “2016_template”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “W2k16-Aug-2020-template” {
name = “W2k16-Aug-2020-template”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “centos-base” {
name = “centos-base”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “rhel-base” {
name = “rhel-base”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “suse-sap-base” {
name = “suse-sap-base”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “te-kubernetes-base” {
name = “te-kubernetes-base”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “ubuntu-base” {
name = “ubuntu-base”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “ubuntu-docker-desktop” {
name = “ubuntu-docker-desktop”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_virtual_machine” “Windows-10-template” {
name = “Windows-10-template”
datacenter_id = data.vsphere_datacenter.dc.id
}

data “vsphere_folder” “parent” {

path = “${data.vsphere_datacenter.dc.name}/vm/Technical Enablement”
}

data “vsphere_folder” “Apps” {
path = “${data.vsphere_folder.parent.path}/Apps”
}

data “vsphere_folder” “testing” {
path = “${data.vsphere_folder.Apps.path}/testing”
}

data “vsphere_folder” “sp22” {
path = “${data.vsphere_folder.Apps.path}/sp22”
}

data “vsphere_folder” “sp21” {
path = “${data.vsphere_folder.Apps.path}/sp21”
}

data “vsphere_folder” “sp20” {
path = “${data.vsphere_folder.Apps.path}/sp20”
}

Leave a Reply

Your email address will not be published. Required fields are marked *