Issues due to gzip compression of MachineConfig data on Red Hat OpenShift v4.12+

Introduction

Butane (v0.18.0 at the time of writing) is a tool that creates Ignition configurations from text-based configuration files. On Red Hat OpenShift, Butane can be used to create MachineConfig definitions that configure Red Hat Enterprise Linux CoreOS (RHCOS) on the nodes of the cluster's MachineConfigPool objects.

The problem

Version 4.10.0 of the openshift variant of the Butane specification introduced gzip compression of file contents (.spec.config.storage.files[*].contents.source in the generated MachineConfig) and enabled it by default for inline and local file data.

For example, the following Butane file:

variant: openshift
version: 4.12.0
metadata:
  name: 99-worker-registries
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
  - path: /etc/containers/registries.conf
    mode: 0644
    overwrite: true
    contents:
      local: registries.conf

will result in the following YAML file being generated:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-registries
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - contents:
            compression: gzip
            source: data:;base64,H4sIAAAAAAAC/3TNsYrDMB......
          mode: 420
          overwrite: true
          path: /etc/containers/registries.conf

As you can see, the content of the file /etc/containers/registries.conf is gzip compressed and base64 encoded in the source: field of the YAML file. The content is stored correctly, which can be verified by running: echo "H4sIAAAAAAAC/3TNsYrDMB......" | base64 -d | gunzip.
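The same check can be scripted. The following is a minimal Python sketch, assuming the source value is a data:;base64, URL as in the generated file above. The function name decode_source and the sample content are made up for illustration and are not part of Butane or OpenShift.

```python
import base64
import gzip


def decode_source(source: str, compression: str = "") -> bytes:
    """Decode a 'data:;base64,' URL and gunzip it if compression is 'gzip'."""
    prefix = "data:;base64,"
    if not source.startswith(prefix):
        raise ValueError("expected a 'data:;base64,' URL")
    raw = base64.b64decode(source[len(prefix):])
    # Mirrors the 'compression' field next to 'source' in the MachineConfig.
    return gzip.decompress(raw) if compression == "gzip" else raw


# Round trip with made-up sample content (not the real registries.conf).
sample = b'unqualified-search-registries = ["registry.example.com"]\n'
packed = "data:;base64," + base64.b64encode(gzip.compress(sample)).decode("ascii")
assert decode_source(packed, "gzip") == sample
```

If the decoded bytes match the original file, the data written by Butane is intact and the problem lies on the consuming side.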

However, if you apply the generated MachineConfig YAML file (I tested this on v4.13.12 but saw the same issue on v4.12 clusters as well), the affected MachineConfigPool gets stuck in a degraded state:

NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-e0c2e7a7ded5b2b4ae447b968e77b00c   False     True       True       1              0                   0                     1                      22d

Checking the status of the affected MachineConfigPool (oc get mcp master -o yaml) will show the following error message:

'Node sno01 is reporting: "failed decoding TOML content from file /etc/containers/registries.conf:
        toml: line 1: files cannot contain NULL bytes; probably using UTF-16; TOML
        files must be UTF-8"'

So OpenShift evidently has difficulty applying the MachineConfig if the source is gzip compressed.
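One plausible reading of the error, which I could not confirm, is that the TOML check runs against the still-compressed bytes. The Python sketch below decodes only the first 12 visible characters of the gzip source shown earlier and checks that they form a gzip header containing NULL bytes and that they are not valid UTF-8. That would match the wording of the message.

```python
import base64

# First 12 characters of the gzip-compressed source value shown earlier.
prefix = base64.b64decode("H4sIAAAAAAAC")

is_gzip = prefix.startswith(b"\x1f\x8b")  # gzip magic number
has_nul = b"\x00" in prefix               # zeroed header fields (flags, mtime)

try:
    prefix.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False

print(is_gzip, has_nul, is_utf8)  # prints: True True False
```

This only shows that raw gzip data would trigger such a complaint. It does not prove where in OpenShift the decompression step is skipped.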

Workaround

Unfortunately, I have not found a solution for the issue so far. I only found a similar bug, 2032565, from the end of 2021 on the Red Hat site, which states that the reported issue has been fixed.

To get the MachineConfig working, I settled on the following workaround: specify an older version of the openshift variant in the Butane file. That is, instead of:

variant: openshift
version: 4.12.0

we specify

variant: openshift
####################################################
## Set version to 4.8.0 to prevent gzip of the data
####################################################
version: 4.8.0

in the Butane file.

This then results in the following MachineConfig file:

# Generated by Butane; do not edit
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-registries
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - contents:
            compression: ""
            source: data:;base64,dW5xdWFsaWZpZWQtc2VhcmNoLXJlZ......
          mode: 420
          overwrite: true
          path: /etc/containers/registries.conf

which we could successfully apply.
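The two generated files differ only in the compression and source fields of the file entry. To make that difference concrete, here is a rough Python sketch that rewrites a gzip contents entry into the uncompressed form. The helper to_uncompressed and the sample data are hypothetical. I did not use this against a cluster; the version change above is the workaround that was actually tested.

```python
import base64
import gzip

PREFIX = "data:;base64,"


def to_uncompressed(contents: dict) -> dict:
    """Return a copy of a file 'contents' entry with gzip compression removed."""
    if contents.get("compression") != "gzip":
        return dict(contents)  # nothing to do, already uncompressed
    source = contents["source"]
    if not source.startswith(PREFIX):
        raise ValueError("expected a 'data:;base64,' URL")
    plain = gzip.decompress(base64.b64decode(source[len(PREFIX):]))
    return {
        **contents,
        "compression": "",
        "source": PREFIX + base64.b64encode(plain).decode("ascii"),
    }


# Made-up sample entry (not the real registries.conf).
sample = b'unqualified-search-registries = ["registry.example.com"]\n'
entry = {
    "compression": "gzip",
    "source": PREFIX + base64.b64encode(gzip.compress(sample)).decode("ascii"),
}
converted = to_uncompressed(entry)
assert converted["compression"] == ""
assert base64.b64decode(converted["source"][len(PREFIX):]) == sample
```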

Updated: 2023-09-13