On golang os.Expand and os.ExpandEnv

They are awesome but could be better

If you are using golang, there’s a pretty high chance you have already used the os.Expand(s string, mapping func(string) string) function in your code. Or maybe its derivative, os.ExpandEnv(s string).

The latter takes an input string and replaces shell-style variable references ($VAR or ${VAR}) with the values of the corresponding environment variables. For example:

os.Setenv("VARIABLE", "hello")
fmt.Println(os.ExpandEnv("${VARIABLE}, world!"))
// prints "hello, world!"

It uses os.Getenv as the mapping argument to os.Expand.

Pretty often, that is exactly what is needed, and os.ExpandEnv is one of the little gems of the golang standard library.

The problem with os.ExpandEnv is that if a variable referenced in the string does not exist, the reference is replaced with an empty string.
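A minimal sketch of that behavior. The variable name MISSING_VARIABLE and the helper expandBoth are made up for this illustration; the helper also shows that os.ExpandEnv and os.Expand with os.Getenv as the mapping give the same result:

```go
package main

import (
	"fmt"
	"os"
)

// expandBoth expands s twice: once with os.ExpandEnv and once with
// os.Expand using os.Getenv as the mapping. The two results should match.
func expandBoth(s string) (string, string) {
	return os.ExpandEnv(s), os.Expand(s, os.Getenv)
}

func main() {
	os.Setenv("VARIABLE", "hello")
	os.Unsetenv("MISSING_VARIABLE") // hypothetical name; make sure it is unset

	viaExpandEnv, viaExpand := expandBoth("${VARIABLE}, [${MISSING_VARIABLE}] world!")
	// The reference to the unset variable disappears, leaving "[]" behind.
	fmt.Printf("%q\n", viaExpandEnv)
	fmt.Println(viaExpandEnv == viaExpand)
}
```

There is no error and no warning: the unset reference silently turns into nothing.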

Consider the following command:

const command = `cd /tmp/build && \
apkArch="$(apk --print-arch)" && \
case "${apkArch}" in \
    aarch64) consulArch='arm64' ;; \
    armhf) consulArch='armhfv6' ;; \
    x86) consulArch='386' ;; \
    x86_64) consulArch='amd64' ;; \
    *) echo >&2 "error: unsupported architecture: ${apkArch} (see ${HASHICORP_RELEASES}/consul/${CONSUL_VERSION}/)" && exit 1 ;; \
esac`

Assuming that the values of HASHICORP_RELEASES and CONSUL_VERSION are passed as environment variables, the following code:

os.Setenv("HASHICORP_RELEASES", "https://releases.hashicorp.com")
os.Setenv("CONSUL_VERSION", "1.9.4")
fmt.Println(os.ExpandEnv(command))

would give the following output:

cd /tmp/build && \
apkArch="$(apk --print-arch)" && \
case "" in \
    aarch64) consulArch='arm64' ;; \
    armhf) consulArch='armhfv6' ;; \
    x86) consulArch='386' ;; \
    x86_64) consulArch='amd64' ;; \
    *) echo >&2 "error: unsupported architecture:  (see https://releases.hashicorp.com/consul/1.9.4/)" && exit 1 ;; \
esac

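The ${apkArch} part was obliterated from the output: apkArch is a shell variable assigned inside the command itself, not an environment variable of the Go process, so both references were replaced with empty strings. This command would never work.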

Fortunately, os.Expand comes to the rescue!

lookupFunc := func(placeholderName string) string {
    if value, ok := os.LookupEnv(placeholderName); ok {
        return value
    }
    // fallback:
    return fmt.Sprintf("$%s", placeholderName)
}
fmt.Println(os.Expand(command, lookupFunc))

Aha, now it looks better:

cd /tmp/build && \
apkArch="$(apk --print-arch)" && \
case "$apkArch" in \
    aarch64) consulArch='arm64' ;; \
    armhf) consulArch='armhfv6' ;; \
    x86) consulArch='386' ;; \
    x86_64) consulArch='amd64' ;; \
    *) echo >&2 "error: unsupported architecture: $apkArch (see https://releases.hashicorp.com/consul/1.9.4/)" && exit 1 ;; \
esac

This output would definitely work. But shell strings can be much more complicated than this.

The problem with the lookupFunc is that one has to decide upfront whether or not to surround the fallback with {}.

And there are cases where neither choice is right for the whole input.

Consider the following input, a slightly modified real example coming from the official Postgres 13 Dockerfile:

const command = `RUN set -eux; \
    export GNUPGHOME=${GNUPGHOME:=$(mktemp -d)}; \
	savedAptMark="$(apt-mark showmanual)"; \
	apt-get update; \
	apt-get install -y --no-install-recommends ca-certificates wget; \
	rm -rf /var/lib/apt/lists/*; \
	dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')"; \
	wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch"; \
	wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch.asc"; \
	gpg --batch --keyserver hkps://keys.openpgp.org --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4; \
	gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu; \
	gpgconf --kill all; \
	rm -rf "$GNUPGHOME" /usr/local/bin/gosu.asc; \
	apt-mark auto '.*' > /dev/null; \
	[ -z "$savedAptMark" ] || apt-mark manual $savedAptMark > /dev/null; \
	apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false; \
	chmod +x /usr/local/bin/gosu; \
	gosu --version; \
	gosu nobody true
`

There are two conflicting cases in this input: export GNUPGHOME=${GNUPGHOME:=$(mktemp -d)}; and | awk -F- '{ print $NF }'.

In the case of the export command, the surrounding {} must be preserved. The lookupFunc could return the fallback of fmt.Sprintf("${%s}", placeholderName).

But in the case of | awk -F- '{ print $NF }', surrounding $NF with {} results in an error. The mapping function passed to os.Expand receives only the variable name, so it cannot tell what the raw input looked like. Can this be fixed?
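To make the limitation concrete, here is a small sketch. The helpers seenNames and withFallback are hypothetical names introduced only for this illustration, and the example assumes that no environment variable called NF is set:

```go
package main

import (
	"fmt"
	"os"
)

// seenNames returns the names that os.Expand passes to its mapping
// function while scanning s.
func seenNames(s string) []string {
	var names []string
	os.Expand(s, func(name string) string {
		names = append(names, name)
		return ""
	})
	return names
}

// withFallback expands s from the environment. Names that are not set
// are written back as ${name} when braces is true, or as $name otherwise.
func withFallback(s string, braces bool) string {
	return os.Expand(s, func(name string) string {
		if value, ok := os.LookupEnv(name); ok {
			return value
		}
		if braces {
			return "${" + name + "}"
		}
		return "$" + name
	})
}

func main() {
	// Both spellings reach the mapping function as the same name.
	fmt.Printf("%q\n", seenNames("$NF ${NF}"))

	const export = "export GNUPGHOME=${GNUPGHOME:=$(mktemp -d)}"
	const awk = "awk -F- '{ print $NF }'"
	for _, braces := range []bool{true, false} {
		fmt.Println(withFallback(export, braces))
		fmt.Println(withFallback(awk, braces))
	}
}
```

With braces the export line survives but awk receives ${NF}; without braces awk is fine but the export line loses its braces. Whichever fallback is picked, one of the two inputs comes out damaged.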

The answer is to look at the source of the os.Expand standard library function. It looks like this:

func Expand(s string, mapping func(string) string) string {
	var buf []byte
	// ${} is all ASCII, so bytes are fine for this operation.
	i := 0
	for j := 0; j < len(s); j++ {
		if s[j] == '$' && j+1 < len(s) {
			if buf == nil {
				buf = make([]byte, 0, 2*len(s))
			}
			buf = append(buf, s[i:j]...)
			name, w := getShellName(s[j+1:])
			if name == "" && w > 0 {
				// Encountered invalid syntax; eat the
				// characters.
			} else if name == "" {
				// Valid syntax, but $ was not followed by a
				// name. Leave the dollar character untouched.
				buf = append(buf, s[j])
			} else {
				buf = append(buf, mapping(name)...)
			}
			j += w
			i = j + 1
		}
	}
	if buf == nil {
		return s
	}
	return string(buf) + s[i:]
}

The case we are interested in is the final else. It says:

if a valid shell variable name was found, append whatever the mapper returns for that name
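For orientation, the three branches of the function quoted above can be observed from the outside. This is a sketch against the version of os.Expand shown here (older Go releases may treat malformed input differently); mark and expandMarked are hypothetical helpers that simply make substitutions visible:

```go
package main

import (
	"fmt"
	"os"
)

// mark is a stand-in mapping that wraps the name it receives in <>.
func mark(name string) string { return "<" + name + ">" }

// expandMarked runs the standard os.Expand with the mark mapping.
func expandMarked(s string) string { return os.Expand(s, mark) }

func main() {
	inputs := []string{
		"a${}b",    // invalid syntax: the "${}" is eaten
		"${oops",   // invalid syntax: the unterminated "${" is eaten
		"$(date)",  // "$" not followed by a name: left untouched
		"$HOME/x",  // valid name: handled by the final else
		"${HOME}x", // valid name in braces: handled by the final else
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, expandMarked(in))
	}
}
```

Only the last two inputs ever reach the mapping function, which is why the final else is the place to change.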

If we replaced that function with the following version:

func Expand(s string, mapping func(string) (string, bool)) string {
	var buf []byte
	// ${} is all ASCII, so bytes are fine for this operation.
	i := 0
	for j := 0; j < len(s); j++ {
		if s[j] == '$' && j+1 < len(s) {
			if buf == nil {
				buf = make([]byte, 0, 2*len(s))
			}
			buf = append(buf, s[i:j]...)
			shellNameInput := s[j+1:]
			name, w := getShellName(shellNameInput)
			if name == "" && w > 0 {
				// Encountered invalid syntax; eat the
				// characters.
			} else if name == "" {
				// Valid syntax, but $ was not followed by a
				// name. Leave the dollar character untouched.
				buf = append(buf, s[j])
			} else {
				replacement, ok := mapping(name)
				if ok {
					buf = append(buf, replacement...)
				} else {
					// preserve enclosing {}
					if shellNameInput[0] == '{' {
						buf = append(buf, fmt.Sprintf("${%s}", name)...)
					} else {
						buf = append(buf, fmt.Sprintf("$%s", name)...)
					}
				}
			}
			j += w
			i = j + 1
		}
	}
	if buf == nil {
		return s
	}
	return string(buf) + s[i:]
}

we would get the correct behavior for both conflicting cases. We have added:

			buf = append(buf, s[i:j]...)
			shellNameInput := s[j+1:] // <----- this line
			name, w := getShellName(shellNameInput)

and changed the final else statement to:

				replacement, ok := mapping(name)
				if ok {
					buf = append(buf, replacement...)
				} else {
					// preserve enclosing {}
					if shellNameInput[0] == '{' {
						buf = append(buf, fmt.Sprintf("${%s}", name)...)
					} else {
						buf = append(buf, fmt.Sprintf("$%s", name)...)
					}
				}

This bit reads as follows:

if the mapper found the value, use it; otherwise fall back to the original value but preserve surrounding braces
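One practical detail: getShellName is unexported, so a custom Expand living outside the os package needs its own copy of it and of its two small helpers. The sketch below is modeled on the helpers in the os package but is written from memory, so treat it as an approximation and compare it against the Go source for your version before relying on it:

```go
package main

import "fmt"

// isShellSpecialVar reports whether c is one of the single-character
// shell special variables: $*, $#, $$, $@, $!, $?, $- and $0 to $9.
func isShellSpecialVar(c uint8) bool {
	switch c {
	case '*', '#', '$', '@', '!', '?', '-', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9':
		return true
	}
	return false
}

// isAlphaNum reports whether c is an ASCII letter, digit, or underscore.
func isAlphaNum(c uint8) bool {
	return c == '_' || '0' <= c && c <= '9' || 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z'
}

// getShellName returns the name that begins s and the number of bytes
// consumed to extract it. s must be non-empty. An empty name together
// with a non-zero width signals invalid syntax.
func getShellName(s string) (string, int) {
	switch {
	case s[0] == '{':
		if len(s) > 2 && isShellSpecialVar(s[1]) && s[2] == '}' {
			return s[1:2], 3
		}
		// Scan to the closing brace.
		for i := 1; i < len(s); i++ {
			if s[i] == '}' {
				if i == 1 {
					return "", 2 // bad syntax: "${}"
				}
				return s[1:i], i + 1
			}
		}
		return "", 1 // bad syntax: "${" without a closing brace
	case isShellSpecialVar(s[0]):
		return s[0:1], 1
	}
	// Scan alphanumerics and underscores.
	i := 0
	for i < len(s) && isAlphaNum(s[i]) {
		i++
	}
	return s[:i], i
}

func main() {
	fmt.Println(getShellName("{GOSU_VERSION}/gosu"))
	fmt.Println(getShellName("NF }'"))
}
```

The returned width covers the braces when they are present, which is what lets the custom Expand skip past the whole reference after deciding how to write it back.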

The result of the custom Expand:

os.Setenv("GOSU_VERSION", "1.12")
fmt.Println(Expand(command, os.LookupEnv))

is correct:

RUN set -eux; \
    export GNUPGHOME=${GNUPGHOME:=$(mktemp -d)}; \
	savedAptMark="$(apt-mark showmanual)"; \
	apt-get update; \
	apt-get install -y --no-install-recommends ca-certificates wget; \
	rm -rf /var/lib/apt/lists/*; \
	dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')"; \
	wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/1.12/gosu-$dpkgArch"; \
	wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/1.12/gosu-$dpkgArch.asc"; \
	gpg --batch --keyserver hkps://keys.openpgp.org --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4; \
	gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu; \
	gpgconf --kill all; \
	rm -rf "$GNUPGHOME" /usr/local/bin/gosu.asc; \
	apt-mark auto '.*' > /dev/null; \
	[ -z "$savedAptMark" ] || apt-mark manual $savedAptMark > /dev/null; \
	apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false; \
	chmod +x /usr/local/bin/gosu; \
	gosu --version; \
	gosu nobody true

A full implementation is here.