Git Aliases & Shell
Today I took a look at one particular git repository’s configuration and saw something slightly off in the configuration for a credential helper, dating from an old experiment with AWS CodeCommit. I decided to dig deeper to figure out what the actual rules are for shell commands inside git configuration files.
This side-diversion took a bit longer than expected. It’s a Sunday. Ah well. I’ve seen too much cargo-culted incorrect information online, so it was time to figure out an accurate answer. If you see Leaning Toothpick Syndrome markedly increasing the count of backslashes, then a little suspicion is warranted.
In fairness to those who are confused, including me before today: the git
logic tries to implement magic “do the right thing” fix-ups at a level beneath
the GIT_TRACE reporting, so that it’s pretty hidden until you either read
the source or use a strace(1) style of tool. This makes the simple cases
right, even when they’re wrong, but makes the fancier cases obtuse.
I decided to focus on looking into how [alias] rules are handled, on the
(correct) assumption that it’s the same mechanisms involved.
For clarity: avoid cleverness inside configuration files!
If the steps stated here matter, then the complexity really should be moved
out of ~/.gitconfig and into a separate helper script. When you start
having to worry about where strings are parsed and how they’re handled in
which situations, the correct response is to back away slowly, not breaking
eye-contact, and create a very short script to handle things instead.
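As a minimal sketch of that advice: git tries an executable named git-<name> on PATH, so a tiny script can replace a clever alias. The name git-wibble and the temporary directory below are hypothetical, chosen only for illustration.

```shell
# Write a hypothetical helper script into a scratch directory.
dir=$(mktemp -d)
cat >"$dir/git-wibble" <<'EOF'
#!/bin/sh
# Arguments arrive as ordinary positional parameters;
# no config-file quoting rules are involved.
for x in "$@"; do
  echo ": {$x}"
done
EOF
chmod +x "$dir/git-wibble"

# Run it directly here; with its directory on PATH, "git wibble" would find it.
out=$(sh "$dir/git-wibble" foo 'bar baz')
echo "$out"
rm -rf "$dir"
```

The script sees exactly the arguments it was given, with whitespace intact, and nothing needs escaping twice.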
Notwithstanding correctness, sometimes you need to know how to tame the beast.
git config documentation
Let’s start by reading the actual documentation; this is found in
git-config(1):
alias.*
Command aliases for the git(1) command wrapper - e.g. after defining alias.last = cat-file commit HEAD, the invocation git last is equivalent to git cat-file commit HEAD. To avoid confusion and troubles with script usage, aliases that hide existing Git commands are ignored. Arguments are split by spaces, the usual shell quoting and escaping is supported. A quote pair or a backslash can be used to quote them.
If the alias expansion is prefixed with an exclamation point, it will be treated as a shell command. For example, defining alias.new = !gitk --all --not ORIG_HEAD, the invocation git new is equivalent to running the shell command gitk --all --not ORIG_HEAD. Note that shell commands will be executed from the top-level directory of a repository, which may not necessarily be the current directory. GIT_PREFIX is set as returned by running git rev-parse --show-prefix from the original current directory. See git-rev-parse(1).
That’s a decent start but things go wrong mysteriously when we try to do
something non-trivial. A simple for-loop appears to fail at the "$@"
stage, but why?
Also under “Syntax”:
The following escape sequences (beside \" and \\) are recognized: \n for newline character (NL), \t for horizontal tabulation (HT, TAB) and \b for backspace (BS). Other char escape sequences (including octal escape sequences) are invalid.
Shell side-digression
Something which will become important below is a quirky corner of POSIX shell
handling. Despite POSIX giving us the convention of -- to terminate option
processing and treat following parameters as non-option entries, when you
invoke sh -c FOO -- arg1 you are not using -- to terminate options.
Instead, per the specification,
in -c handling, the first non-option parameter is the string providing the
input to be parsed for shell commands (here FOO) and the next parameter
provides the name of the command! It’s argv[0] pass-through.
Thus sh -c FOO -- arg1 will invoke a shell, with $0 equal to --, $1 of
arg1 and then parse and handle FOO.
After the first non-option, no other strings are examined to see if they might be options; there is no permutation.
Thus sh -c FOO BAR -u does not risk -u telling the shell to treat unset
variable references as errors. Instead, FOO is invoked in a context
where argv=["BAR", "-u"].
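This is easy to check by hand. A minimal sketch, with variable names chosen only for illustration:

```shell
# $0 receives the first parameter after the command string, even when it is "--".
dashdash=$(sh -c 'printf "%s|%s" "$0" "$1"' -- arg1)
echo "$dashdash"

# -u arrives as $1; it is not parsed as a shell option, so the unset
# variable reference below expands to nothing instead of raising an error.
nounset=$(sh -c 'printf "%s|%s|%s" "$0" "$1" "$unset_variable"' BAR -u)
echo "$nounset"
```

The first command shows -- landing in $0, and the second shows -u being treated as plain data.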
What actually happens with git
Git’s config parser and the shell-or-other invocation are the two layers to worry about.
The config parser uses both ; and # as comment markers, so an unquoted ;
ends the value at that point. Given that ; is a sub-list terminator in
shell syntax, this is a little unfortunate.
A shell is not necessarily used when an alias is treated as a shell command. That’s very careful documentation wording, above. “it will be treated as a shell command” does not mean that a shell will be used, merely that it’s a shell command to be handled. Git will often resort to using a shell, but it’s not a commitment to do so.
Git parses the configuration file handling basic stripping of comments and
handling of the \n substitutions and quoting, before the alias mechanisms
come into play.
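The config-parsing layer can be observed on its own, without running any alias. The sketch below assumes git is installed and uses a throwaway file read through git config --file; the alias names are made up for the demonstration.

```shell
cfg=$(mktemp)
cat >"$cfg" <<'EOF'
[alias]
    unquoted = !echo one; echo two
    quoted = "!echo one; echo two"
    newline = "!echo one\necho two"
    literal = "!printf '%s\\n' one"
EOF

# Read back what the config parser actually stored for each alias.
unquoted=$(git config --file "$cfg" --get alias.unquoted)  # cut off at the ;
quoted=$(git config --file "$cfg" --get alias.quoted)      # ; survives inside quotes
newline=$(git config --file "$cfg" --get alias.newline)    # \n became a real newline
literal=$(git config --file "$cfg" --get alias.literal)    # \\ became a single backslash
printf '%s\n---\n' "$unquoted" "$quoted" "$newline" "$literal"
rm -f "$cfg"
```

Whatever the shell eventually sees is the stored value, so this is the first place to look when an alias misbehaves.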
Git can split a string into separate fields, for invoking as a command,
without needing to go near a shell to do so. The alias.c:split_cmdline()
function handles this, splitting on whitespace while not splitting within
quoted strings. It handles single and double-quotes, with double-quotes
supporting a backslash escape purely for avoiding breaking out of quoted
state. No other escapes are supported, but the configuration parsing has
already handled some. You can have:
[alias]
foo = !"bar\nbaz"
and the value stored for the alias foo will be:
!bar
baz
The config parser has consumed the outer quotes. The quote handling in
split_cmdline() is instead for quotes inside the original quotes; this is
what lets us write:
[alias]
foo = "!printf '%s\\n' first \"sec ond\" third"
and invoke:
% GIT_TRACE=true git foo
20:50:18.430074 git.c:654 trace: exec: git-foo
20:50:18.430779 run-command.c:637 trace: run_command: git-foo
20:50:18.432151 run-command.c:637 trace: run_command: 'printf '\''%s\n'\'' first "sec ond" third'
first
sec ond
third
Note here that we used \\ to avoid having the git config parser handle the
\n. A strace(1) shows us:
% strace -ff git foo
[...]
[pid 16724] execve("/bin/sh", ["/bin/sh", "-c", "printf '%s\\n' first \"sec ond\" th"..., "printf '%s\\n' first \"sec ond\" th"...], [/* 62 vars */] <unfinished ...>
[...]
% strace -ff sh -c "printf '%s\n'" first 'sec ond' third
execve("/bin/sh", ["sh", "-c", "printf '%s\\n'", "first", "sec ond", "third"], [/* 61 vars */]) = 0
Here, the sequence \\n is simply how strace is showing that \n as a
two-character sequence is the string being passed through.
Returning to Git’s codebase:
Without an exclamation mark, split_cmdline() is used and the results put
into the current process’s argv[] vector after the initial git, and
argument processing effectively then restarts, letting git decide again what
should be done.
Without an exclamation mark, that’s it. All done, quick and clean and simple. Everything after here assumes that the entry starts with an exclamation mark.
What the exclamation mark means is really “run this entry as a sub-command
now, and exit”. This is found in git.c:handle_alias().
The invocation is then run_command() on a struct with .use_shell set.
The entire string of the alias value is passed into this as a single string,
no whitespace handling. split_cmdline() does not apply.
But just having .use_shell set still doesn’t mean that a shell is used.
What it means is that run-command.c:prepare_shell_cmd() will be used to
construct the shell command, which might mean that a shell will be used.
What prepare_shell_cmd() does is look to see if the value of the command
might be a single word without any special characters. Those special
characters are backtick itself or any of: |&;<>()$\"' \t\n*?[#~=%
Since a whitespace is in the list of characters, simply having two words is enough to trigger invoking the shell.
Once the shell is invoked, the exact form invoked depends upon whether or not the invoker of the command provided parameters on the command-line.
In all cases, the shell is invoked with at least four parameters in argv.
The first two are ["sh", "-c"]. The third will be the string from the
configuration, if and only if no parameters were supplied to git by the
invoker. If parameters were provided, then instead the third parameter for
the shell is modified, by adding the five characters "$@" to the end of it.
(Five: SPACE, QUOTATION MARK, DOLLAR SIGN, COMMERCIAL AT, QUOTATION MARK.)
Git tries to be clever and assumes that you’ll want the arguments available at
the end of whatever string is given.
This works for simple commands. But for alias text which tries to handle arguments itself, it’s a hindrance to be worked around.
The fourth of the always-present parameters for the shell is the text from the alias definition. Repeated, as the name of the shell.
Any elements in the new shell’s argv after that are passed through from the
invoker of git.
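The argv construction described above can be imitated by hand, with no git involved. This is a sketch of the behaviour as described here, not git's code; run_like_git is a hypothetical helper name.

```shell
# run_like_git ALIAS_TEXT [ARGS...]
# Builds the same shape of argv: sh -c <text> <text> [args...]
run_like_git() {
  alias_text=$1
  shift
  if [ "$#" -gt 0 ]; then
    # Arguments supplied: the command string gains the five characters ' "$@"'.
    sh -c "$alias_text \"\$@\"" "$alias_text" "$@"
  else
    sh -c "$alias_text" "$alias_text"
  fi
}

# A simple command benefits from the appended "$@".
simple=$(run_like_git 'echo hello' big world)
echo "$simple"

# A loop that handles "$@" itself ends up as: ...; done "$@"
# which the shell rejects as a syntax error.
if run_like_git 'for x in "$@"; do echo "$x"; done' foo 'bar baz' 2>/dev/null; then
  loop_result=ok
else
  loop_result=failed
fi
echo "$loop_result"
```

The second case is the mysterious for-loop failure: the loop text is fine, and it is the appended suffix that breaks it.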
What this means for us
Handling the auto-inserted "$@" is simple enough, once you know that it
needs to be handled: simply end your alias definition with a shell comment
character, the octothorpe # (Unicode NUMBER SIGN).
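The effect of the trailing # can be checked with a hand-built command string, again without git. In this sketch, alias-text is a placeholder for the repeated alias text that ends up in $0.

```shell
# With arguments: the appended "$@" lands after the #, so the shell
# treats it as a comment and the loop sees the real arguments.
with_args=$(sh -c 'for x in "$@"; do echo ": {$x}"; done # "$@"' alias-text foo 'bar baz')
echo "$with_args"

# Without arguments: nothing is appended, and the loop has nothing to iterate over.
without_args=$(sh -c 'for x in "$@"; do echo ": {$x}"; done #' alias-text)
echo "[$without_args]"
```

The arguments are still passed as positional parameters; only the extra textual reference to them is commented out.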
This is a correct alias definition for ~/.gitconfig:
[alias]
wibble = "!set -x\necho $#\nfor x in \"$@\"; do echo \": {$x}\"; done #"
wobble = "!for x in \"$@\"; do echo \": {$x}\"; done #"
Here, the git configuration file parsing has resulted in the list of aliases
containing an entry for wibble and one for wobble, where the stored
strings are:
!set -x
echo $#
for x in "$@"; do echo ": {$x}"; done #
and the same but without the first two lines.
We can invoke git wibble:
% git wibble
+ echo 0
0
% git wibble foo 'bar baz'
+ echo 2
2
+ for x in '"$@"'
+ echo ': {foo}'
: {foo}
+ for x in '"$@"'
+ echo ': {bar baz}'
: {bar baz}
Those two invocations using git wobble instead (to make this a little more
condensed and easier to scan) would have resulted in these argv arrays being
processed (using single-quotes for strings to avoid introducing backslashes):
first = [
'sh',
'-c',
'for x in "$@"; do echo ": {$x}"; done #',
'for x in "$@"; do echo ": {$x}"; done #', # ignored, shell $0
NULL,
]
second = [
'sh',
'-c',
'for x in "$@"; do echo ": {$x}"; done # "$@"',
'for x in "$@"; do echo ": {$x}"; done #', # ignored, shell $0
'foo', # $1
'bar baz', # $2
NULL,
]
Credential Helpers
The AWS documentation currently up at https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-https-unixes.html has you run:
git config --global credential.helper '!aws codecommit credential-helper $@'
I’ve not used CodeCommit in a while and am not setting it up again now to
confirm, but I believe this is wrong. The same mechanisms are in play for how
git invokes commands (except that without an exclamation mark, if not given a
complete path, the command tried for "foo" will be "git credential-foo"),
so you’ll end up with git invoking:
['sh', '-c',
'aws codecommit credential-helper $@ "$@"',
'aws codecommit credential-helper $@', # ignored, shell $0
'get']
and the shell then invoking:
['aws', 'codecommit', 'credential-helper', 'get', 'get']
Clearly the aws codecommit credential-helper is ignoring extraneous
parameters, and relying upon the available sub-commands being any of
["get", "store", "erase"], none of which contain whitespace, so $@
degenerating to $* here is harmless.
Conclusion
It’s easier than expected to have arbitrary shell in a git alias; you just
need to know about the undocumented, implicit, sometimes-added "$@". That
still doesn’t mean you should do so.
-The Grumpy Troll