
A case of adapting a CLI test command from *NIX to Windows

There is a package I wanted to publish to npmjs.com. That package provides a CLI tool among other features, and contains tests for that tool. Unfortunately, the tests were written with only *NIX in mind, so it wasn’t possible to pass all of them and publish the package from Windows.

The main culprit is inside the following function:

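// Assumed context: exec comes from Node's child_process module, and expect
// is a Chai-style assertion function (an assumption; the imports are not shown).
var exec = require('child_process').exec;
var expect = require('chai').expect;

function runWithInputAndExpect (input, args, expectedOutput, done) {
  // Wrap the input in double quotes (escaping inner quotes) and pipe it to the CLI.
  var command = 'echo "' + input.replace(/"/g, '\\"') + '" | node bin/cli.js ' + args;
  exec(command, function callback (error, stdout, stderr) {
    expect(error).to.be.a('null');
    expect(stderr).to.equal('');
    expect(stdout).to.equal(expectedOutput + '\n');
    done(error);
  });
}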

What makes the difference between *NIX and Windows (as far as we are concerned here) is how the echo command works. First of all, it keeps the quotes! And once we try to remove them, we soon discover that the rabbit hole is quite deep.

And down the rabbit hole we go:


By the way, good ol’ CMD is used here and in many other places in Windows. PowerShell echo (alias to Write-Output) works much closer to *NIX echo, but calling PowerShell seems like a needless complication, so we are not considering that.

Preparing for the journey

To simplify the following examples, let’s make a minimal example that reproduces our issue.

We need something in place of node bin/cli.js that just replies on stdout with whatever it got from stdin. Let’s make a dump.bat file with the following content:

@echo off
find /v ""

The find command is repurposed here: with /v "" it passes the lines it receives on stdin through to stdout, which is good enough for our task.

1. Quoted string

Now our issue:

C:\test> echo "hello world" | dump.bat
"hello world"

You can see that the quotes were passed through as if they were part of the input. The tests were failing because the expected output has no quotes.

2. Let’s remove quotes

C:\test> echo hello world | dump.bat
hello world

So far so good. But some tests are failing with various obscure errors…

C:\test> echo Hello <img alt="alt text" src="http://my.img/here.jpg">! | dump.bat
The system cannot find the file specified.
C:\test> echo <a href="http://my.link">test</a> | dump.bat
| was unexpected at this time.

3. Special characters

It is time to remember that certain characters have special meaning in the CMD console. There is a good summary, although we only care about a small subset of them.

Our main concern is < and >, which in CMD serve as input and output redirection from/to a file. Hence the “cannot find the file” error. The second error appears because there is a > redirection followed by a | pipeline instead of anything that could be interpreted as a file name.

What can we do? We can try to escape these characters as ^< and ^>.

C:\test> echo Hello ^<img alt="alt text" src="http://my.img/here.jpg"^>!
Hello <img alt="alt text" src="http://my.img/here.jpg">!
C:\test> echo ^<a href="http://my.link"^>test^</a^>
<a href="http://my.link">test</a>

So far so good. But you may notice I’ve dropped the pipeline here. Let’s bring it back.

C:\test> echo Hello ^<img alt="alt text" src="http://my.img/here.jpg"^>! | dump.bat
The system cannot find the file specified.
C:\test> echo ^<a href="http://my.link"^>test^</a^> | dump.bat
The syntax of the command is incorrect.

Oops. It doesn’t work the way we might expect…

4. Other way to produce output without quotes

We are still trying to send predefined output without quotes to a pipeline. Is there another way to do it?

Turns out yes, there are several workarounds for this common issue. But not all of them are applicable in our case.

If we were writing a batch script instead of a single command, there are examples of a “DeQuote” helper script. Let’s leave this out unless we become absolutely desperate.

We might be able to produce a command that removes all the quotes from a string with the %var:x=y% replacement syntax. But since we need to keep inner quotes in our case, there is no point exploring this further.
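To see why blanket quote removal doesn't fit, here is a rough JavaScript analogy of what a %var:"=% replacement would do. The helper name stripAllQuotes and the sample string are made up for illustration; this is not CMD code, just a sketch of the effect.

```javascript
// JavaScript analogy (not CMD): removing every double quote, as a
// %var:"=% style replacement would, also destroys the inner quotes.
function stripAllQuotes(text) {
  return text.replace(/"/g, '');
}

const quoted = '"Hello <img alt="alt text">!"';
console.log(stripAllQuotes(quoted)); // Hello <img alt=alt text>!
```

The outer quotes are gone, but so are the ones around the attribute value, so the output no longer matches what the tests expect.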

Then I found this rather creative workaround:

C:\test> echo | set /p="Hello, <span>World</span>!" | dump.bat
Hello, <span>World</span>!

The set command with the /P switch is used to display a prompt and wait for user input (typically to assign it to a variable). But here we feed whatever echo produces (“ECHO is on.”) to its input instead, as we don’t care about setting any variable. The net result is the required string printed to the output without quotes.

Happy end? Well, not yet…

C:\test> echo | set /p="    Hello, <span>World</span>!" | dump.bat
Hello, <span>World</span>!

For whatever reason, on all recent versions of Windows the prompt is trimmed of the leading spaces. This causes some tests to fail again…

5. Trying to preserve leading spaces

If we search for a solution for missing leading spaces, we can find a common workaround - add a non-space character at the beginning and then the backspace character to erase it. The following spaces will be preserved.

It might be tricky to obtain the backspace character in CMD, but luckily we are calling it from JavaScript, and here we have a special sequence for backspace character - \b.

var command = 'echo | set /p=".\b    Hello, <span>World</span>!" | dump.bat';

The result will be visually identical to what we are looking for, but the tests will still fail. Why? Because the two extra characters are still passed around and take part in all string operations.
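A small sketch of that mismatch, assuming the prompt text reaches stdout exactly as it was written (the variable names here are only for illustration):

```javascript
// What the test expects versus what the backspace trick would deliver.
const expected = '    Hello, <span>World</span>!';
const actual = '.\b' + expected;

console.log(actual === expected);                // false
console.log(actual.length - expected.length);    // 2
console.log(JSON.stringify(actual.slice(0, 2))); // ".\b"
```

A terminal renders the backspace by moving the cursor over the dot, but a string comparison still sees both characters.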

6. Escape the escaped

We are getting further and further away from the initial problem. Let’s get back to step 3 and think more about what happens with escaping.

We should keep in mind how the ^ symbol is processed in CMD and what happens at the | border.

Every time the CMD input is parsed, the following happens:

Look at each character from left to right:

If a caret (^), the next character is escaped, and the escaping caret is removed. Escaped characters lose all special meaning (except for <LF>).

After the first round the string no longer contains the carets, but with a pipeline involved it goes through the parser once more…

What can we do about this? Add another layer of escaping. This means writing ^^^ instead of ^. After the first parsing round ^^ becomes ^ and ^< becomes <, which leaves ^<, and that can safely go through the second parsing round.
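The two rounds can be imitated in JavaScript with a deliberately simplified model. The function parseCaretsOnce is a hypothetical helper, not part of CMD or Node, and it ignores quoted regions and every other parsing rule; it only mimics "a caret escapes the next character and is removed".

```javascript
// Simplified model of one CMD parsing round: each caret escapes the
// character after it and is itself dropped. Quoted regions and all other
// CMD parsing rules are ignored in this sketch.
function parseCaretsOnce(text) {
  return text.replace(/\^([\s\S]?)/g, '$1');
}

const typed = 'echo ^^^<b^^^>';
const afterFirstRound = parseCaretsOnce(typed);
const afterSecondRound = parseCaretsOnce(afterFirstRound);

console.log(afterFirstRound);  // echo ^<b^>
console.log(afterSecondRound); // echo <b>
```

With a single caret the first round would already produce a bare <, which the second round would then treat as a redirection.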

C:\test> echo.  Hello ^^^<img alt="alt text" src="http://my.img/here.jpg"^^^>! | dump.bat
  Hello <img alt="alt text" src="http://my.img/here.jpg">!
C:\test> echo.  ^^^<a href="http://my.link"^^^>test^^^</a^^^> | dump.bat
  <a href="http://my.link">test</a>

The dot after echo preserves the leading spaces.

And this is the working solution, finally.

Updated JavaScript code looks like this:

var isWin = process.platform === "win32";

function runWithInputAndExpect(input, args, expectedOutput, done) {
  var command = isWin
    ? 'echo.' + input.replace(/[<>]/g, '^^^$&') + ' | node bin/cli.js ' + args
    : 'echo "' + input.replace(/"/g, '\\"') + '" | node bin/cli.js ' + args;
  exec(command, function callback(error, stdout, stderr) {
    expect(error).to.be.a('null');
    expect(stderr).to.equal('');
    expect(stdout).to.equal(expectedOutput + '\n');
    done(error);
  });
}
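To check the generated command without actually running it, the string building could be pulled out into a separate function. This is only a sketch: buildEchoCommand, its platform parameter and the --flag argument are hypothetical names introduced here, and like the code above it escapes only < and >.

```javascript
// Hypothetical helper: builds the shell command only, so both branches can
// be inspected on any OS. The platform parameter defaults to the current one.
function buildEchoCommand(input, args, platform = process.platform) {
  if (platform === 'win32') {
    // Triple carets survive two CMD parsing rounds; "echo." keeps leading spaces.
    return 'echo.' + input.replace(/[<>]/g, '^^^$&') + ' | node bin/cli.js ' + args;
  }
  // *NIX: wrap the input in double quotes and escape the inner ones.
  return 'echo "' + input.replace(/"/g, '\\"') + '" | node bin/cli.js ' + args;
}

console.log(buildEchoCommand('  <b>"hi"</b>', '--flag', 'win32'));
// echo.  ^^^<b^^^>"hi"^^^</b^^^> | node bin/cli.js --flag
console.log(buildEchoCommand('  <b>"hi"</b>', '--flag', 'linux'));
// echo "  <b>\"hi\"</b>" | node bin/cli.js --flag
```

Passing the platform explicitly makes it possible to look at the Windows command string even when working on another system.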

7. Extra solution

While writing this post, I found another working solution using variables.

C:\test> set "str=  ^<span^>hello^</span^> "world"" & echo %^str% | dump.bat
  <span>hello</span> "world"
C:\test> set str="  ^<span^>hello^</span^> "world""& echo %^str:~1,-1% | dump.bat
  <span>hello</span> "world"

Here, the %^var% syntax delays variable expansion until the next parsing round, similar to escaped strings.

This example also illustrates how a variable can be defined without or with quotes included (pay attention to the location of the first quote), and how the substring syntax can be used to exclude the opening and closing quotes from a variable. Note that it’s important to leave no spaces before the & sign in the second case: they are added to the variable for whatever reason.
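For readers more at home in JavaScript, the %str:~1,-1% substring (skip the first character, drop the last one) behaves roughly like slice(1, -1). A tiny sketch with the same sample value:

```javascript
// JavaScript counterpart of %str:~1,-1%: drop the first and last character.
const str = '"  <span>hello</span> "world""';
const unquoted = str.slice(1, -1);

console.log(unquoted); //   <span>hello</span> "world"
```

Only the outermost pair of quotes is removed; the inner quotes around world stay in place.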

If only this solution didn’t require any escaping, it would probably have been the more robust and preferable one. But since we still have to pay attention to which characters need escaping, the previous solution is the preferable one in my opinion: it’s simpler, with one command less, and we don’t have to worry about getting the variables right.
