[GRADLE-3329] Exec output with long lines containing multibyte UTF-8 broken

Created: 04/Aug/15  Updated: 01/Oct/16  Resolved: 01/Oct/16

Status: Resolved
Project: Gradle
Affects Version/s: None
Fix Version/s: 3.2-rc-1

Type: Bug
Reporter: Lóránt Pintér
Assignee: Lari Hotari
Resolution: Fixed
Votes: 2


 Description   

When output from an `Exec` or a `JavaExec` task contains a long line with multibyte UTF-8 characters, those characters can be split into parts, corrupting the output written to System.out.

This happens because the task's output goes through ExecOutputHandleRunner, which splits it into 2048-byte chunks and passes them to LineBufferingOutputStream. Because ExecOutputHandleRunner flushes the output after every 2048-byte chunk, each chunk is converted to a string on its own, and a multibyte character that straddles a chunk boundary gets corrupted.
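
The effect can be reproduced outside Gradle. The following is a minimal, self-contained sketch, not Gradle's code: the class `ChunkedDecodeDemo` and its method `decodeInChunks` are hypothetical names, and UTF-8 is passed explicitly as the charset, which is an assumption. It decodes a byte array one fixed-size chunk at a time, the way a flush after every chunk would.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it is not part of Gradle.
final class ChunkedDecodeDemo {

    // Decodes every chunk on its own, as if the stream were flushed after each chunk.
    public static String decodeInChunks(byte[] data, int chunkSize) {
        StringBuilder result = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int length = Math.min(chunkSize, data.length - offset);
            // Assumption: UTF-8 is used explicitly here for the demonstration.
            result.append(new String(data, offset, length, StandardCharsets.UTF_8));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // 2047 ASCII bytes followed by the two-byte character U+00E9:
        // its first byte is byte 2048, its second byte is byte 2049.
        char[] padding = new char[2047];
        Arrays.fill(padding, 'a');
        String original = new String(padding) + "\u00e9";
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);

        String chunked = decodeInChunks(bytes, 2048);
        System.out.println(original.equals(chunked));   // prints false
        System.out.println(chunked.contains("\uFFFD")); // prints true
    }
}
```

Each half of the split character is decoded as malformed input and replaced with U+FFFD, so the line comes out with two replacement characters where the original had one accented letter.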

Original report:
https://discuss.gradle.org/t/javaexec-task-mangles-utf-8-output/10975



 Comments   
Comment by Lóránt Pintér [ 04/Aug/15 ]

Added test: https://github.com/gradle/gradle/commit/04ba40cb205fda1b171989b10849577348cf8787

Comment by Lóránt Pintér [ 04/Aug/15 ]

A fix could be to flush in ExecOutputHandleRunner only at the end. This might cause a problem if we use it with a stream that doesn't flush regularly, because output would then not appear regularly. My hunch is that we always use LineBufferingOutputStream, but I need to check.
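
A rough sketch of the flush-at-the-end idea, under stated assumptions: `FlushAtEndSketch`, `DecodeOnFlushStream` and `copy` are hypothetical names, the sink is a simplified stand-in that turns its buffer into text on `flush()`, and UTF-8 is assumed. It is not the actual ExecOutputHandleRunner code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and structure are illustrative only.
final class FlushAtEndSketch {

    // Stand-in for a sink that converts whatever it has buffered to text on flush().
    static final class DecodeOnFlushStream extends ByteArrayOutputStream {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void flush() {
            if (count != 0) {
                text.append(new String(buf, 0, count, StandardCharsets.UTF_8));
            }
            reset();
        }

        public String text() {
            return text.toString();
        }
    }

    // Copies in fixed-size chunks but flushes the destination only once, at end of input.
    public static void copy(InputStream in, OutputStream out, int chunkSize) {
        byte[] buffer = new byte[chunkSize];
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                // No flush here: a character split across two reads stays in the sink's buffer.
                out.write(buffer, 0, read);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < 2047; i++) {
            line.append('a');
        }
        String original = line.append('\u00e9').toString();

        DecodeOnFlushStream sink = new DecodeOnFlushStream();
        copy(new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8)), sink, 2048);
        System.out.println(original.equals(sink.text())); // prints true
    }
}
```

With this stand-in sink nothing is emitted until the copy finishes, which is exactly the concern raised above: the approach only works well if the destination stream emits complete lines on its own as they arrive.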

Comment by Blane Dabney [ 04/Aug/15 ]

It looks like the corruption occurs here: https://github.com/gradle/gradle/blob/master/subprojects/core/src/main/groovy/org/gradle/util/LineBufferingOutputStream.java#L100

    public void flush() {
        if (count != 0) {
            handler.text(new String(buf, 0, count));
        }
        reset();
    }
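
For comparison, one general way to keep chunk-by-chunk decoding from splitting characters is a stateful `java.nio.charset.CharsetDecoder` that carries an incomplete trailing byte sequence over to the next chunk. This is a swapped-in technique shown for illustration, not necessarily what Gradle does or did; `StatefulChunkDecoder` is a hypothetical class, and UTF-8 is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; it is not part of Gradle.
final class StatefulChunkDecoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Bytes of an incomplete trailing sequence, kept until the next chunk arrives.
    private ByteBuffer pending = ByteBuffer.allocate(0);

    // Decodes one chunk and returns only the characters that are complete so far.
    public String decode(byte[] chunk, int offset, int length) {
        ByteBuffer in = ByteBuffer.allocate(pending.remaining() + length);
        in.put(pending).put(chunk, offset, length);
        in.flip();

        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(in.remaining());
        // endOfInput = false: an incomplete trailing sequence is left unread in 'in'.
        decoder.decode(in, out, false);

        byte[] rest = new byte[in.remaining()];
        in.get(rest);
        pending = ByteBuffer.wrap(rest);

        out.flip();
        return out.toString();
    }

    // Call once at end of stream; leftover bytes that never completed become U+FFFD.
    public String finish() {
        CharBuffer out = CharBuffer.allocate(pending.remaining() + 1);
        decoder.decode(pending, out, true);
        decoder.flush(out);
        decoder.reset();
        pending = ByteBuffer.allocate(0);
        out.flip();
        return out.toString();
    }
}
```

Feeding the two halves of a split character in separate `decode` calls yields an empty string for the first call and the whole character for the second, so the concatenated result matches the original text.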

Comment by Blane Dabney [ 04/Aug/15 ]

As far as I can tell, the only reason this gets converted to a String is for logging purposes. The only implementation I could find of TextStream is the private ForwardTextStreamToConnection class within https://github.com/gradle/gradle/blob/master/subprojects/launcher/src/main/java/org/gradle/launcher/daemon/client/DaemonClientInputForwarder.java .

The text() method takes a String argument only so that it can escape newlines when doing debug logging. It immediately converts the String back to bytes and wraps them in a ForwardInput object to pass to the dispatcher. This could be refactored to accept a byte[] directly and only instantiate a String from that when doing debug logging. In that case, a UTF-8 encoding issue would affect only the debug log, which is probably a negligible problem.
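
A minimal sketch of that refactoring idea, with loud assumptions: `ByteHandler`, `ForwardingHandler` and every other name below are hypothetical, the in-memory buffer stands in for the ForwardInput and dispatcher step, and none of this is Gradle's actual API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; it does not mirror Gradle's real classes.
final class ByteForwardingSketch {

    // Hypothetical byte-oriented counterpart of a handler that accepts text.
    interface ByteHandler {
        void bytes(byte[] data);
    }

    static final class ForwardingHandler implements ByteHandler {
        private final boolean debugEnabled;
        private final List<String> debugLog = new ArrayList<>();
        // Stand-in for the dispatcher: collects the forwarded bytes in memory.
        private final ByteArrayOutputStream forwarded = new ByteArrayOutputStream();

        ForwardingHandler(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        @Override
        public void bytes(byte[] data) {
            if (debugEnabled) {
                // A String is built only for the log line. A split character can garble
                // this line, but it cannot affect the bytes that are forwarded.
                String printable = new String(data, StandardCharsets.UTF_8).replace("\n", "\\n");
                debugLog.add("Forwarding input: " + printable);
            }
            forwarded.write(data, 0, data.length);
        }

        public byte[] forwardedBytes() {
            return forwarded.toByteArray();
        }

        public List<String> debugLog() {
            return debugLog;
        }
    }
}
```

Because the payload never takes a round trip through a String, a chunk boundary inside a multibyte character leaves the forwarded bytes untouched.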

Comment by Blane Dabney [ 04/Aug/15 ]

Looks like https://github.com/gradle/gradle/blob/master/subprojects/core/src/main/groovy/org/gradle/logging/internal/DefaultStandardOutputRedirector.java has an implementation of TextStream too, so I probably spoke too soon there.

Comment by Andreas Braumann [ 08/Oct/15 ]

Hi, I just stumbled across this error and would like to know when I can expect the fix to appear in the nightly builds.

Comment by Lari Hotari [ 01/Oct/16 ]

Fixed by https://github.com/gradle/gradle/commit/1746300a
