[GRADLE-2495] Sporadic daemon connection issue on windows Created: 22/Sep/12  Updated: 17/Jan/13  Resolved: 26/Nov/12

Status: Resolved
Project: Gradle
Affects Version/s: None
Fix Version/s: 1.4-rc-1

Type: Bug
Reporter: Gradle Forums Assignee: Unassigned
Resolution: Fixed Votes: 1

Attachments: Zip Archive registry.bin.zip    

 Description   

I'm unable to connect to the Gradle daemon after restarting my computer while the daemon is running. I did not encounter this problem in the 2012-09-03 nightly build, so it might have been introduced recently.

Here is the stack trace of the thrown error: [1]https://gist.github.com/3765908

If you create a JIRA ticket from this topic, I can also attach the registry.bin to the issue (the daemon is not started by Gradle, so there is no log in the daemon directory for this issue).
----------------------------------------------------------------------------------------
[1] https://gist.github.com/3765908



 Comments   
Comment by Attila Kelemen [ 23/Sep/12 ]

I have attached the registry.bin file with which the problem occurs for me.

Comment by Attila Kelemen [ 23/Sep/12 ]

I don't think it is possible to reproduce the issue with the attached registry.bin.

However, I have found an easy way to reproduce the issue (that is, it works for me):

I was not able to reproduce the issue with JDK 6, so it might depend on the Java version. I used JDK 7 Update 5 to reproduce it. I will describe the steps to reproduce using my NetBeans plugin.

1. Download and install NetBeans 7.2
2. Download and install my plugin into NetBeans: https://github.com/kelemen/netbeans-gradle-project
3. Remove everything from the "daemon" directory of Gradle.
4. Configure the plugin at Tools/Options/Miscellaneous/Gradle:

  • Set the gradle installation folder.
  • Set the JDK to JDK 7 (preferably Update 5).
  • I usually add the JVM argument "-Xmx512m" but I believe this is not necessary.

5. Open a Gradle project (File/Open Project)
6. Wait until the project is completely opened (the progress is no longer displayed on the lower right corner of NetBeans)
7. Kill the process of the Gradle daemon.
8. Open another Gradle project (from a different multi-project build, to avoid reading the project from the cache).
9. Once the project has been "loaded", an exclamation mark should be displayed and hovering the mouse over it should display the exception (as returned by toString). The stacktrace of the exception can be found in the logs of NetBeans.

Of course, I believe it should be reproducible without this plugin if you set the Java home and the Gradle installation directory through the Tooling API.

Comment by Attila Kelemen [ 23/Sep/12 ]

Also, I would note that I cannot reproduce the issue by executing Gradle from the command line (using --daemon).

Comment by Attila Kelemen [ 23/Sep/12 ]

I have set verbose logging through the Tooling API and this is the only log I got:

21:23:58.050 [INFO] [org.gradle.tooling.internal.provider.DefaultConnection] Tooling API uses target gradle version: 1.3-20120917220018+0000.
21:23:58.077 [DEBUG] [org.gradle.cache.internal.DefaultFileLockManager] Waiting to acquire shared lock on daemon addresses registry.
21:23:58.077 [DEBUG] [org.gradle.cache.internal.DefaultFileLockManager] Lock acquired.
21:23:58.079 [DEBUG] [org.gradle.cache.internal.DefaultFileLockManager] Releasing lock on daemon addresses registry.
21:23:58.079 [DEBUG] [org.gradle.messaging.remote.internal.inet.TcpOutgoingConnector] Attempting to connect to [de98bc0d-a39a-4398-89b6-31276a8a6759 port:12686, addresses:[/127.0.0.1, /0:0:0:0:0:0:0:1]].
21:23:58.079 [DEBUG] [org.gradle.messaging.remote.internal.inet.TcpOutgoingConnector] Trying to connect to address /127.0.0.1.
21:23:59.081 [DEBUG] [org.gradle.messaging.remote.internal.inet.TcpOutgoingConnector] Cannot connect to address /127.0.0.1, skipping.
21:23:59.081 [DEBUG] [org.gradle.messaging.remote.internal.inet.TcpOutgoingConnector] Trying to connect to address /0:0:0:0:0:0:0:1.
21:23:59.081 [DEBUG] [org.gradle.messaging.remote.internal.inet.TcpOutgoingConnector] Cannot connect to address /0:0:0:0:0:0:0:1, skipping.

Comment by Attila Kelemen [ 23/Sep/12 ]

I have checked the sources of Gradle and I think I was able to track down the problem (although I don't know why it did not cause any error in previous releases):

1. TcpOutgoingConnector will cause a RuntimeException to be thrown if connection fails due to a SocketException.
2. No one tries to recover from this RuntimeException, so no attempt is made to spawn a new daemon instance (and no other daemons in the registry are tried).

A simple solution would be to replace the RuntimeException with ConnectException in line 66 of TcpOutgoingConnector.java.

Here is my proposed fix for this issue: https://gist.github.com/3773036. I have not checked this code at all (not even whether it compiles), although I'm pretty confident in it.
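
To illustrate the difference described in this comment, here is a minimal, self-contained sketch. It is not Gradle's code: the Connector interface, the openSocket helper and the address names are hypothetical stand-ins. It assumes a lookup loop that treats only ConnectException as "this daemon is gone, try the next one", which is how the comment describes the situation.

```java
import java.net.ConnectException;
import java.net.SocketException;
import java.util.Arrays;
import java.util.List;

class ConnectRetrySketch {

    /** Hypothetical stand-in for a connector; not Gradle's real interface. */
    interface Connector {
        String connect(String address) throws ConnectException;
    }

    /**
     * Tries each address in order. Only ConnectException is treated as
     * "this daemon is unavailable, try the next one". Returns null when no
     * address could be reached, which is where a caller would start a new daemon.
     */
    static String connectToFirstAvailable(List<String> addresses, Connector connector) {
        for (String address : addresses) {
            try {
                return connector.connect(address);
            } catch (ConnectException e) {
                // Recoverable: skip this entry and keep searching.
            }
        }
        return null;
    }

    /** Simulates a low-level connect that fails with SocketException for stale entries. */
    static void openSocket(String address) throws SocketException {
        if (address.startsWith("stale")) {
            throw new SocketException("simulated failure for " + address);
        }
    }

    /** Reported behaviour: the failure is wrapped in a RuntimeException. */
    static final Connector WRAPS_IN_RUNTIME_EXCEPTION = address -> {
        try {
            openSocket(address);
            return "connected to " + address;
        } catch (SocketException e) {
            throw new RuntimeException(e);
        }
    };

    /** Proposed behaviour: the failure is reported as a ConnectException. */
    static final Connector THROWS_CONNECT_EXCEPTION = address -> {
        try {
            openSocket(address);
            return "connected to " + address;
        } catch (SocketException e) {
            ConnectException converted = new ConnectException(e.getMessage());
            converted.initCause(e);
            throw converted;
        }
    };

    public static void main(String[] args) {
        List<String> registry = Arrays.asList("stale-daemon", "live-daemon");
        // The stale entry is skipped and the live one is used.
        System.out.println(connectToFirstAvailable(registry, THROWS_CONNECT_EXCEPTION));
        try {
            connectToFirstAvailable(registry, WRAPS_IN_RUNTIME_EXCEPTION);
        } catch (RuntimeException e) {
            // The search stops at the first stale entry.
            System.out.println("search aborted: " + e.getCause());
        }
    }
}
```

With the wrapping connector, the loop never reaches the second entry, which matches the symptom that no other daemon is tried and no new one is started.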

Comment by Szczepan Faber [ 24/Sep/12 ]

Thanks for the detailed report and the investigation! We'll look into it.

Comment by Attila Kelemen [ 15/Nov/12 ]

1.3-rc-1 still has the same problem. Here is the stacktrace when using 1.3-rc-1: https://gist.github.com/4080509

Comment by Szczepan Faber [ 16/Nov/12 ]

Hey Attila,

I know that fix seemed easy, but it didn't make it into 1.3. It should be fixed in master, though. Can you try the nightly?

Thanks for your patience!

Comment by Attila Kelemen [ 16/Nov/12 ]

I will try it, but give me some time to test it, because it seems that how easy it is to reproduce the issue also depends on the updates applied to Windows.

Besides this, I believe the current fix is imperfect. Consider the following scenario:

1. SocketChannel.open in TcpOutgoingConnector.java throws a SocketException. This is to be expected since it is a checked exception and this is what happens in my case.
2. A RuntimeException will be thrown, which is first caught in DefaultDaemonConnector.connectToDaemon/1 and is rethrown after "onFailure.run()".
3. This rethrown RuntimeException will not be caught again, so retries will be skipped (DaemonClient.execute/2).

I guess you are hesitant because you can't make SocketChannel.open throw SocketException instead of java.net.ConnectException, but I don't see why they need to be treated any differently. If you treat them the same, there is no other code path to test, and you haven't tested SocketException in the past anyway. The only problem I can see is that open may throw ClosedByInterruptException if the executing thread is interrupted, and you might want to treat this particular exception differently if you use thread interruption for cancellation.
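
The distinction suggested here can be sketched as a small classifier. This is only an illustration under stated assumptions: the ConnectFailureClassifier class and its Outcome enum are hypothetical names and are not part of Gradle. The sketch treats every IOException from a connection attempt as a reason to retry, except ClosedByInterruptException, which is taken to mean cancellation.

```java
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;

class ConnectFailureClassifier {

    /** Hypothetical outcomes for a failed connection attempt. */
    enum Outcome { RETRY, CANCELLED }

    /**
     * SocketException, ConnectException and other IOExceptions are handled
     * the same way (retry). Only an interrupt-driven close is singled out,
     * on the assumption that thread interruption is used for cancellation.
     */
    static Outcome classify(IOException failure) {
        if (failure instanceof ClosedByInterruptException) {
            return Outcome.CANCELLED;
        }
        return Outcome.RETRY;
    }
}
```

Because ConnectException is a subclass of SocketException, and both are IOExceptions, a single catch of IOException covers both cases without a separate code path.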

Comment by Attila Kelemen [ 20/Nov/12 ]

For some reason, Windows currently does not want to throw SocketException and I cannot reproduce the issue (not even in 1.3-rc-1) as I did before.

However, I tried the following:

1. Blocked all connections with a firewall.
2. Attempted to load a project.
3. IOException is thrown (instead of SocketException).

After these, it seems to me that Gradle still gives up after the first failure. I assume this because otherwise it used to throw an exception with something like "I gave up after 100 failed attempts".
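
The reasoning above can be made concrete with a small sketch of a bounded retry loop. It is not Gradle's implementation: the BoundedRetrySketch class, the Attempt interface and the message text are assumptions made for illustration. A loop like this only reports "gave up after N attempts" when each failure is caught inside the loop; an exception type the loop does not catch ends it on the first attempt.

```java
import java.io.IOException;

class BoundedRetrySketch {

    /** Hypothetical action that may fail with an I/O error. */
    interface Attempt<T> {
        T run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times. IOExceptions are caught and
     * the attempt is repeated; any unchecked exception propagates immediately.
     */
    static <T> T retry(int maxAttempts, Attempt<T> attempt) {
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException(
                "Gave up after " + maxAttempts + " failed attempts", last);
    }
}
```

If the connection failure surfaces as an IOException inside such a loop, all attempts are used before giving up. If it surfaces as a RuntimeException, the loop exits after one attempt, which is consistent with the behaviour observed here.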

I have also adjusted my proposal: https://gist.github.com/3773036. It now actually compiles, but I have not tested it otherwise. Is there a way to build and package Gradle without executing any tests?

Comment by Szczepan Faber [ 21/Nov/12 ]

Hey Attila!

Thanks a lot for debugging. I paired on this issue with Adam yesterday and we converged on a solution. I just need to push it out.

Generated at Wed Jun 30 12:23:30 CDT 2021 using Jira 8.4.2#804003-sha1:d21414fc212e3af190e92c2d2ac41299b89402cf.