[GRADLE-2444] Gradle daemon becomes unusable Created: 23/Aug/12 Updated: 04/Jan/13 Resolved: 16/Oct/12 |
|
Status: | Resolved |
Project: | Gradle |
Affects Version/s: | None |
Fix Version/s: | 1.3-rc-1 |
Type: | Bug | ||
Reporter: | Gradle Forums | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Attachments: | console.debug.zip daemon.from.commandline.zip daemon.zip daemon2.zip gradle.console.zip logs.from.commandline.zip |
Description |
After using the Gradle daemon a lot, it eventually becomes unusable. By this I mean that when I try to use it, it waits a long time and eventually reports an error saying that it has tried to connect to 100 different Gradle daemons without success. Usually no java processes are running (though sometimes there are java processes executing the Gradle daemon), and deleting the "daemon" directory in the ".gradle" directory solves the problem. This problem is rather frustrating because the Tooling API requires me to use the Gradle daemon; that is, I see no way to avoid using it. I believe I currently use the binaries for the Tooling API found in Gradle 1.0. It does not seem to matter which version (1.0 or 1.1) I specify using the `GradleConnector.useInstallation` method. |
Comments |
Comment by Gradle Forums [ 23/Aug/12 ] |
Please provide the exact error message and stack trace. Which OS? How are you leveraging the daemon/tooling API? |
Comment by Gradle Forums [ 23/Aug/12 ] |
Currently I cannot give you a stack trace because regrettably I did not save it. Next time it occurs, I will. Note, however, that this problem does not occur very frequently, but once it does, it persists until the daemon folder is deleted (not even a system restart helps). I have seen this issue on Windows systems (7, Vista, XP) and have not tried others. I use the Tooling API to provide a plugin for NetBeans: [1]https://github.com/kelemen/netbeans-g... All code relevant to this issue can be found here: [2]https://github.com/kelemen/netbeans-g... |
Comment by Gradle Forums [ 23/Aug/12 ] |
Luckily, I found the stack trace in the log of NetBeans: org.gradle.launcher.daemon.client.NoUsableDaemonFoundException: Unable to find a usable idle daemon. I have connected to 100 different daemons but I could not use any of them to run build: Build {id=ee7502e0-aaa5-4be1-b0bd-e86870f7d8ac.1, currentDir=C:\Program Files\NetBeans 7.2}. |
Comment by Gradle Forums [ 23/Aug/12 ] |
Attila, do you think it would be possible to reproduce the issue in a reasonable time? Can you provide the steps? The next time the problem occurs, before deleting the daemon folder, can you zip it up and attach it to the JIRA ticket (I'll create the ticket)? The daemon folder should contain logs from the daemons. |
Comment by Attila Kelemen [ 23/Aug/12 ] |
I have attached the daemon folder when the problem occurred. |
Comment by Attila Kelemen [ 24/Aug/12 ] |
I have attached what Gradle wrote to the console when using "-s -debug". I did this by running Gradle from NetBeans, however (because for some reason the problem did not appear when running from the command line, though it usually does). It simply tries to connect to a bunch of ports on localhost, which it cannot do (obviously, since no process named "java" is running, so there is no daemon). The error message "A létező kapcsolatot a távoli állomás kényszerítetten bezárta" means "The existing connection has been forcibly closed by the remote host". I assume that this is WSAECONNRESET (10054). |
Comment by Szczepan Faber [ 24/Aug/12 ] |
Thanks a lot for the logs. I'll schedule some time next week to dig into this problem further. I'll get back to you. Cheers! |
Comment by Szczepan Faber [ 27/Aug/12 ] |
Hey Attila, The gradle console output does not seem to be a debug output. Are you sure you've run with --debug? It would be very useful to see the debug log. If you happen to reproduce the issue with --debug please also attach the matching daemon logs (from the daemon folder). Thanks a lot for help! |
Comment by Attila Kelemen [ 27/Aug/12 ] |
I think I have misread it and used "-debug", sorry. I'll do a different run as soon as I'm able. |
Comment by Attila Kelemen [ 28/Aug/12 ] |
I have executed Gradle with " What I have tried:
|
Comment by Szczepan Faber [ 28/Aug/12 ] |
Thanks a lot for the info. If the problem occurs, can you attach the zip with the corrupted daemon folder? For logging, when you use the Tooling API in the NB plugin, can you temporarily enable verbose logging? You need to cast the GradleConnector to DefaultGradleConnector and use the setVerboseLogging() method. >I have executed Gradle with "s" and "-debug" argument with the Tooling API but it did not write anything else You can try using the methods on LongRunningOperation to capture the standard output / standard error of the Tooling API execution. Tell me if it helps with getting the logs. >What is the difference from calling the daemon from the command line or from the Tooling API? The difference is that the client jar is embedded in the host application. However, it shouldn't be a problem. Depending on how you configure the daemon via the GradleConnector, they might use a different Gradle version, a different Gradle user home, etc. Can you tell me what happens if you run builds from the command line when the problem starts occurring? If command line Gradle uses the same version / Gradle user home as your embedded implementation, then you should be able to reuse the same daemon that is used in NB. This should give us a way to reproduce the problem from the command line and get a nice --debug log. Thanks a lot for helping out! |
Comment by Attila Kelemen [ 28/Aug/12 ] |
The already attached daemon folder reproduces the problem for me but seeing the debug output, it is very likely that it will only reproduce for me. These are the additional logs being emitted with verbose logging:
As you can see, I have multiple JDKs installed and I use a different JDK with NetBeans than the one specified by JAVA_HOME (this is unintentional, but I think this could cause the problem for Gradle). |
Comment by Attila Kelemen [ 28/Aug/12 ] |
>Can you tell me what happens if you run builds from command line when the problem starts occurring? If command line Gradle uses the same version / gradle user home as your embedded implementation then you should be able to reuse the same daemon that is used in NB. This should give us a way to reproduce the problem from command line and get a nice --debug log. I could not reproduce it from the command line for a long time, so I cannot tell you what happens with "--debug", but otherwise it is completely the same: it waits a long time, then prints that it has failed after trying a hundred times. I don't think I will be able to reuse the Gradle daemon because when the problem appears, there are no java processes at all, so there is no daemon to reuse. |
Comment by Szczepan Faber [ 28/Aug/12 ] |
Different java homes are not a problem. The thing is that you can connect to an existing idle daemon only if your java home / java args match. If they don't match, then a new daemon will be forked. If you make the java home consistent, you should be able to reuse the same daemon and maybe reproduce the issue from the command line. Is the above log from when the problem occurs? Can you attach a complete one? The above is not conclusive, because looking for a different daemon due to a different java is a regular situation and shouldn't be a problem. |
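The matching rule described in this comment can be sketched as follows. This is only an illustration, not Gradle's implementation: the class `DaemonCompatibility`, its `Context` type, the example paths and the plain string comparison of java homes are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names and comparison rules are assumptions, not Gradle's code.
final class DaemonCompatibility {

    /** What a client asks for, or what an idle daemon was started with. */
    static final class Context {
        final String javaHome;
        final List<String> jvmArgs;

        Context(String javaHome, List<String> jvmArgs) {
            this.javaHome = javaHome;
            this.jvmArgs = Collections.unmodifiableList(new ArrayList<>(jvmArgs));
        }
    }

    /** An idle daemon is reusable only if both the java home and the JVM args match. */
    static boolean canReuse(Context requested, Context idleDaemon) {
        return requested.javaHome.equals(idleDaemon.javaHome)
                && requested.jvmArgs.equals(idleDaemon.jvmArgs);
    }

    public static void main(String[] args) {
        Context netBeans = new Context("C:/jdk-a", Collections.singletonList("-Xmx512m"));
        Context commandLine = new Context("C:/jdk-b", Collections.singletonList("-Xmx512m"));
        // Different java homes: a client would fork a new daemon instead of reusing this one.
        System.out.println(canReuse(netBeans, commandLine)); // prints "false"
    }
}
```

Under this rule a mismatch is a normal outcome that leads to a new daemon being started, which is why a log line about looking for a different daemon does not by itself indicate the failure.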
Comment by Attila Kelemen [ 28/Aug/12 ] |
I have attached the debug logs written by Gradle and the daemon folder which I currently use to reproduce the issue. Currently if I run gradle with --daemon from the command line, I cannot reproduce this issue, even though if I run gradle from NetBeans, it does occur. I use the same, installed Gradle distribution for both cases. |
Comment by Attila Kelemen [ 28/Aug/12 ] |
There is actually a difference when I'm executing Gradle from the command line: I still use the tooling api from Gradle 1.0 in NetBeans while in the command line, I use 1.1. |
Comment by Attila Kelemen [ 28/Aug/12 ] |
I was able to reproduce it from the command line and attached its logs. It does not seem to be much different, except that it has a different call stack. |
Comment by Attila Kelemen [ 28/Aug/12 ] |
I have also attached my daemon directory with which I was able to reproduce the problem from the command line. Hopefully with this, you might be able to reproduce it as well. |
Comment by Szczepan Faber [ 28/Aug/12 ] |
Thanks a bunch! I'll take a look. |
Comment by Szczepan Faber [ 28/Aug/12 ] |
From what I read in the logs, here's what happens: no daemon is running, but the client still attempts to connect to some address found in the registry. The address is 'connectable', but the daemon on the other side of the wire does not exist (or, unlikely, died before writing anything to its log). The problem is reported on the Windows platform. I haven't reproduced the issue, but I wasn't trying very hard, either. Here are some options: 1. If we receive an exception (IOException?) when sending build request to the daemon we: I'm tempted to go with 1.1 & try harder to reproduce the issue. It would be very useful to find out why the daemon gets corrupted, or why it disappears (no java process) but is still connectable. |
Comment by Attila Kelemen [ 28/Aug/12 ] |
My current guess is that it gets corrupted when I shut down Windows while the daemon is still running: possibly Windows does not allow it to write to its file (perhaps because it is a file in the user dir and the daemon gets killed after log out?) |
Comment by Attila Kelemen [ 29/Aug/12 ] |
I did a little investigation: I tried to connect to the port where Gradle believes that the daemon should listen for connections. From a 32-bit app, the connection attempt to this port fails. From a 64-bit app, the connection succeeds (but reading from it is not possible). I tried "netstat -a -b" and it did not report that anyone is listening on that particular port; however, it did print "x: Windows Sockets initialization failed: 5" a few times. What this actually means is a complete mystery to me. Regardless, if Gradle cannot read from a TCP connection (or cannot parse what it received), it should be safe to assume that no Gradle daemon is listening on that port. I mean, nothing should go wrong on localhost (and I can't see how attaching to a daemon running on a remote host could possibly work, since they do not share the "registry.bin" with the client). What is far worse is that I cannot start a server and listen on this port, not even after a system restart. This means that eventually no port will remain. However, this is not a problem with Gradle (i.e., it should not be able to do such a thing). |
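The experiment described in this comment (connect to the registered port, then try to read) can be sketched with plain sockets. This is a hypothetical helper, not Gradle code: the class `DaemonPortProbe`, its result names and the idea of treating a single readable byte as proof of a live peer are assumptions for illustration, since the real daemon protocol is not described in this thread.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch of the manual experiment; names are assumptions, not Gradle's API.
final class DaemonPortProbe {

    enum Result {
        NOT_CONNECTABLE, // connect failed or timed out
        SILENT,          // connect succeeded, but nothing could be read
        RESPONDED        // connect succeeded and the peer sent at least one byte
    }

    /** Probes a port on the loopback interface; timeoutMillis should be greater than zero. */
    static Result probe(int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            try {
                socket.connect(
                        new InetSocketAddress(InetAddress.getLoopbackAddress(), port),
                        timeoutMillis);
            } catch (IOException e) {
                return Result.NOT_CONNECTABLE;
            }
            try {
                // A read timeout, a connection reset or end-of-stream all count as SILENT.
                socket.setSoTimeout(timeoutMillis);
                return socket.getInputStream().read() >= 0 ? Result.RESPONDED : Result.SILENT;
            } catch (IOException e) {
                return Result.SILENT;
            }
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if closing fails.
            }
        }
    }
}
```

In these terms, the 32-bit attempt above corresponds to `NOT_CONNECTABLE` and the 64-bit attempt to `SILENT`; a client following the suggestion in this comment would treat both outcomes as "no daemon on this port".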
Comment by Szczepan Faber [ 29/Aug/12 ] |
Thanks a lot for the investigation - it is very helpful. |
Comment by Adam Murdoch [ 01/Sep/12 ] |
My guess for what's happening here: 1. Daemon starts up, chooses a port and writes this to the registry. @Szczepan, we should tidy up a few things here: 1. Daemon should attempt to remove its entry from the registry on JVM exit. |
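The first tidy-up item, removing the daemon's registry entry on JVM exit, could look roughly like the sketch below. It is an assumption-laden illustration: `DaemonRegistrySketch` is a hypothetical in-memory stand-in for the on-disk registry, and the method names are invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the daemon registry; not Gradle's implementation.
final class DaemonRegistrySketch {

    private final Map<String, Integer> portsByDaemonId = new ConcurrentHashMap<>();

    void store(String daemonId, int port) {
        portsByDaemonId.put(daemonId, port);
    }

    void remove(String daemonId) {
        portsByDaemonId.remove(daemonId);
    }

    boolean contains(String daemonId) {
        return portsByDaemonId.containsKey(daemonId);
    }

    /**
     * Stores the entry and registers a JVM shutdown hook that removes it again.
     * Returns the hook so that callers can deregister it or trigger it in tests.
     */
    Thread storeWithExitCleanup(String daemonId, int port) {
        store(daemonId, port);
        Thread hook = new Thread(() -> remove(daemonId), "daemon-registry-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        DaemonRegistrySketch registry = new DaemonRegistrySketch();
        registry.storeWithExitCleanup("daemon-1", 50000);
        System.out.println(registry.contains("daemon-1")); // prints "true"
        // On a normal JVM exit the hook runs and removes the entry.
    }
}
```

Shutdown hooks do not run when the JVM is terminated abruptly, so a cleanup like this can only be a best-effort measure; the client would still need to cope with stale entries left behind by a killed daemon.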
Comment by Szczepan Faber [ 03/Sep/12 ] |
@Attila, Can you try the latest nightly? (http://gradle.org/nightly) The problem should be resolved now. |
Comment by Szczepan Faber [ 03/Sep/12 ] |
I think you can use the older version of the Tooling API (say 1.1). The important thing is that the target Gradle version you connect to is the nightly. It would be very useful if you tried the nightly and told us if the problem is gone! Thanks! |
Comment by Attila Kelemen [ 03/Sep/12 ] |
I have tried it with the Tooling API I currently use (1.0) and the nightly build set via "useInstallation". I could not reproduce it on the first try, which is a good sign, but I will be more confident after a few days. However, I wanted to replace "registry.bin" with the one attached to see whether I can reproduce the problem with it. Attempting to load the model failed with the following exception: java.io.InvalidClassException: org.gradle.launcher.daemon.registry.DaemonInfo; local class incompatible: stream classdesc serialVersionUID = 1483185088625753413, local class serialVersionUID = 231259346409640868 Although this is unlikely to occur in practice, it would be nice if it were fixed as well. |
Comment by Attila Kelemen [ 03/Sep/12 ] |
Here is the full stack trace:
|
Comment by Szczepan Faber [ 03/Sep/12 ] |
Hey Attila, The error you see is expected. At the moment we don't provide any backwards/forwards compatibility for the daemon registry format (it's using standard Java serialization). This is something we want to do in the future (it is actually being discussed on the dev mailing list). Thanks a lot for taking the nightly for a spin! Let us know if you run into the problem. Unfortunately, you cannot use the old registry to reproduce the issue. |
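Because the registry uses standard Java serialization, a file written by another version can fail to deserialize, as the InvalidClassException reported earlier in this thread shows. One hypothetical way a reader could degrade gracefully is to treat an unreadable registry as empty. The sketch below is not what Gradle does; `TolerantRegistryReader` and its `Entry` type are assumptions for illustration, and it works on byte arrays rather than on the real "registry.bin".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; class and method names are assumptions, not Gradle's code.
final class TolerantRegistryReader {

    static final class Entry implements Serializable {
        // An explicit serialVersionUID keeps the stream identity stable across recompiles.
        private static final long serialVersionUID = 1L;
        final int port;

        Entry(int port) {
            this.port = port;
        }
    }

    /** Serializes the entries with standard Java serialization. */
    static byte[] write(List<Entry> entries) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(entries));
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the entries, returning an empty list if the data cannot be deserialized. */
    @SuppressWarnings("unchecked")
    static List<Entry> readOrEmpty(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (List<Entry>) in.readObject();
        } catch (IOException | ClassNotFoundException | ClassCastException e) {
            // InvalidClassException and StreamCorruptedException are both IOExceptions.
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) {
        byte[] good = write(Collections.singletonList(new Entry(50000)));
        System.out.println(readOrEmpty(good).size());                  // prints "1"
        System.out.println(readOrEmpty(new byte[] {1, 2, 3}).size());  // prints "0"
    }
}
```

The trade-off of this design is that entries for daemons started by another version become invisible to the reader, so those daemons would not be reused.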