[GRADLE-2867] Copy task extremely slow
Created: 13/Aug/13  Updated: 10/Feb/17  Resolved: 10/Feb/17

Status: Resolved
Project: Gradle
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Gradle Forums
Assignee: Unassigned
Resolution: Won't Fix
Votes: 2


 Description   

Using Gradle 1.7
I have a task to install some build products into a specific folder so they are ready to use for testing.

task installSDK(type: Copy, dependsOn: publishToMavenLocal) {
    from project.stagingDir
    into System.getenv('MY_TEST_DIR')
}

This task only has to copy about 75MB into a few subdirectories within the target folder. The "stagingDir" has already been set up with the exact directory structure that I want to copy over. The target folder is specified by the environment variable MY_TEST_DIR, because other developers will have it configured differently on their computers. It is already populated with a TON of other files, 99.9% of which are completely independent of what I have built in the "stagingDir".

This copy operation takes literally several minutes on a non-SSD drive (I'm still waiting for it to complete so I can give you a more precise number; the disk is thrashing away as I write... ah, it finished in about 20 minutes). On a system with an SSD drive it takes 40 seconds. It is only copying 75MB, so it should take a few seconds at most. It seems that the fact that the target location already has a lot of unrelated files slows it down. Is it scanning the entire target folder to build checksums for every existing file or something, rather than only checking the paths that actually match the files it will be copying?

Copying the same files to the "stagingDir" in the first place took only a second or two. But that directory would be either empty or populated with older versions of the same files and nothing more.



 Comments   
Comment by Gradle Forums [ 13/Aug/13 ]

Note:

If the target directory is empty it takes about 42 seconds

Adding outputs.upToDateWhen { false } to the copy task sped it up to take only 55 seconds when the target directory was populated like it normally is.
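
A sketch of the task from the description with that line added. The placement inside the task's configuration block is an assumption, as the thread does not show the modified task:

```groovy
// The installSDK task from the description, with the up-to-date check disabled.
// publishToMavenLocal and project.stagingDir come from the original build script.
task installSDK(type: Copy, dependsOn: publishToMavenLocal) {
    from project.stagingDir
    into System.getenv('MY_TEST_DIR')
    // Never treat the outputs as up to date, so the task runs on every invocation
    outputs.upToDateWhen { false }
}
```

This forces the copy to run on every build instead of being skipped as UP-TO-DATE.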

When the target folder contains only a copy of the files copied from a previous run, it takes < 20 seconds (with and without the outputs.upToDateWhen { false }).

Clearly Gradle is doing too much work with irrelevant files.

Comment by Gradle Forums [ 13/Aug/13 ]

How many files are we talking? From what I know, after the task has completed Gradle lists all files in the target directory to find the ones written by the task (by their last changed date) so that it can hash them.
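
If the time really does go into scanning the destination after the task runs, one possible workaround is to do the copy from a plain task that declares no outputs. The following is only a sketch under that assumption and has not been verified against Gradle 1.7. It reuses the names from the description, and the copy { } call is the project-level copy method rather than the Copy task type:

```groovy
// Hypothetical variant of the installSDK task from the description.
// A plain task with no declared inputs or outputs: there is no output
// directory for Gradle to track, and the task runs on every invocation.
task installSDK(dependsOn: publishToMavenLocal) {
    doLast {
        // project.copy performs the copy at execution time
        copy {
            from project.stagingDir
            into System.getenv('MY_TEST_DIR')
        }
    }
}
```

The trade-off is that such a task is never reported as UP-TO-DATE, so the files are copied on every build even when nothing has changed.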

Comment by Gradle Forums [ 13/Aug/13 ]

It shouldn't take 20 minutes to list the files either. There are a lot though. The target directory has around 50000 files, but fewer than 2000 are copied by the task (1700 are HTML javadoc files).

Comment by Gradle Forums [ 13/Aug/13 ]

Which OS/FS? Some file systems are known to get unbearably slow when dealing with tens of thousands of files in the same directory. Have you questioned whether you need to go through a staging directory at all? Don't the files end up in an archive anyway?

Comment by Gradle Forums [ 13/Aug/13 ]

The files are not all in the same directory. The destination folder has a tree containing hundreds of sub-folders. It is Windows 7 with NTFS.

The staging directory (the source of this copy, with 2k files) is what we archive, but the destination of this copy is an "install" directory (with 50k files) where the runtime environment is configured for testing. So we need the staging directory to assemble the source of the build artifact.

Comment by Benjamin Muschko [ 15/Nov/16 ]

As announced on the Gradle blog we are planning to completely migrate issues from JIRA to GitHub.

We intend to prioritize issues that are actionable and impactful while working more closely with the community. Many of our JIRA issues are inactionable or irrelevant. We would like to request your help to ensure we can appropriately prioritize JIRA issues you’ve contributed to.

Please confirm that you still advocate for your JIRA issue before December 10th, 2016 by:

  • Checking that your issues contain requisite context, impact, behaviors, and examples as described in our published guidelines.
  • Leaving a comment on the JIRA issue or opening a new GitHub issue confirming that the above is complete.

We look forward to collaborating with you more closely on GitHub. Thank you for your contribution to Gradle!

Comment by Benjamin Muschko [ 10/Feb/17 ]

Thanks again for reporting this issue. We haven't heard back from you after our inquiry from November 15th. We are closing this issue now. Please create an issue on GitHub if you still feel passionate about getting it resolved.
