I am issuing many parallel robocopy calls to copy files from one network share into one directory. Since the files are read-only, I tell robocopy to strip the read-only attribute in the target directory via /A-:R. It seems that on some many-core machines (12 cores or more) the target directory itself gets locked for up to 16 s.
This problem surfaces when concurrent MSBuild tasks are running and the CopyFile task is executed on read-only files. It also happens when robocopy is run in parallel to download dependencies for a TFS build from a network share. Since all these issues point to kernel32 CopyFile (or its private implementation), I suspect that the problem is related to how Windows copies files.
This does not seem to be a general issue in the kernel, since the temp folder depends on concurrent access to a directory being possible. But the user-mode implementation of CopyFile inside kernel32.dll seems to be flawed.
Update 2
With the repro below, this happens regardless of whether the file is read-only.
Update 3
The repro shows the same issue on Windows 8.
The procmon stack traces show that the magic happens in kernel32.dll inside PrivCopyFileExW, which appears to be undocumented. There an IRP_MJ_CREATE call is issued to open the directory, and a little later the directory is closed again. This seems to be the root cause of the race condition when many parallel robocopy processes try to copy files into one directory.
Here is some procmon output showing what this problem looks like.
Why on earth does PrivCopyFileExW manage to lock the directory? A file system should be able to support copying files into one directory concurrently. I am using Windows Server 2008 R2 on recent multi-core machines with RAID arrays, SSDs and the like.
This seems to be related to previously reported problems with CopyFile in kernel32.dll which have not been solved to this day. I can rule out virus scanners because this also happens on machines which have none installed.
Update 1
It seems that one robocopy process tries to copy a file into the destination directory, which opens the directory without any sharing (ShareMode: None):
Date & Time: 20.03.2012 08:30:06
Event Class: File System
Operation: CreateFile
Result: SUCCESS
Path: C:\temp\dest
TID: 11672
Duration: 0.0000150
Desired Access: Read Data/List Directory, Write Data/Add File, Write EA, Read Attributes, Write Attributes, Delete, Synchronize
Disposition: OpenIf
Options: Directory, Synchronous IO Non-Alert, Open For Backup
Attributes: D
ShareMode: None <---- No sharing
AllocationSize: 0
OpenResult: Opened
0 fltmgr.sys FltpPerformPreCallbacks + 0x2f7 0xfffff88001045027 C:\Windows\system32\drivers\fltmgr.sys
1 fltmgr.sys FltpPassThroughInternal + 0x4a 0xfffff880010478ca C:\Windows\system32\drivers\fltmgr.sys
2 fltmgr.sys FltpCreate + 0x293 0xfffff880010652a3 C:\Windows\system32\drivers\fltmgr.sys
3 ntoskrnl.exe IopParseDevice + 0x5a7 0xfffff800031cb537 C:\Windows\system32\ntoskrnl.exe
4 ntoskrnl.exe ObpLookupObjectName + 0x585 0xfffff800031c1ba4 C:\Windows\system32\ntoskrnl.exe
5 ntoskrnl.exe ObOpenObjectByName + 0x1cd 0xfffff800031c6b7d C:\Windows\system32\ntoskrnl.exe
6 ntoskrnl.exe IopCreateFile + 0x2b7 0xfffff800031cd647 C:\Windows\system32\ntoskrnl.exe
7 ntoskrnl.exe NtCreateFile + 0x78 0xfffff800031d7398 C:\Windows\system32\ntoskrnl.exe
8 ntoskrnl.exe KiSystemServiceCopyEnd + 0x13 0xfffff80002eca813 C:\Windows\system32\ntoskrnl.exe
9 ntdll.dll NtCreateFile + 0xa 0x7718fc0a C:\Windows\System32\ntdll.dll
10 kernel32.dll BaseCopyStream + 0x11a9 0x77034b89 C:\Windows\System32\kernel32.dll
11 kernel32.dll BasepCopyFileExW + 0x545 0x77033d85 C:\Windows\System32\kernel32.dll
12 kernel32.dll PrivCopyFileExW + 0xb6 0x770b5296 C:\Windows\System32\kernel32.dll
13 Robocopy.exe CZDir::CopyData + 0xb5 0xff8623a9 C:\Windows\System32\Robocopy.exe
14 Robocopy.exe RoboCopyDir + 0xe4 0xff85af00 C:\Windows\System32\Robocopy.exe
15 Robocopy.exe Walk + 0x12a 0xff85c6b6 C:\Windows\System32\Robocopy.exe
16 Robocopy.exe wmain + 0x4f4 0xff85de78 C:\Windows\System32\Robocopy.exe
17 Robocopy.exe operator+ + 0x19b 0xff867be5 C:\Windows\System32\Robocopy.exe
18 kernel32.dll BaseThreadInitThunk + 0xd 0x7703f33d C:\Windows\System32\kernel32.dll
19 ntdll.dll RtlUserThreadStart + 0x1d 0x77172ca1 C:\Windows\System32\ntdll.dll
The other robocopy wants to check whether the file already exists and calls FindFirstFile, which also opens the directory, this time with full sharing. That open fails with a sharing violation:
Date & Time: 20.03.2012 08:30:06
Event Class: File System
Operation: CreateFile
Result: SHARING VIOLATION
Path: C:\temp\dest
TID: 8280
Duration: 0.0000099
Desired Access: Read Data/List Directory, Synchronize
Disposition: Open
Options: Directory, Synchronous IO Non-Alert
Attributes: n/a
ShareMode: Read, Write, Delete <----- Full sharing
AllocationSize: n/a
0 fltmgr.sys FltpPerformPreCallbacks + 0x2f7 0xfffff88001045027 C:\Windows\system32\drivers\fltmgr.sys
1 fltmgr.sys FltpPassThroughInternal + 0x4a 0xfffff880010478ca C:\Windows\system32\drivers\fltmgr.sys
2 fltmgr.sys FltpCreate + 0x293 0xfffff880010652a3 C:\Windows\system32\drivers\fltmgr.sys
3 ntoskrnl.exe IopParseDevice + 0x5a7 0xfffff800031cb537 C:\Windows\system32\ntoskrnl.exe
4 ntoskrnl.exe ObpLookupObjectName + 0x585 0xfffff800031c1ba4 C:\Windows\system32\ntoskrnl.exe
5 ntoskrnl.exe ObOpenObjectByName + 0x1cd 0xfffff800031c6b7d C:\Windows\system32\ntoskrnl.exe
6 ntoskrnl.exe IopCreateFile + 0x2b7 0xfffff800031cd647 C:\Windows\system32\ntoskrnl.exe
7 ntoskrnl.exe NtOpenFile + 0x58 0xfffff800031e64a8 C:\Windows\system32\ntoskrnl.exe
8 ntoskrnl.exe KiSystemServiceCopyEnd + 0x13 0xfffff80002eca813 C:\Windows\system32\ntoskrnl.exe
9 ntdll.dll NtOpenFile + 0xa 0x7718f9ea C:\Windows\System32\ntdll.dll
10 KernelBase.dll FindFirstFileExW + 0x1ee 0x7fefd3a560e C:\Windows\System32\KernelBase.dll
11 KernelBase.dll FindFirstFileW + 0x1c 0x7fefd3a58dc C:\Windows\System32\KernelBase.dll
12 Robocopy.exe CZDir::Exists + 0xf7 0xff861bb7 C:\Windows\System32\Robocopy.exe
13 Robocopy.exe RoboCopyDir + 0x58 0xff85ae74 C:\Windows\System32\Robocopy.exe
14 Robocopy.exe Walk + 0x12a 0xff85c6b6 C:\Windows\System32\Robocopy.exe
15 Robocopy.exe wmain + 0x4f4 0xff85de78 C:\Windows\System32\Robocopy.exe
16 Robocopy.exe operator+ + 0x19b 0xff867be5 C:\Windows\System32\Robocopy.exe
17 kernel32.dll BaseThreadInitThunk + 0xd 0x7703f33d C:\Windows\System32\kernel32.dll
18 ntdll.dll RtlUserThreadStart + 0x1d 0x77172ca1 C:\Windows\System32\ntdll.dll
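The conflict between the two traces can be modelled with a small sketch of the documented Windows sharing rule: an open fails if it requests access the existing handle does not share, or if it refuses to share access the existing handle already holds. The names OpenRequest and conflicts are made up for illustration, and the model is reduced to the three rights that take part in the sharing check (attribute-only rights and Synchronize are ignored). It describes the rule only, not what kernel32.dll does internally.

```python
from dataclasses import dataclass

# Access rights that take part in the sharing check. Attribute-only
# rights such as Read Attributes or Synchronize are ignored here.
READ, WRITE, DELETE = "read", "write", "delete"


@dataclass(frozen=True)
class OpenRequest:
    access: frozenset  # subset of {READ, WRITE, DELETE} being requested
    share: frozenset   # subset of {READ, WRITE, DELETE} tolerated from others


def conflicts(existing: OpenRequest, new: OpenRequest) -> bool:
    """Return True if `new` would fail with a sharing violation."""
    # The new caller asks for access the existing handle does not share.
    if not new.access <= existing.share:
        return True
    # The existing handle holds access the new caller refuses to share.
    if not existing.access <= new.share:
        return True
    return False


# First trace: the copy path opens the directory with ShareMode: None.
copy_open = OpenRequest(frozenset({READ, WRITE, DELETE}), frozenset())
# Second trace: FindFirstFile opens the directory for listing, full sharing.
find_open = OpenRequest(frozenset({READ}), frozenset({READ, WRITE, DELETE}))

print(conflicts(copy_open, find_open))  # True: the directory listing is refused
print(conflicts(find_open, find_open))  # False: two listings can coexist
```

Under this model, the full sharing requested by FindFirstFile does not help: as long as the handle opened with ShareMode None exists, every other open of the directory that asks for read, write or delete access is refused.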
I can repro this easily on Windows 7 as well. You only need to copy read-only files into the same directory from two robocopy loops running in parallel (for example in two console windows) and wait until it happens (ca. 30 s).
for /L %i in (1,1,1000) do robocopy /E /XO /COPY:DAT /A-:R C:\ReadOnlySource1 c:\temp\dest
for /L %i in (1,1,1000) do robocopy /E /XO /COPY:DAT /A-:R C:\ReadOnlySource2 c:\temp\dest
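To try more than two parallel copies, the loops above could be driven from one script. This is only a sketch: it assumes robocopy is on the PATH and that a stall shows up as an unusually slow run, and the helper names build_command, copy_loop and run_parallel are made up for illustration. The robocopy flags and example paths are the ones from the loops above.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Flags taken from the loops above, in the same order.
ROBOCOPY_FLAGS = ["/E", "/XO", "/COPY:DAT", "/A-:R"]


def build_command(source: str, dest: str) -> list:
    """Build one robocopy invocation as an argument list."""
    return ["robocopy", *ROBOCOPY_FLAGS, source, dest]


def copy_loop(source: str, dest: str, iterations: int) -> float:
    """Run robocopy repeatedly and return the slowest run in seconds."""
    slowest = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        subprocess.run(build_command(source, dest), capture_output=True)
        slowest = max(slowest, time.perf_counter() - start)
    return slowest


def run_parallel(sources: list, dest: str, iterations: int = 1000) -> dict:
    """Start one copy loop per source directory, all writing to `dest`."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = {src: pool.submit(copy_loop, src, dest, iterations)
                   for src in sources}
        return {src: fut.result() for src, fut in futures.items()}
```

Calling run_parallel([r"C:\ReadOnlySource1", r"C:\ReadOnlySource2"], r"c:\temp\dest") would mirror the two loops and return the slowest run per source directory. A value in the range of many seconds for a single small file would hint at the lock described here.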
Putting just one read-only file into each source directory gives a fast copy and many concurrent directory accesses. Is it a known limitation of Windows that a directory cannot be accessed while a file is being copied into it?
My uneducated opinion is that this is a bug and it can get quite nasty when you want concurrent access to files in a reliable manner.
It looks like we are finally getting a fix from MS for this issue. They have found and understood the problem, but it will take some time until the fix is officially released. For now it will be fixed only for Windows 7.