Lately, though, we had been seeing a sudden rash of "Unable to create target directory" errors when deploying packages using PDQ Deploy. The company has an excellent troubleshooting page for how to deal with the error (spoiler alert: a reboot of the target machine usually fixes it). Unfortunately, the error kept coming back. I wanted to dig deep to understand what was happening so that we could address the root cause of the problem.
Sounds like a job for procmon...
Reproducing the Error
The first step of troubleshooting is to reliably reproduce the error.
This is often the most difficult step. And, in fact, it's a multi-step process itself:
- Reproduce the error at least once
- Reliably reproduce the error every time
- Remove as many variables as possible while still reproducing the error
That last step is critical. The more you can simplify the bug, the fewer sources of errors you have to track down.
The World's Simplest PDQ Deploy Package
I wanted to rule out the possibility of errors in the deployment package itself.
To do that, I tried to imagine the simplest command that could not possibly fail. It doesn't get much simpler than the humble ECHO command:
How PDQ Deploy Works
To follow along with the rest of this article, it helps to know how PDQ Deploy works.
The article describes 9 steps for a typical install, but many of those steps don't apply to what we're doing here with our simplified package. In fact, our scenario never gets past step two:
Step 1: The PDQ Deploy Background Service running on the target computer retrieves the command to be executed:
Step 2: A Windows Service is created on the target and is called PDQDeployRunner-n (-n will usually be "1"). This is referred to as the "Runner" service.
As we'll see below, the problem is that the PDQDeployRunner-n service never gets created in the first place.
The Debugging Process
What follows are the actual steps I took and wrote up in our FogBugz bug tracker.
I have two goals with this article:
- Demonstrate procmon's power as a low-level debugging tool
- Share my thought process to hopefully help you in your own debugging
I'm going to show each entry from my bug-tracker numbered in order of when they were created. Under a few of the headings, I'll provide a bit of context in italics.
Entry 1. Capturing Events with Procmon
I deployed the package to the same device that was running the PDQ Deploy console. This made troubleshooting much easier and eliminated a major potential source of errors: the network.
I ran the no-op deploy pkg on GBWayne18 and I was able to reproduce the "Unable to create target directory" error.
I then re-ran the package while capturing events via procmon.
I think this is the problem:
It looks like the file has been marked for deletion at next reboot.
With PDQDeployRunner-1.exe not available, it then keeps trying to create new temporary Runner-N.exe files:
It stops at -16.exe.
I'm not sure why it doesn't keep going to -17.exe. If it did, it looks like it would be able to work:
Entry 2. A Working Theory
Here's my working theory:
- Every time a PDQ Deploy package is deployed, a new PDQDeployRunner-N.exe is created
- After the package completes, the executable is marked for deletion at next bootup
• This means that the same executable cannot be reused
- The next time a PDQ Deploy package is deployed, the next available PDQDeployRunner-N.exe is created
- There is a hard-coded cutoff at 16
• (or so it seems)
• There is no "NAME NOT FOUND" entry in procmon or any entry at all trying to create a -17 folder
- Rebooting the machine deletes all of the PDQDeployRunner-N.exe files and folders
- Following the reboot, a maximum of 16 packages may be deployed before the "Unable to create target directory" error reappears
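The theory above can be expressed as a toy simulation. This is only a sketch of the behavior inferred from the procmon trace, not PDQ Deploy's actual code: the `RunnerSlots` class and the `MAX_RUNNERS` constant are hypothetical names, and the limit of 16 is the observed cutoff rather than a documented one.

```python
MAX_RUNNERS = 16  # observed cutoff in procmon; assumed, not documented


class RunnerSlots:
    """Toy model: every deployment consumes the next free runner number."""

    def __init__(self):
        # Runner numbers whose executable is marked for deletion at reboot.
        self.pending_delete = set()

    def deploy(self):
        for n in range(1, MAX_RUNNERS + 1):
            if n not in self.pending_delete:
                # The runner executes, then its .exe is marked for deletion,
                # so this number cannot be reused until the next reboot.
                self.pending_delete.add(n)
                return n
        # No attempt is made at number 17, matching the trace.
        raise RuntimeError("Unable to create target directory")

    def reboot(self):
        # A reboot removes every pending executable.
        self.pending_delete.clear()


slots = RunnerSlots()
used = [slots.deploy() for _ in range(MAX_RUNNERS)]
print(used[0], used[-1])  # 1 16

try:
    slots.deploy()  # the 17th deployment since the last reboot
except RuntimeError as err:
    print(err)  # Unable to create target directory

slots.reboot()
print(slots.deploy())  # 1
```

Under this model, the error shows up on exactly the 17th deployment after a reboot, which is the prediction the later entries set out to test.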
Entry 3. The "DELETE PENDING" status
I think this might explain the DELETE PENDING status:
Have you considered the MoveFileEx API with the flag lpNewFileName parameter? That way, the file is in use, you can let OS handle it, even if the application crash and the clsApp's terminate event never run.
Maybe we can change the status of the folders/files directly?
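For reference, here is a minimal sketch of how a program can ask Windows to delete a file at the next reboot through MoveFileEx. Whether PDQ Deploy actually does this is an assumption; the helper name `schedule_delete_on_reboot` is hypothetical. The Win32 details are documented: passing `MOVEFILE_DELAY_UNTIL_REBOOT` with a null `lpNewFileName` schedules the file for deletion at restart, and the call requires administrative rights.

```python
import ctypes
import sys

# Documented Win32 flag value for MoveFileEx.
MOVEFILE_DELAY_UNTIL_REBOOT = 0x4


def schedule_delete_on_reboot(path):
    """Ask Windows to delete `path` at the next reboot (hypothetical helper)."""
    if sys.platform != "win32":
        raise OSError("MoveFileExW is only available on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.MoveFileExW.argtypes = [
        ctypes.c_wchar_p,  # lpExistingFileName
        ctypes.c_wchar_p,  # lpNewFileName (None means "delete")
        ctypes.c_uint32,   # dwFlags
    ]
    kernel32.MoveFileExW.restype = ctypes.c_int

    # A null new name plus the delay flag schedules a deletion, not a move.
    if not kernel32.MoveFileExW(path, None, MOVEFILE_DELAY_UNTIL_REBOOT):
        raise ctypes.WinError(ctypes.get_last_error())
```

Windows records operations scheduled this way in the `PendingFileRenameOperations` registry value under `HKLM\SYSTEM\CurrentControlSet\Control\Session Manager`, so that value is one place to look when checking whether a given file really is queued for deletion.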
Entry 4. Manual Deletion Fails
I tried deleting one of the -N.exe executables. Windows seemed to let me delete it, but when I navigated away from the folder and then came back, the file had reappeared.
Entry 5. Reboot the Machine
I rebooted GBWayne18. As I suspected, the PDQDeployRunner-N.exe executables were all gone following the reboot (though each \service-N\ subfolder was still there):
Entry 6. Verify the Bug is Fixed
I re-ran the PDQ Deploy no-op pkg on GBWayne18 following the reboot.
The task ran in only 5 seconds.
The PDQDeployService.exe executable created and deleted a bunch of files inside the \service-1\ folder. As part of its cleanup, it also deleted the \service-1\ folder itself:
Entry 7. Verify the Bug is Fixed on Subsequent Runs
I re-deployed the no-op pkg a second consecutive time, and it again created and deleted a temporary PDQDeployRunner-1.exe file.
My new working theory is that PDQ Deploy intends to create and delete the temporary files each time it runs. However, sometimes it's not able to delete the temporary files. Maybe because the deployment hangs. Maybe because a second deployment starts before the first one finishes.
In those situations, it marks the existing temporary files to be deleted on reboot. The problem is that once those files have been marked to delete on reboot, they can't be deleted outside of a reboot.
This is an exceptional enough situation that the PDQ Deploy developers must have decided to hard-code a limit of 16 such temporary folders.
Once that limit is reached, it seems the only way to recover from it is to reboot the machine.
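A rough model of this revised theory, in which a runner number is lost only when cleanup fails, might look like the following. It is a sketch under the assumptions stated above: the `deploy` function, its `cleanup_ok` flag, and the `MAX_RUNNERS` constant are hypothetical names, not anything taken from PDQ Deploy.

```python
MAX_RUNNERS = 16  # assumed hard-coded limit, based on the procmon trace


def deploy(stuck, cleanup_ok=True):
    """Run one deployment using the lowest runner number not in `stuck`.

    `stuck` is the set of runner numbers whose files are waiting for a
    reboot. If cleanup fails, the number used is added to that set.
    """
    for n in range(1, MAX_RUNNERS + 1):
        if n not in stuck:
            if not cleanup_ok:
                stuck.add(n)  # files get marked for deletion at reboot
            return n
    raise RuntimeError("Unable to create target directory")


stuck = set()
print(deploy(stuck))                    # 1
print(deploy(stuck))                    # 1 (reused after a clean run)
print(deploy(stuck, cleanup_ok=False))  # 1 (but the slot is now stuck)
print(deploy(stuck))                    # 2
```

In this version, healthy deployments reuse runner 1 indefinitely, and the error appears only after 16 separate cleanup failures have accumulated since the last reboot.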
Entry 8. Final Resolution Notes
I like to end each case with a summary of what steps to take if the error recurs. It's nice to find a previous error in the bug database and have it provide you a brief set of action steps. This last entry appears at the top of my bug case so it's the first thing a future researcher will see.
If we get this error on a target machine, check the following folder location: \\TargetMachine\ADMIN$\AdminArsenal\PDQDeployRunner\
If there are 16 subfolders with embedded .exe files (\service-1\PDQDeployRunner-1.exe ... \service-16\PDQDeployRunner-16.exe), then you will need to reboot the computer.
A reboot should clear the executable files from those folders (though the folders themselves will remain, at least initially).
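The check described above could be scripted along these lines. This is a minimal sketch: the function names are hypothetical, the threshold of 16 is the observed limit rather than a documented one, and it assumes the account running the script can read the target's ADMIN$ share.

```python
from pathlib import Path

MAX_RUNNERS = 16  # observed limit; assumed, not documented


def stuck_runner_folders(runner_root):
    """Return the names of service-N subfolders that still contain an .exe."""
    root = Path(runner_root)
    stuck = []
    for folder in sorted(root.glob("service-*")):
        if folder.is_dir() and any(folder.glob("*.exe")):
            stuck.append(folder.name)
    return stuck


def needs_reboot(runner_root):
    """True when every runner slot appears to be occupied."""
    return len(stuck_runner_folders(runner_root)) >= MAX_RUNNERS
```

For example, `needs_reboot(r"\\TargetMachine\ADMIN$\AdminArsenal\PDQDeployRunner")` would report whether the target has reached the point where only a reboot helps.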