In my recent article about Microsoft disabling all VBA code in files downloaded from the internet, Ben Sacherich wrote in the comments:
Mike, Can you explain more about the Mark of the Web? Where does this "mark" exist on a file without affecting its integrity? If I had two identical files but one with the mark and the other without the mark, would they have different CRC or SHA values? I bet you're a geek enough to also question how this is done and maybe have already discovered the answer. A follow-up question would be, if I download a zipped Access file from the web, after unzipping it does it contain the MOTW? If no, then another work-around is to transfer files wrapped in a compression format. If yes, if I remove the MOTW from the zip file before unzipping, do I get the same result?
Great questions, Ben! Let's take these one at a time.
Can you explain more about the Mark of the Web?
Where does this "mark" exist on a file without affecting its integrity?
The "mark" does not exist on the file itself.
Rather, it exists as part of the NTFS file metadata. NTFS is the standard Windows file system. It has support for a feature known as "alternate data streams." The "mark" is stored in an alternate data stream of the downloaded file.
Think of data streams as nothing more than a contiguous series of ones and zeroes. The primary data stream is where the main file contents are. The alternate data stream is a different set of ones and zeroes.
Most alternate data streams are relatively small. They hold metadata about the file. But alternate data streams–as opposed to extended file attributes–are variable length. In fact, nothing in the file system prevents the alternate data stream from being many times larger than the primary data stream.
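Here is a minimal Python sketch of that idea. It assumes a Windows machine with an NTFS volume, where the `filename:streamname` path syntax addresses a named stream. The file name `demo.txt` and the stream name `notes` are made up for illustration. On other file systems the colon path is just a separate, ordinary file, so the script still runs but no longer demonstrates a real alternate data stream.

```python
import os
import tempfile


def write_stream(path, stream_name, data):
    """Write bytes to a named stream of `path` (NTFS 'file:stream' syntax)."""
    with open(f"{path}:{stream_name}", "wb") as f:
        f.write(data)


def read_primary(path):
    """Read the primary (unnamed) data stream, i.e. the normal file contents."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"primary contents")

    # The alternate stream is far larger than the primary stream.
    write_stream(path, "notes", b"n" * 10_000)

    primary_after = read_primary(path)
    size_after = os.path.getsize(path)  # reports the primary stream only

print(primary_after, size_after)
```

The primary contents and the reported file size stay the same even though the named stream holds many times more data.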
Let's run a quick experiment to learn more.
Exposing the Mark of the Web
For this experiment, I'll be using a compiled version of the HelloWorld twinBASIC sample application.
- Download the HelloWorld executable: HelloWorld.exe
- Open a cmd window and cd to your Downloads folder
- Run dir /? to display help for the directory command
- Check out the /R switch:
/R Display alternate streams of the file.
- Display the alternate data stream for the downloaded file:
dir HelloWorld.exe /R
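- Open the alternate data stream in Notepad (the Mark of the Web is stored in a stream named Zone.Identifier):
notepad HelloWorld.exe:Zone.Identifier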
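The stream that holds the mark is named Zone.Identifier, and its contents are a small INI-style text file. The sketch below parses that text in Python. The sample text and its example.com URL are illustrative rather than copied from a real download, and `read_motw` assumes an NTFS volume where the `file:stream` path syntax works.

```python
import configparser

# Security zone numbers used by Windows URL security zones.
ZONE_NAMES = {
    0: "Local machine",
    1: "Local intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Restricted sites",
}


def parse_zone_identifier(text):
    """Return the [ZoneTransfer] section of a Zone.Identifier stream as a dict."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key names such as ZoneId as written
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


def read_motw(path):
    """Read the mark from `path`, or return None if there is no such stream."""
    try:
        with open(f"{path}:Zone.Identifier", "r", errors="replace") as f:
            return parse_zone_identifier(f.read())
    except OSError:
        return None


# Illustrative stream contents; the URL is a placeholder.
sample = "[ZoneTransfer]\nZoneId=3\nHostUrl=https://example.com/HelloWorld.exe\n"
info = parse_zone_identifier(sample)
print(ZONE_NAMES[int(info["ZoneId"])])  # Internet
```

A ZoneId of 3 is what marks a file as having come from the internet.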
If I had two identical files but one with the mark and the other without the mark, would they have different CRC or SHA values?
If you are not familiar with the terms "CRC" or "SHA", those are different algorithms for generating file checksums.
Checksums help verify file integrity.
The idea is that you take a file of any size, run it through a mathematical algorithm, and it produces a fixed-length set of data that is (relatively) unique to the file you started with.
Checksum algorithms:
- Are deterministic: The same input will always produce the same output.
- Seek to minimize collisions: A collision is when two different inputs create the same output.
- Cannot avoid collisions entirely: By the pigeonhole principle, an unlimited number of possible inputs must share a limited number of fixed-length outputs.
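These properties are easy to see in a few lines of Python using the standard `hashlib` module. The `toy_checksum` function is a deliberately weak 8-bit checksum invented for this sketch. With only 256 possible outputs, any 257 distinct inputs must contain a collision.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


small = b"Hello, World!"
large = b"x" * 1_000_000

# Deterministic: the same input always gives the same output.
assert sha256_hex(small) == sha256_hex(small)

# Fixed length: 13 bytes or a million bytes, the digest is 64 hex characters.
print(len(sha256_hex(small)), len(sha256_hex(large)))  # 64 64


def toy_checksum(data):
    """Hypothetical 8-bit checksum: only 256 possible outputs."""
    return sum(data) % 256


# Pigeonhole principle: 257 distinct inputs cannot all get distinct outputs.
seen = {}
collision = None
for i in range(257):
    data = str(i).encode()
    value = toy_checksum(data)
    if value in seen:
        collision = (seen[value], data)
        break
    seen[value] = data

print(collision)
```

Real algorithms such as SHA-256 have the same limitation in principle, but with 2**256 possible outputs a collision is not something you will stumble across by accident.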
How is this helpful?
Let's say you have a massively popular image viewer utility. You fear that its popularity will make it a target for attackers. An attacker could perform a man-in-the-middle attack–intercepting the download, injecting malware, then passing it along to the unsuspecting user.
As the software provider, you could post the checksums along with the algorithms you used to generate them. Then, after downloading the files, your users could run the same algorithm on the downloaded files to verify that they have not been changed.
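The user's side of that check might look like the following sketch. It assumes the provider published a SHA-256 value as a hex string. The function names are mine and not part of any particular tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's SHA-256 equals the value the provider posted."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If `matches_published` returns False, the file you received is not the file the provider hashed, and you should not run it.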
hashcheck - A Handy Utility
choco install hashcheck
Here are the checksums for the HelloWorld.exe that you downloaded earlier:
- CRC-32: 72239332
- SHA-1: b5593fdbe226c32dd34668d48e5f9d2442066765
- SHA-256: 9976553da9ccd8468d216bb18d717196d0d92c38214fd49718124d471b903cb6
- SHA-512: 3d8d9430ea9dc50bfa7979966b61e4236f064c837b77d355670ff93a6b9f7c19a4cff5d5ff8e47902f9a7a8e8fec0fb7d5ba8943cd75f9c31c4d58521c8e7050
To verify that the Mark-of-the-Web does not impact the checksum values, you can check them before and after removing the mark.
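If you would rather script that comparison, here is a rough Python equivalent. It assumes the mark lives in a stream named Zone.Identifier and that deleting the `file:Zone.Identifier` path on NTFS removes only that stream, which is my understanding of how Windows treats stream paths. Treat it as a sketch rather than a replacement for the Unblock checkbox in the file's Properties dialog.

```python
import hashlib
import os
import zlib


def checksums(path):
    """Compute the same four checksums listed above from the primary stream."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "CRC-32": f"{zlib.crc32(data):08x}",
        "SHA-1": hashlib.sha1(data).hexdigest(),
        "SHA-256": hashlib.sha256(data).hexdigest(),
        "SHA-512": hashlib.sha512(data).hexdigest(),
    }


def remove_mark(path):
    """Delete the Zone.Identifier stream; return False if there was none."""
    try:
        # Assumption: on NTFS this removes the named stream, not the file.
        os.remove(f"{path}:Zone.Identifier")
        return True
    except OSError:
        return False


def unchanged_by_unblocking(path):
    """Compare checksums taken before and after removing the mark."""
    before = checksums(path)
    remove_mark(path)
    return before == checksums(path)
```

Calling `unchanged_by_unblocking("HelloWorld.exe")` from your Downloads folder should return True, because every checksum is calculated from the primary data stream only.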
If I download a zipped Access file from the web, after unzipping it does it contain the MOTW?
- YES, for the default unzip utility that ships with Windows
- NO, for 7-Zip (if you explicitly extract the files before use)
I just so happen to have such a file available for download from my DoEvents demonstration:
The downloaded zip file bears the MOTW:
The Built-in Windows Unzip Utility Preserves the MOTW
Now, let's unzip the contents without unblocking the zip file first:
The MOTW remains:
7-Zip Leaves the MOTW Behind
Let's try unzipping the blocked file using the popular file compression utility, 7-Zip:
The MOTW is missing on the file that 7-Zip extracted!
It turns out that the Windows unzip utility has special code that transfers this alternate data stream information from the containing .zip file to the extracted contents. Since 7-Zip does not have that same special code, the files get taken out the exact same way they got put in.
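To make the difference concrete, here is a hedged Python sketch of what "special code" of this kind could look like. Python's own `zipfile` module extracts only the primary contents, so the extra step of copying the archive's Zone.Identifier stream onto each extracted file has to be added by hand. This illustrates the idea and is not how Windows actually implements it. It also assumes NTFS `file:stream` paths.

```python
import zipfile

STREAM = ":Zone.Identifier"


def read_mark(path):
    """Return the raw Zone.Identifier bytes for `path`, or None if unmarked."""
    try:
        with open(f"{path}{STREAM}", "rb") as f:
            return f.read()
    except OSError:
        return None


def extract_preserving_mark(zip_path, dest_dir):
    """Extract every file and copy the archive's mark onto each one."""
    mark = read_mark(zip_path)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            target = archive.extract(info, dest_dir)
            if info.is_dir():
                continue
            if mark is not None:
                # Without this step the extracted file would be unmarked.
                with open(f"{target}{STREAM}", "wb") as f:
                    f.write(mark)
            extracted.append(target)
    return extracted
```

Leave out the marked step and you get the behavior described for 7-Zip: the files come out exactly as they went in, with no mark attached.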
Eric Lawrence covers this situation–along with testing several other popular file compression utilities–in his excellent article on the topic, Downloads and the Mark-of-the-Web.
If no, then another work-around is to transfer files wrapped in a compression format.
As noted above, this work-around would also require:
- that your users have the 7-Zip utility installed
- that they use the right-click menu to explicitly extract files before running them
Ordinary users are unlikely to meet either of these requirements.
If yes, if I remove the MOTW from the zip file before unzipping, do I get the same result?
Removing the MOTW from the zip file prevents it from being transferred to the files as they get extracted and "rehydrated."
If you do trust the contents of a zip file, it's much better to unblock the zip file before extraction. Otherwise, the MOTW gets transferred to every single file inside the zip folder, and they need to get unblocked individually.