2. The MHTML File Format
Hunchly captures all web pages in the MHTML format. MHTML files are structured much like emails: they contain headers describing the page and the timestamp of when Chrome captured it, followed by all of the text, CSS styles and images contained on the page, all in a single file. This is superior to PDF or screenshots because all links are maintained, the layout is generally more accurate, and all metadata is preserved, including the metadata in the captured images.
It is worth noting that the MHTML capture is handled entirely by Google Chrome. Hunchly simply instructs Chrome to do the capture and then pulls out the result for storage and analysis.
The MHTML file begins with a header containing high-level metadata that may be useful when doing disclosure or validation:
From: <Saved by Blink>
Snapshot-Content-Location: https://www.facebook.com
Subject: Facebook - Log In or Sign Up
Date: Tue, 28 Aug 2018 15:47:07 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
From: this simply indicates that the page was captured by Chrome's Blink engine.
Snapshot-Content-Location: the URL of the page that was captured.
Subject: the HTML title of the page.
Date: the timestamp in UTC of when Chrome saved the page and sent it to Hunchly for storage.
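Because the header follows the same layout as an email header, it can be read with Python's standard email package. The following is a minimal sketch, not part of Hunchly: the function name read_mhtml_metadata and the dictionary keys are illustrative assumptions, and the sample text is the header shown above.

```python
from datetime import timezone
from email.parser import HeaderParser
from email.utils import parsedate_to_datetime

# The example header from this section, one field per line.
SAMPLE_HEADER = (
    "From: <Saved by Blink>\n"
    "Snapshot-Content-Location: https://www.facebook.com\n"
    "Subject: Facebook - Log In or Sign Up\n"
    "Date: Tue, 28 Aug 2018 15:47:07 -0000\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/related;\n"
    "\n"
)


def read_mhtml_metadata(raw_text):
    """Return the high-level header fields of an MHTML document (hypothetical helper)."""
    # HeaderParser stops after the header block and does not parse the body.
    msg = HeaderParser().parsestr(raw_text)
    captured_at = parsedate_to_datetime(msg["Date"])
    # A "-0000" offset parses as a naive datetime; the Date field is in UTC,
    # so the UTC zone is attached explicitly.
    if captured_at.tzinfo is None:
        captured_at = captured_at.replace(tzinfo=timezone.utc)
    return {
        "saved_by": msg["From"],
        "url": msg["Snapshot-Content-Location"],
        "title": msg["Subject"],
        "captured_at": captured_at,
    }


metadata = read_mhtml_metadata(SAMPLE_HEADER)
print(metadata["url"], metadata["captured_at"].isoformat())
```

For a real capture, the same function could be given the decoded text of the saved .mhtml file; only the leading header block is read.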
Hunchly's Storage and Identification of Pages
Hunchly stores the pages from your case using a Page ID that is unique to each page. This ID is global, meaning it is shared across all of your cases, so CASE 1 could contain Page ID 1 while CASE 2 contains Page ID 2. When Hunchly saves MHTML content, the file is named PAGEID.mhtml.
Potential Evidence Challenges
When submitting MHTML pages for disclosure or court purposes, you may find that the timestamp in the MHTML file does not match the timestamp that Hunchly lists for the capture. There are two explanations for this that you can communicate:
1. Hunchly timestamps the data in your local time zone. To match the MHTML file against the timestamp in a Hunchly export, convert the MHTML file's UTC time to your local time zone. Alternatively, set your local clock to UTC/GMT and the timestamps should then be the same.
2. There is a slight delay between when Chrome captures the page and when it forwards it to Hunchly for processing. This delay depends on how large the page is, how many pages Hunchly has queued up for storage and the overall performance of your computer. As a result, the timestamp in the MHTML file may differ from the one Hunchly shows. This is easily explained, because other parts of the evidence, such as the SHA-256 hash and GPG signature, will still match the content and ensure that the evidence stands up to scrutiny.
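The two explanations above can be illustrated with a short Python sketch. It is not a Hunchly feature: the function names, the UTC-4 offset and the Hunchly-side timestamp are all hypothetical values chosen for the example. The first function converts the MHTML Date value to a local time zone, and the second measures the gap between Chrome's capture time and the time recorded by Hunchly.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def mhtml_date_to_local(date_header, local_tz):
    """Convert the MHTML Date header (UTC) to the given local time zone."""
    captured = parsedate_to_datetime(date_header)
    # "-0000" parses as a naive datetime; the header is in UTC.
    if captured.tzinfo is None:
        captured = captured.replace(tzinfo=timezone.utc)
    return captured.astimezone(local_tz)


def capture_delay_seconds(date_header, hunchly_local_time, local_tz):
    """Seconds between Chrome's capture and the local time Hunchly recorded."""
    recorded = hunchly_local_time.replace(tzinfo=local_tz)
    return (recorded - mhtml_date_to_local(date_header, local_tz)).total_seconds()


# Assumption: the investigator's machine is at UTC-4 (hypothetical offset).
LOCAL_TZ = timezone(timedelta(hours=-4))
MHTML_DATE = "Tue, 28 Aug 2018 15:47:07 -0000"

# Assumption: a made-up Hunchly export timestamp, two seconds after capture.
hunchly_time = datetime(2018, 8, 28, 11, 47, 9)

print(mhtml_date_to_local(MHTML_DATE, LOCAL_TZ).isoformat())
print(capture_delay_seconds(MHTML_DATE, hunchly_time, LOCAL_TZ))
```

A fixed offset keeps the sketch self-contained. For a named zone with daylight-saving rules, zoneinfo.ZoneInfo from the standard library (Python 3.9 and later) could be passed as local_tz instead.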
Non-Continuous Page IDs
When submitting evidence to the court or a third party, you may be asked why there are non-continuous Page IDs in the submitted evidence. Some explanations you can provide:
1. You work multiple cases that are separate, but the Page ID is universal across the entire system. The number increments regardless of which case you are working on, but the cases themselves are logically separated on the investigator's hard drive and in the Hunchly database.
2. You have deleted a page, which creates a "gap" in the Page IDs submitted. Hunchly keeps a deletion log that you can review to account for any deletions in a case, should you be required to do so.
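Before submitting evidence, it can help to know in advance which Page IDs are missing from a set of exported files. The sketch below is only an illustration under stated assumptions: it assumes Page IDs are whole numbers and that files follow the PAGEID.mhtml naming described earlier, and both function names are hypothetical. It does not determine why a gap exists; that still comes from your other cases or the deletion log.

```python
def page_ids_from_filenames(filenames):
    """Extract numeric Page IDs from names of the form PAGEID.mhtml."""
    ids = []
    for name in filenames:
        stem, _, extension = name.rpartition(".")
        # Skip anything that is not a numerically named .mhtml file.
        if extension.lower() == "mhtml" and stem.isdecimal():
            ids.append(int(stem))
    return ids


def find_page_id_gaps(page_ids):
    """Return (first_missing, last_missing) ranges between consecutive IDs."""
    ordered = sorted(set(page_ids))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps


# Hypothetical file listing for one submitted case.
files = ["1.mhtml", "2.mhtml", "5.mhtml", "notes.txt", "9.mhtml"]
print(find_page_id_gaps(page_ids_from_filenames(files)))
```

Each reported range can then be matched against the two explanations above.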
If you have additional questions, require clarification or have experienced evidentiary challenges to Hunchly data, please email us: firstname.lastname@example.org