Cache Forensics

To find out exactly when Google, the Wayback Machine, or any other robot that keeps cached copies of your pages paid a visit, add a "cache stamp" based on system time. The server environment variable DATE_LOCAL holds the server's local clock time at the moment it responded to the request. This is an easy way to do basic forensics, such as spider spotting or investigating copyright infringement.

  • Spider spotting
  • IP address collection
  • User-agent, referrer, etc.

There are different ways to implement a time stamp. I like to use server side includes (SSI). In this case, you echo the DATE_LOCAL environment variable and configure the time format to your liking. Includes are embedded in pages as directives written in HTML comment syntax, which the server interprets when SSI is enabled.

<!--#function attribute="value" -->

Use the timefmt attribute of the config directive to set how the time looks.

config timefmt="%a %b %d %y, %I:%M %p %Z"

The first part (before the comma) formats how the day, month and year display: %a and %b are the abbreviated weekday and month names, %d is the day of the month, and %y is the two-digit year. The second part formats the time: %I:%M %p gives a 12-hour clock with AM or PM, and %Z adds the time zone. At run time, SSI directives that come after your config statement, such as an echo of the DATE_LOCAL variable, return the local server time in your custom format. Be aware that "local" means the time zone of the machine that hosts your site, which may not be your own. You can echo system time anywhere in a page with SSI.

<!--#config timefmt="%a %b %d %y, %I:%M %p %Z" -->
<!--#echo var="DATE_LOCAL" -->
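
If the host's time zone makes a local stamp awkward to compare with other logs, one variation is to echo GMT instead. This is only a sketch, assuming Apache-style SSI (mod_include), where a DATE_GMT variable is available alongside DATE_LOCAL. The four-digit-year, 24-hour format is just one possible choice, not the format we use in our footer.

<!--#config timefmt="%Y-%m-%d %H:%M:%S %Z" -->
<!--#echo var="DATE_GMT" -->

As with the local version, the config directive has to come before the echo it is meant to format.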

While the time stamps in our footer make it easy to spot our cache copies, you certainly don't have to publish anything visible to your users. Simply nest the includes inside an HTML comment, and the output won't display in the page. Try it on a page or two, and then take a look at the source of a cached copy. You can use this for more than the time. We also stamp the IP addresses of the machines making requests. It's forensics.
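
A hidden stamp along those lines might look something like the following. It is a sketch rather than our exact footer code, and it assumes Apache-style SSI (mod_include), where the standard CGI variables REMOTE_ADDR, HTTP_USER_AGENT and HTTP_REFERER can be echoed. The "cache stamp" label and the field layout are made up for illustration.

<!--#config timefmt="%a %b %d %y, %I:%M %p %Z" -->
<!-- cache stamp: <!--#echo var="DATE_LOCAL" --> | ip: <!--#echo var="REMOTE_ADDR" --> | agent: <!--#echo var="HTTP_USER_AGENT" --> | referrer: <!--#echo var="HTTP_REFERER" --> -->

The server fills in each echo before the page is sent, so a visitor or a cache receives an ordinary HTML comment holding the values, visible only in the page source.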