SEO Forensics

Tracking Articles

Spider Spotting

Use spider spotting to learn precisely when Google, Wayback, or any robot which keeps cache copies of your code, visited your page by adding a "cache time-stamp." I use the local system time of my server which is available for easy fetching wtih environment variables. The DATE_LOCAL var matches the clock as your server responds to the request. This is an easy way to do basic forensics for spider spotting, AMP pages (AMPHTML) or to investigate copyright infringement.

  • Spider spotting
  • IP address collection
  • User-agent, referrer etc.

There are different ways to implement a time stamp. I like to use server side includes (SSI). In this case, you'll want to echo the DATE_LOCAL environment variable and format the time configuration to your liking. Includes are interpolated (embedded) into pages using HTML comment syntax with the server side include syntax to be interpreted by the machine when SSI is enabled.

<!--#function attribute=value-->

You'll want to configure the time format to look how you want.

config timefmt="%a %b %d %y, %I:%M %p %Z"

The first part (before the comma) formats the way day, month and year will display. The second part formats the way time appears. At run time, SSI directives which take place after your time configuration statement, such as to echo the DATE_LOCAL variable, will return your custom formatted local server time. Be aware that your machine's local time is going to be wherever you host. Page code can echo system time anywhere you want with SSI.

<!--#config timefmt="%a %b %d %y, %I:%M %p %Z"-->
<!--#echo var="DATE_LOCAL"-->

While the time stamps in our footer make it easy to see our cache copies, you certainly don't have to publish anything visible to your users. Simply write includes nested inside an HTML comment, and it won't display in the page. Try it on a page or two, and then take a look at the remote source of a cache copy. You can use this for more than the time. We also stamp IP addresses of the machines making requests. It's forensics.

Apache Time Format Documentation

Currently writing draft

Archive

searchreturn llc
© 1995-2017 All Rights Reserved
Information Security
Modified Sat Jul 22 17, 10:51 PM EDT
Cache Sun Sep 24 17, 11:36 AM EDT