SEO Forensics

Tracking Stuff

Spider Spotting

Spider spotting is a way to learn precisely when Google, the Wayback Machine, or any other robot that keeps cache copies of your code visited your page. I use the local system time, which is readily available through the DATE_LOCAL environment variable. This makes basic forensics relatively easy, whether you are investigating copyright infringement, watching key pages such as AMP (AMPHTML) versions, or checking anything else where you want to know when a page was requested.

  • Spider spotting
  • IP address collection
  • User-agent, referrer etc.

There are several ways to implement a time stamp from environment variables. I'm using server side includes (SSI) here, but you can use PHP just as easily. The idea is to echo the DATE_LOCAL variable after configuring the time format to your liking. Includes are interpolated (embedded) into pages using HTML comment syntax.

<!--#function attribute=value -->

Set the display format with the timefmt attribute of the config directive.

config timefmt="%a %b %d %y, %I:%M %p %Z"

The first part (before the comma) formats the weekday, month, day and two-digit year. The second part formats the 12-hour time, AM or PM, and the time zone. At run time, any SSI directive that follows your time configuration statement, such as an echo of the DATE_LOCAL variable, returns the local server time in your custom format. Be aware that your machine's local time is the time wherever you host. The cache stamps here are in Eastern Standard Time. Google uses GMT, so you may need to account for the difference, depending on what information you want.

<!--#config timefmt="%a %b %d %y, %I:%M %p %Z" -->
<!--#echo var="DATE_LOCAL" -->
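
Because Google's cache copies report GMT, one option is to stamp both clocks at once. The sketch below assumes Apache mod_include, which also provides a DATE_GMT variable that follows the same timefmt setting. The "Local" and "GMT" labels are only illustrative and are not part of the original footer.

```html
<!--#config timefmt="%a %b %d %y, %I:%M %p %Z" -->
Local <!--#echo var="DATE_LOCAL" -->
GMT <!--#echo var="DATE_GMT" -->
```

With both values on the page, a cache copy can be compared against a GMT cache date without converting from the host's time zone by hand.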

While the cache stamps in our footer make it easy to see our version, you certainly don't have to publish anything visible to your users. Simply nest the includes inside an HTML comment in your source and nothing will display on the page. Try it on a page or two, and then take a look at the remote source of a cache copy. You can use this for more than time. We also stamp the IP addresses of cache machines into our pages. It's about forensics.
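
As a rough sketch of a hidden stamp, the fragment below nests several echo directives inside an ordinary HTML comment. It assumes Apache mod_include with the standard CGI variables REMOTE_ADDR, HTTP_USER_AGENT and HTTP_REFERER available to echo. The field labels and layout are hypothetical and are not the stamp format used on this site.

```html
<!-- cache stamp
     time:     <!--#echo var="DATE_LOCAL" -->
     ip:       <!--#echo var="REMOTE_ADDR" -->
     agent:    <!--#echo var="HTTP_USER_AGENT" -->
     referrer: <!--#echo var="HTTP_REFERER" -->
-->
```

The server expands each directive before the page is sent, so whoever requested the page receives a plain comment holding the request time, their own IP address and their user-agent string. A cache copy therefore carries a record of the robot that fetched it. Leaving echo on its default entity encoding is the safer choice here, since it keeps a stray ">" in a user-agent string from closing the comment early.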

Apache Time Format Documentation

 searchreturn llc
© 1995-2019 All Rights Reserved
Information Security

Modified Thu Jan 24 19, 11:48 AM EST
Cache Sat Feb 23 19, 11:26 PM EST