This directory contains utility classes and scripts that can be used, together
with other tools such as Hadoop, to perform an offline PageRank calculation over
the link graph revealed by a crawl.
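
For orientation, the following is a minimal in-memory sketch of the kind of
calculation involved. It is not the code in this directory and does not use
Hadoop. The class name PageRankSketch, the adjacency-array input format, the
damping factor of 0.85, and the even redistribution of rank from pages without
outlinks are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; not one of the classes in this directory.
public class PageRankSketch {

    // outLinks[i] lists the indices of the pages that page i links to.
    // Returns one rank per page; the ranks sum to 1.
    static double[] pageRank(int[][] outLinks, double damping, int iterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int iter = 0; iter < iterations; iter++) {
            double[] next = new double[n];
            double danglingMass = 0.0;

            for (int i = 0; i < n; i++) {
                if (outLinks[i].length == 0) {
                    // Assumption: pages with no outlinks spread their rank evenly.
                    danglingMass += rank[i];
                    continue;
                }
                // Each page splits its current rank equally among its outlinks.
                double share = rank[i] / outLinks[i].length;
                for (int target : outLinks[i]) {
                    next[target] += share;
                }
            }

            // Apply the damping factor and add the redistributed dangling rank.
            for (int i = 0; i < n; i++) {
                next[i] = (1.0 - damping) / n
                        + damping * (next[i] + danglingMass / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Tiny made-up link graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 has no outlinks.
        int[][] outLinks = { {1, 2}, {2}, {0}, {} };
        double[] rank = pageRank(outLinks, 0.85, 50);
        System.out.println(Arrays.toString(rank));
    }
}
```

A crawl's link graph is normally far too large for an in-memory loop like this,
which is why the utilities here are meant to be combined with tools such as
Hadoop.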

For more information, see:
http://webteam.archive.org/confluence/display/Heritrix/Offline+PageRank+Analysis+Notes

Note that the Java files in this directory are modified versions of Hadoop
example code and are therefore published under the Apache License.