EDGI and PageFreezer, teaming up to archive environmental websites


Our collaboration with PageFreezer began on the eve of President Trump’s inauguration. Michael Riedijk heard about EDGI’s website archiving and monitoring efforts and reached out in an email saying: “I’m the CEO of PageFreezer.com, the largest commercial website archiving provider and I believe this is an important cause.” Anticipating rapid changes to federal websites, that same evening, EDGI’s Website Monitoring Team sent PageFreezer a list of website domains and tens of thousands of individual pages for immediate archiving.

PageFreezer is based in Vancouver, Canada, and provides “an online Software-as-a-Service solution that automates the process of archiving websites, blogs and social media.”

PageFreezer generously provided us with a pro bono subscription to their web crawling platform while we arranged dedicated hardware and hosting for the service. Together, we’ve worked to build a large archive of federal environmental websites over the first year of the Trump administration. This archive includes millions of files and pages from EPA.gov, NOAA.gov, ENERGY.gov, and NASA.gov. As the archive has grown, PageFreezer has graciously offered to cover the costs of the cloud computing resources.

We believe this archive can be of immense public value for others to study, so we’re now working with PageFreezer to move the data to our partners at the Internet Archive, where it will be publicly accessible via the Wayback Machine. PageFreezer is converting the archives to the WARC (Web ARChive) format so they can be added to the Internet Archive.

If any of this sounds interesting to you, EDGI is always excited to have more volunteers, and PageFreezer is looking for driven, authentic, collaborative people to join their ranks.

Thanks so much to the PageFreezer team! We look forward to continuing to work together to preserve and monitor federal resources for the public good.