Errors in the January 19 snapshot of EPA.gov are problematic from transparency, data preservation, and information access standpoints and may have legal implications


By Sarah Lamdan, Director of Legal Research, and members of the Website Tracking Committee: Andrew Bergman, Maya Anjur-Dietrich, Gretchen Gehrke, and Toly Rinberg (For inquiries, please contact Toly Rinberg at edgi.websitetracking@protonmail.com)

 

May 5, 2017 11:00 AM ET— On February 15, the Environmental Protection Agency (EPA) posted a “mirror” or “snapshot” of its website as it existed on January 19, the day before Trump’s inauguration [1]. According to an email from EPA staff, reported on by TechCrunch and by InsideEPA, the agency posted the snapshot website in response to FOIA requests it had received [2]. The snapshot, however, is not comprehensive and contains errors beyond those that the EPA’s initial email and the notice currently available about the snapshot say should be expected. The snapshot website in its current form, with these errors, provides an incomplete view of former EPA materials and records, which are increasingly relevant as the content on the current EPA website is changed and restructured. The errors in the January 19 snapshot are problematic from transparency, data preservation, and information access standpoints, and the errors may even have legal implications if the snapshot website skirts records transparency and destruction laws.

EDGI’s Website Tracking Committee has released a report detailing a particular instance of loss of access to information concerning the EPA’s website “A Student’s Guide to Global Climate Change”, an educational site for kids with more than 50 web pages, which is currently accessible from neither the current EPA website nor the EPA’s January 19 snapshot. In a news release from April 28, 2017, the EPA announced that EPA.gov “is undergoing changes that reflect the agency’s new direction under President Donald Trump and Administrator Scott Pruitt” [3]. Coinciding with this news release, EPA.gov and multiple subdomains, including the “Student’s Guide” pages, began redirecting visitors to an update notice page [4]. A notice on each page now states that “If you’re looking for an archived version of this page, you can find it on the January 19 snapshot.” However, the “Student’s Guide” web pages were hosted under the “www3.epa.gov” subdomain instead of the “www.epa.gov” subdomain, and are not included in the January 19 www.epa.gov snapshot. As a result, a verified historical record of those pages is not currently available, nor is the website accessible by navigating within the “www.epa.gov” domain [5]. Although the failure to include the “Student’s Guide” website in the snapshot is likely an accidental omission or copying error, it demonstrates the importance of properly documenting and systematically coordinating agency website changes. It also shows that errors can occur when agencies do not publicly document the changes they make during a rapid overhaul of a large and complex website.
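The subdomain mismatch described above can be illustrated with a short sketch. It assumes, for illustration only, that the snapshot mirrors pages served from the “www.epa.gov” host and nothing else; the function name and the example paths are hypothetical and do not describe how the EPA actually built the snapshot.

```python
from urllib.parse import urlsplit

# Assumption for illustration: the snapshot mirrors only this host.
SNAPSHOT_SOURCE_HOST = "www.epa.gov"


def covered_by_snapshot(url: str) -> bool:
    """Return True if the URL's host is the host the snapshot is assumed to mirror."""
    host = urlsplit(url).hostname or ""
    return host.lower() == SNAPSHOT_SOURCE_HOST


# Hypothetical paths; only the hostnames matter here.
for url in ("https://www.epa.gov/example-page", "https://www3.epa.gov/example-page"):
    print(url, covered_by_snapshot(url))
```

Under that assumption, any page served from “www3.epa.gov” falls outside the mirrored set even though a visitor would think of it as part of the same website, which is consistent with the omission described in the report.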

The loss of access to information contained on the “Student’s Guide” pages is not just an information access issue; it raises legal questions, as well. Federal agencies are bound by transparency and records maintenance laws and processes, such as the Freedom of Information Act (FOIA), which mandates public access to certain types of agency records [6]. According to the FOIA, agencies must make all agency records that have been requested three or more times available for public inspection online, regardless of the record’s form or format [7]. (This section of the FOIA is sometimes called the “Beetlejuice” provision.) The EPA has not publicly stated whether the EPA.gov snapshot was posted as a result of the “Beetlejuice” provision, but the agency has listed the snapshot on the EPA’s FOIA page as a frequently requested record. Additionally, the initial email from EPA staff announcing the January 19 snapshot, mentioned above, specified that it was posted in response to “numerous FOIA requests regarding historic versions of the EPA website.” The EPA also referred to the snapshot’s placement online “as required by law” in its initial statement about the snapshot site, which was later modified to remove that clause [8].

A version of the EPA’s “Student’s Guide to Global Climate Change” splash page

According to EPA FOIA logs, the agency has received at least three requests for all climate change information, including web pages, that was publicly available before Trump’s inauguration, as well as other requests for all or part of the EPA website as it existed prior to January 20, 2017 [9]. In addition, based on the FOIA records found so far, such as this example, it is possible that the EPA received at least three FOIA requests for the entire EPA.gov domain as it existed on January 19, 2017. If, in fact, the EPA did receive three or more FOIA requests for the EPA website, then the requests would meet the “Beetlejuice” provision and legally require the EPA to make the website snapshot accessible online. If the “Beetlejuice” provision requirements apply, then the current iteration of the epa.gov snapshot fails to satisfy those FOIA requirements.

In addition to FOIA requirements for agency records access, the Federal Records Act governs the preservation, maintenance, and destruction of federal agency records and requires records handling practices that ensure public access to government information. These practices are especially important at the EPA, an agency that gathers, produces, and provides access to a large quantity of data about the environment that has been collected by programs both inside and outside the agency. The EPA’s snapshot website does not appear to be the product of careful execution of proper electronic records preservation measures as described by the National Archives and Records Administration (NARA) [10]. With all of the changes being made to various EPA web pages, the EPA must follow NARA-recommended processes and be careful to maintain copies of restructured or otherwise altered web pages, or it may run afoul of the Federal Records Act.

EDGI has filed multiple FOIA requests to learn more about the EPA snapshot website, the rationale for posting it, and the status of currently unavailable portions of the snapshot. The EPA has not yet provided a response to a FOIA request filed for “records related to the posting of the mirror EPA website.” Additionally, the EPA has not provided an easily searchable copy of the FOIA logs from January 1, 2017, to February 16, 2017, the date that the snapshot site became available, instead directing us to the FOIAonline portal to determine how many requests were made for a snapshot of EPA.gov. Despite the limited search functionality of FOIAonline, we will continue to examine the FOIA logs in search of this information.

EDGI continues to support careful, thorough, and well-planned data and information preservation efforts within federal agencies that ensure public access to information because, if environmental and other records are not properly managed, then crucial data and information may not be available for effective evidence-based policymaking.

 


References:

[1] EPA.gov Snapshot Page, https://19january2017snapshot.epa.gov/ (last visited May 4, 2017), as described in reporting by TechCrunch and several other news stories.

[2] PDF of email, as reported by InsideEPA (password required). A non-password protected copy of the email is available here.

[3] Read the Website Tracking Committee’s press release about the EPA.gov overhaul.

[4] Reported on by the Washington Post.

[5] Through a keyword search, using a web search engine like Google, it is possible to find a very similar website to the “Student’s Site.” As described in the Website Tracking Committee’s report, however, this website is not linked from the www.epa.gov domain, nor can it be accessed from the January 19 snapshot. Google’s index checking service indicates that the “//kids” domain is also not well-indexed by Google, indicating it is significantly less accessible. Furthermore, the website is effectively not accessible using search without prior knowledge of the page or its contents.

[6] The Freedom of Information Act, 5 U.S.C. § 552.

[7] The Freedom of Information Act, 5 U.S.C. § 552(a)(2)(D)(ii)(II). This provision has been called the “Beetlejuice” provision, and is described in this statement by the Center for Biological Diversity.

[8] The original EPA press release about the website updates referred to the snapshot posted “as required by law”. This phrase was removed shortly after in an updated statement.

[9] EPA FOIA logs are available at foiaonline.regulations.gov. Our partial list can be found here.

[10] See, for example, NARA Guidance on Scheduling Web Records (setting out requirements for scheduling the destruction of web content), as well as guidance on the transfer of electronic materials to NARA and NARA’s Electronic Records Management Initiative.