Data Feeds

To help guide and stimulate discussion on sharing machine-readable Recovery data, we developed some simple demonstration implementations of Recovery Act web services. Recovery Act data is complex and can be used for many applications, ranging from public oversight and scrutiny of Recovery activities to support of job-placement services. In addition, Recovery Act web services may play important roles in supporting processes within the Federal government.

Because of the diversity of potential needs and applications, as well as the inherent complexity of Recovery Act data, our small team did not develop a comprehensive design solution. Instead, these demonstrations are meant to be illustrative examples that explore design patterns for Recovery data sharing web services. Besides exploring data sharing needs more fully, future work should also explore ways to enable two-way communication of information related to the Recovery. For example, users of Recovery data may discover errors or other problems that warrant further investigation. Recovery.gov should have services that accept user annotations, error flags, and comments about Recovery reports and collections of reports. We recommend that Recovery.gov developers consider the Atom Publishing Protocol as a basis for building services that accept such data from the public.
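As a rough illustration of the kind of two-way service suggested above, the following Python sketch builds an Atom entry that flags a possible error in a report and prepares (but does not send) an Atom Publishing Protocol POST request. The collection URL, the report URL, and the use of a rel="related" link to point at the flagged report are all assumptions made for this example; no such Recovery.gov service is known to exist.

```python
import uuid
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def build_flag_entry(report_url, author_name, comment, updated, entry_id=None):
    """Build an Atom entry (as UTF-8 bytes) that flags a problem in one report."""
    entry = ET.Element(atom("entry"))
    # Atom entries need an id, a title, and an updated timestamp.
    ET.SubElement(entry, atom("id")).text = entry_id or f"urn:uuid:{uuid.uuid4()}"
    ET.SubElement(entry, atom("title")).text = "Possible error in report"
    ET.SubElement(entry, atom("updated")).text = updated
    author = ET.SubElement(entry, atom("author"))
    ET.SubElement(author, atom("name")).text = author_name
    # Assumption: a rel="related" link identifies the report being annotated.
    ET.SubElement(entry, atom("link"), {"rel": "related", "href": report_url})
    ET.SubElement(entry, atom("content"), {"type": "text"}).text = comment
    return ET.tostring(entry, encoding="utf-8")


def build_post_request(collection_url, entry_bytes):
    """Prepare the POST that would add the entry to an AtomPub collection."""
    return urllib.request.Request(
        collection_url,
        data=entry_bytes,
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    body = build_flag_entry(
        "https://example.org/reports/123",
        "A. Reader",
        "The job creation estimate appears to be counted twice.",
        "2009-06-01T12:00:00Z",
    )
    request = build_post_request("https://example.org/annotations", body)
    print(request.get_method(), request.full_url)
    print(body.decode("utf-8"))
```

A real service would also need to decide how to authenticate submitters and how to moderate what they post; the sketch leaves both out.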

Demonstration Discussion

The demonstration services provided at this site illustrate simple methods for sharing machine-readable Recovery data via Atom feeds. The Technology Overview section of this site describes why we think Atom feeds should be used to disseminate ARRA data. Additional background and technical information about using Atom for Recovery data dissemination can be found in the technical report that accompanies this demonstration, as well as on this technical discussion page.

  1. We started by generating a simulated set of XML Recovery reports that closely matched OMB's requirements. The simulated (fake) XML data we generated can be found here (warning: this is a large file of nearly 200 reports).

  2. We then ran an XSL transformation on these simulated XML Recovery reports. The transformation produced an Atom feed in which each Recovery report is represented as an individual entry (the feed is described below). The XSL document can be found here. Additional technical discussion of our Atom implementation can be found here.

  3. The XSL file described above references a second XSL document used to transform the simulated XML Recovery data into HTML for rendering in browsers. This HTML representation of Recovery data is placed in the content element of each Atom feed entry. Similar transforms can be used on Recovery.gov to generate web pages of Recovery reporting data from XML data sources. In addition, other transformations can produce JSON representations of Recovery reporting data, which are convenient for many user interface and visualization applications.

  4. ARRA reports describe a variety of metrics, such as job creation estimates and financial expenditures, that can be graphed and visualized. They also contain some geospatial data that can be mapped in a variety of ways. While we have not provided examples that graph our simulated data, a map demonstration is provided here.

  5. Our demonstration implementation suggests ways to make Recovery reporting data more meaningful for the public by providing additional services relating to linked data. In our example, we provide links to resources that further describe different entities referenced in ARRA reporting data. While we only provide examples relating to DUNS IDs and agency codes, many other entities referenced in Recovery reports should also have linked data. The Linked Data Services section below explores this issue further.
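The pipeline in steps 1 through 3 can be sketched in a few lines of Python. This is not the XSL used in the demonstration: it substitutes Python's standard ElementTree library for the XSL transformations, and it runs on a tiny made-up sample. Apart from TreasuryAccountSymbol, every element name and value in the sample (Reports, Report, ReportId, AwardDescription, UpdatedDate) is a hypothetical stand-in rather than the actual reporting schema.

```python
import html
import json
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace

# Made-up sample data. Only TreasuryAccountSymbol is an element name taken
# from the text; all other element names and values are hypothetical.
SAMPLE_REPORTS = """
<Reports>
  <Report>
    <ReportId>r-001</ReportId>
    <AwardDescription>Bridge repair (simulated)</AwardDescription>
    <TreasuryAccountSymbol>00-0000</TreasuryAccountSymbol>
    <UpdatedDate>2009-06-01T00:00:00Z</UpdatedDate>
  </Report>
  <Report>
    <ReportId>r-002</ReportId>
    <AwardDescription>School retrofit (simulated)</AwardDescription>
    <TreasuryAccountSymbol>00-0001</TreasuryAccountSymbol>
    <UpdatedDate>2009-06-02T00:00:00Z</UpdatedDate>
  </Report>
</Reports>
"""


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def report_to_html(report):
    """Render one report as a simple HTML table (the role of the second XSL)."""
    rows = "".join(
        f"<tr><th>{html.escape(child.tag)}</th>"
        f"<td>{html.escape(child.text or '')}</td></tr>"
        for child in report
    )
    return f"<table>{rows}</table>"


def reports_to_atom(reports_xml, feed_id, feed_title, feed_updated):
    """Build an Atom feed with one entry per report (the role of the first XSL)."""
    root = ET.fromstring(reports_xml.strip())
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = feed_id
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("updated")).text = feed_updated
    author = ET.SubElement(feed, atom("author"))
    ET.SubElement(author, atom("name")).text = "Demonstration"
    for report in root.findall("Report"):
        entry = ET.SubElement(feed, atom("entry"))
        report_id = report.findtext("ReportId", default="")
        ET.SubElement(entry, atom("id")).text = f"{feed_id}:{report_id}"
        ET.SubElement(entry, atom("title")).text = report.findtext(
            "AwardDescription", default="Untitled report"
        )
        ET.SubElement(entry, atom("updated")).text = report.findtext(
            "UpdatedDate", default=feed_updated
        )
        # type="html" means the content is escaped HTML markup.
        content = ET.SubElement(entry, atom("content"), {"type": "html"})
        content.text = report_to_html(report)
    return ET.tostring(feed, encoding="unicode")


def reports_to_json(reports_xml):
    """Flatten each report into a JSON object keyed by element name."""
    root = ET.fromstring(reports_xml.strip())
    return json.dumps(
        [{child.tag: child.text for child in report} for report in root.findall("Report")]
    )


if __name__ == "__main__":
    print(reports_to_atom(SAMPLE_REPORTS, "urn:example:demo-feed",
                          "Simulated Recovery reports", "2009-06-02T00:00:00Z"))
    print(reports_to_json(SAMPLE_REPORTS))
```

Keeping the HTML rendering in its own function mirrors the two-stylesheet design described in step 3: the same report data can feed the Atom content element, a web page, or a JSON view without changing the source XML.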

Linked Data Services

Recovery reporting data contains references to several kinds of entities identified in various coding systems. These include DUNS IDs for organizations, codes identifying federal agencies, codes for Treasury accounts and programs, and codes such as ZIP codes and Congressional districts that identify geographic regions. To help make Recovery data meaningful to the public, Recovery.gov should offer services that explain the meaning of these codes. Recovery.gov should also offer services to retrieve data related to these coded entities. For example, the element TreasuryAccountSymbol in the XML of Recovery reports identifies the Treasury account used to fund a grant or contract. There should be a link to a Web resource offering data about each specific Treasury account, ideally describing important details such as when the account was created, for what purpose, and how much money it contains. The information in such resources should be available in different representations: HTML for human consumption and XML for machine use. An Atom feed can also be published for each Treasury account to make it easier to retrieve updated reports and other data relating to that account.
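One way to picture such a service is as a predictable mapping from each coded value to a set of resource URLs, one per representation. The Python sketch below is only an illustration under stated assumptions: the base URL, the path segments, the file-extension convention, and the element names DUNS and AgencyCode are all hypothetical. Only TreasuryAccountSymbol comes from the report XML described above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical base URL for linked data resources.
BASE_URL = "https://example.org/linked"

# Maps a report element name to a hypothetical path segment.
# Only TreasuryAccountSymbol is a known element name; the others are assumed.
ENTITY_PATHS = {
    "TreasuryAccountSymbol": "treasury-accounts",
    "DUNS": "organizations",
    "AgencyCode": "agencies",
}


def linked_resources(code_type, code):
    """Return URLs for the HTML, XML, and Atom representations of one entity."""
    stem = f"{BASE_URL}/{ENTITY_PATHS[code_type]}/{quote(code, safe='')}"
    return {
        "html": f"{stem}.html",  # for human readers
        "xml": f"{stem}.xml",    # for machine use
        "atom": f"{stem}.atom",  # feed of reports and updates for this entity
    }


def links_for_report(report_xml):
    """Find every coded element in a report and attach its linked resources."""
    report = ET.fromstring(report_xml)
    links = {}
    for element in report.iter():
        if element.tag in ENTITY_PATHS and element.text:
            links[element.tag] = linked_resources(element.tag, element.text.strip())
    return links


if __name__ == "__main__":
    # The account symbol below is a made-up placeholder value.
    sample = "<Report><TreasuryAccountSymbol>00-0000</TreasuryAccountSymbol></Report>"
    for tag, urls in links_for_report(sample).items():
        print(tag, urls)
```

Because the URLs are derived from the codes themselves, a feed entry or web page can link to an entity's resources without a separate lookup step.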

Demonstration Recovery Feed

Link to the Demonstration "Data Feed"
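A client consuming a feed like this one might read its entries as sketched below in Python. The sample feed embedded in the code is made up for illustration and is not the demonstration feed itself; a real client would first download the feed document from its URL.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed document standing in for a downloaded feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>urn:example:demo-feed</id>
  <title>Simulated Recovery reports</title>
  <updated>2009-06-02T00:00:00Z</updated>
  <entry>
    <id>urn:example:demo-feed:r-001</id>
    <title>Bridge repair (simulated)</title>
    <updated>2009-06-01T00:00:00Z</updated>
    <content type="html">&lt;table&gt;&lt;/table&gt;</content>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Return the id, title, updated time, and content of each feed entry."""
    feed = ET.fromstring(feed_xml)
    entries = []
    for entry in feed.findall("atom:entry", NS):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=NS),
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=NS),
            "content": entry.findtext("atom:content", default="", namespaces=NS),
        })
    return entries


def entries_updated_since(entries, timestamp):
    """Keep entries updated after the given timestamp.

    Assumes all timestamps are UTC in the same 'YYYY-MM-DDTHH:MM:SSZ' form,
    so that plain string comparison matches chronological order.
    """
    return [entry for entry in entries if entry["updated"] > timestamp]


if __name__ == "__main__":
    for item in read_entries(SAMPLE_FEED):
        print(item["updated"], item["title"])
```

Filtering on the updated timestamp lets a client that polls the feed pick up only the reports that changed since its last visit.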