Thread: I've written 2 public health data standards, studied 1,000 public health info systems, consulted to open data companies & federal agencies on open, public/enviro health data standards.
Recently worked on COVID data challenges.
This order is... https://www.whitehouse.gov/briefing-room/presidential-actions/2021/01/21/executive-order-ensuring-a-data-driven-response-to-covid-19-and-future-high-consequence-public-health-threats/
...an ok, but modest start.
America's public health data challenges, including COVID data, stem from our system of local control of public health data collection infrastructure. Technology vendors rarely support machine-readable export.
This causes delays in relaying info...
...and requires limited state PH staff to normalize data before publicly reporting it.
All major civil-society COVID data resources require human scraping of COVID data--often @ the local level & automated scraping at the state level.
But it gets worse--they're scraping...
COVID dashboards. The major technology providers who have contracts for these COVID dashboards ARE NOT placing the same data in #OpenData. At most, they offer queryable databases, which frequently break automated scrapers.
What does this process from local to federal gov't look like?
Data path: County public health agency COVID data in HTML/social media/fax/PDF-->scraped by volunteers at civil society/journalist orgs + state COVID data in HTML/social media/fax/PDF/dashboards-->civil society/journo publishing & open data-->CDC & federal agencies.
YES, the CDC uses volunteer scraped, normalized, & republished COVID data because it's better/cheaper/faster than what the CDC can produce itself. It's been that way since these volunteer efforts started.
Dashboard providers are in the dark about the bottleneck they created.
In my experience, this is a result of the tools LOCAL county/municipal health departments have access to.
I've worked with governments whose vendors tell them it'll be $30,000 to EXPORT THEIR OWN DATA. Most internal systems just publish to Word, PDF, or a vendor dashboard.
So, if we want to fix the COVID data problem, this means addressing the root cause of COVID data challenges and going straight at the vendors causing pain for local health departments. I believe the Defense Production Act should be applied to public health databases and IT.
And I think there are a few additional solutions:
1) Require local public health and state public health agencies to publish a minimum number of required fields in #OpenData or CSV files (if they don't have access to open data portals, which is a LOT of them).
2) Provide a...
data map from local data fields to a 1.0 version of a data standard.
3) Provide an alt for local & state govs to directly enter daily data into a federal system. Similar to the hospital/HHS system. (Hospitals hated it at first, but it works well now.)
4) Publish all to http://data.gov
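To make point 1 concrete, here is a minimal sketch of checking that a published CSV carries a minimum set of required fields. The field names in REQUIRED_FIELDS are hypothetical placeholders (the thread doesn't name specific fields), and the function name is just for illustration.

```python
import csv
import io

# Hypothetical minimum field set; the thread does not specify actual fields.
REQUIRED_FIELDS = {"report_date", "county_fips", "new_cases", "new_deaths"}

def missing_required_fields(csv_text: str) -> set:
    """Return the required fields that are absent from the CSV header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])  # fieldnames is None for empty input
    return REQUIRED_FIELDS - header

# Made-up sample file that lacks the "new_deaths" column.
sample = "report_date,county_fips,new_cases\n2021-01-21,06037,120\n"
print(missing_required_fields(sample))
```

A check like this could run before a file is posted, so a health department finds out about a missing column before downstream users do.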
Finally, the U.S. needs to align COVID data with any international, emerging standards on COVID data.
I'll publish more insights based on my recent work, and I'm looking forward to engaging others on potential paths forward for COVID data.
PS: If you're wondering why points 1 & 2 (CSV & mapping) matter, it's because they're the foundation for any ETL process (Extract, Transform, Load). So, the CSVs & open data could be automatically transformed. Most "standardized" data in the US is unstandardized data run through ETL.
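Here's a rough sketch of how a CSV plus a data map feeds an ETL step. Everything in it is an assumption for illustration: the local column names, the standard field names in FIELD_MAP, and the sample row are all made up, and a real "load" would write to a database or open data portal rather than back to CSV text.

```python
import csv
import io

# Hypothetical data map: local column name -> field name in a "1.0" standard.
FIELD_MAP = {
    "Date": "report_date",
    "County": "county_name",
    "Positive Cases": "new_cases",
}

def transform(local_csv: str, field_map: dict) -> list:
    """Extract rows from a local CSV and rename mapped columns to standard names."""
    rows = []
    for row in csv.DictReader(io.StringIO(local_csv)):
        # Keep only mapped columns; unmapped local columns (e.g. "Notes") are dropped.
        rows.append({std: row[local].strip() for local, std in field_map.items()})
    return rows

def load(rows: list, fieldnames: list) -> str:
    """Stand-in for the load step: serialize standardized rows to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Made-up local export with one extra, unmapped column.
local = "Date,County,Positive Cases,Notes\n2021-01-21, Example County ,42,n/a\n"
standardized = transform(local, FIELD_MAP)
print(load(standardized, list(FIELD_MAP.values())))
```

The point of the data map is that it's the only piece that changes per jurisdiction. The transform and load code stays the same for every county that publishes a CSV.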