Los Angeles covid data cooks the books (VIII)

It's been a while since I've posted. The gap between the daily press release numbers (the "we heard about these deaths today" numbers) and the carefully placed dashboard numbers (the "we did the work, and this is the day on the death certificate" numbers) remains wide. It's been running around 13% pretty much every day now, and the graph looks like this:

Since I last posted, I found another data page. This one is corona-virus.la, and is supposed to be the city's data tracker. They also list the county side-by-side. This is from today:

The first thing to notice is that the city numbers are much smaller than the county's. The city of LA has only 38% of the county's cases and 28% of its deaths. Of course, we are waiting on the whole, massive county instead of relying on the city data. Here's a map of the city and county:

The second thing to notice is more subtle. Looking at the numbers for the county, they perfectly match the raw press release numbers. In other words, they are relying on the bad "we heard about it" data, not the clean death-certificate data.

How much difference does that make? If you go to the county dashboard (which is on publichealth.lacounty.gov), you can download the data. Because this data is carefully placed on the correct dates, you do have to wait a couple of days for it to get cleaned. The graphic above shows the press release numbers for the week of 9/13 - 9/19: 11, 24, 37, 31, 38, 22, and 13. That averages to 26.6. But if you look at the same dates on the dashboard, you get: 19, 14, 16, 18, 14, 11, and 11, which averages to 14.7.

That's an overstatement of 83%.
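The comparison is simple enough to check by hand or with a few lines of Python. This is only a sketch: the two lists are the daily values exactly as typed above, and the helper names are made up for illustration. Run on the values as listed, the press release average comes out nearer 25.1 than 26.6 and the gap nearer 71% than 83%, so one of the daily values may have been misread off the graphic. Either way, the press release numbers run far above the dashboard numbers.

```python
# Daily deaths for 9/13 - 9/19, copied from the text above.
# (List and function names are illustrative, not from either data site.)
press_release = [11, 24, 37, 31, 38, 22, 13]  # "we heard about it" numbers
dashboard = [19, 14, 16, 18, 14, 11, 11]      # date-of-death numbers


def mean(values):
    """Plain arithmetic mean of a list of daily counts."""
    return sum(values) / len(values)


def overstatement_pct(reported, vetted):
    """Percent by which the reported average exceeds the vetted average."""
    return (mean(reported) / mean(vetted) - 1) * 100


print(round(mean(press_release), 1))                       # 25.1
print(round(mean(dashboard), 1))                           # 14.7
print(round(overstatement_pct(press_release, dashboard)))  # 71
```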

Because they are looking at bad data, they are looking at numbers 83% higher than the good data that they also have. This isn't a case of hand-waving statistics being stretched beyond what they can support: this is literally about using the data that they have carefully vetted and maintained, instead of the data that they know and admit is bad.

If you go to the dashboard (publichealth.lacounty.gov), you'll note a gray area on the right and some footnotes on the graph:

Those footnotes might be a bit small, but they say:

Recent dates are incomplete due to lags in reporting. The gray box corresponds to dates that are likely to not yet be reported completely.

Cases reported by Episode Date which is the earliest existing value of: Date of Onset, Date of Diagnosis, Date of Death, Date Received, Specimen Collection Date. Deaths reported by Date of Death or Date Received if Date of Death is missing.

Number of daily cases will not match the number of newly reported Los Angeles County cases as episode date reflects date of underlying illness rather than date of report.

So, if we look only at the dates to the left of the gray area, we have everything through 9/20. If we look at the daily averages for the prior week and for the week ending 9/20, we get 22.4 and 13.3, a decrease of 40.6%. On corona-virus.la (the one above with the column graphs), what do they have for the change from the previous week to the current week? +8.1%.
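For anyone who wants to check the week-over-week figure, here is a minimal sketch in Python. The function name is made up for illustration, and the two inputs are the weekly averages quoted above, not raw data pulled from the dashboard.

```python
def pct_change(previous, current):
    """Percent change from the previous value to the current one."""
    return (current - previous) / previous * 100


prior_week_avg = 22.4    # dashboard average, week before the one ending 9/20
current_week_avg = 13.3  # dashboard average, week ending 9/20

print(round(pct_change(prior_week_avg, current_week_avg), 1))  # -40.6
```

A negative result means deaths fell from one week to the next, which is the opposite direction from the +8.1% shown on corona-virus.la.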

I cannot begin to say how much using the wrong numbers is pissing me off. Businesses are being destroyed--lives are being destroyed--every single day we remain closed down at a high level of lockdown.
