Views from a Blue Dot: Comet Neowise

On Saturday, we took a break from the pandemic to go outside and look for a comet. We live in a Dallas suburb, but one which has grown a lot in 10 years. The skies are not quite as dark as they used to be, but we thought it might be possible to spot and view Comet Neowise.

We set out just before 9pm for a local city park. Jodi had the binoculars I got from my grandfather when he passed away; I had the new DSLR camera Jodi got me for Christmas last year. The sky was still showing the last glow of sunset, and the light from city lights coming on across North Texas was being gently scattered back down to Earth, creating a faint but irreducible haze in the sky. We found a good spot to try to see the comet. Jodi located it with an iOS skywatching app, and we waited for more darkness to settle.

While waiting, we took stock of the night sky. Planets and stars peeked out of the twilight. Arcturus glowed orange overhead. Jupiter lit the sky closer to the southern horizon, with the four Galilean moons clearly visible under even modest magnification. Our real prize was to be found just below the cup of the Big Dipper. As the sky conditions settled to just about the best possible, we started spotting the stars of the Big Dipper more closely and hunting for the comet.

We knew to start from Merak, the star that makes up the front lower edge of the dipper’s cup. Go straight down from Merak, and the comet would lie somewhere along that line. Indeed, once we employed the binoculars, the bright core of the comet and the fainter long arcing wisp of its tail were clear. This was incredibly thrilling; I’ve never had a chance to see a comet first-hand before.

I got the camera set up and aimed at the general area where the comet should appear. In particular, we noted that Neowise was framed by a triangular arrangement of background stars. Spotting those was hard on the camera, but after a few long exposures at high ISO (>1600), it was clear where to center the shot to best pick up the comet.

Comet Neowise

After a bunch of photos, we packed up and went home. It was 10pm, way past our bedtime. It had been worth it. With all the madness raging down here on Earth, it’s nice to see a cosmic tourist taking a drive through the inner solar system. Neowise will not return for several thousand years. When it comes back, I wonder if humanity will still be here to see it?

The New Normal (C+73)

It has been 73 days since Texas, racing to reopen without first putting in place large-scale testing and tracing infrastructure, ceded the population of the state to the SARS-CoV-2 virus. That was COVID Day, or C-Day: May 1, 2020.

On C-Day, Texas announced that businesses could re-open at 25% capacity; however, it did so without first preparing for the consequences. The reopening march then proceeded on a schedule driven, not by data and epidemiology, but by arbitrary mandates from the governor about when the next phase of opening should occur. Seven days after the first step, salons were allowed to reopen. This was despite the fact that if the first step caused new cases in the community, they would not show up in the data until 14-21 days after the step; 7 days was just too soon to tell, and thus too soon to move to the next step. That is what is meant by “reopening too fast” – taking the next step before the last one’s effects can be measured.

The data did tell an interesting story, though, which I will get to later. The summary of all of this is that my county, Collin County, has (like most urban parts of Texas) experienced slow, then rapid, growth in the spread of SARS-CoV-2. The data suggest that transmission started almost immediately as Texas began its reopening. Some factors might have contributed to a faster spread of the virus.

Under the hood, at each choice, there is the slow burn of normal exponential growth mathematics. If the doubling time for a process is 20 days, then you will see small growth in the first 20 days; in the next 20 days, the number will double again; 20 days later, it will double again. What you might attribute at first to “normal growth” may, in fact, be the first signs of out-of-control exponential growth. Having enough data to tell the difference is the key to avoiding resurgence of a communicable disease.

Top Graph: The purple histogram (bars) shows the new cases recorded each day in Collin County; the pink line is the 7-day rolling average of new cases; the black numbers indicate the doubling time and are printed every few days in the graph. The blue curve is the 7-day rolling average of reported tests per day; the horizontal blue line is the level that the Harvard Global Health Institute has determined would allow Collin County to be open, with social distancing, and yet control the spread of SARS-CoV-2.

Bottom Graph: Vertical color lines indicate key events in Texas and Collin County, such as shelter-in-place orders or reopening stages; colored bands, 7-14 days after the color line, indicate the range of expected effects in the data. However, current evidence suggests the more accurate window may be 14-21 days after an event that can alter the pandemic’s course.

The above graph tells the tale of new cases per day. Those have skyrocketed in the last month, from about 25 per day in May to over 100 per day in June and now July.

Let’s pause for a moment and think about the math of exponential growth. If something is growing exponentially at a constant rate (e.g. a fixed doubling time of, say, 20 days), then after every unit of 20 days the process has to yield more growth per day to maintain the rate. Exponential growth demands an increase, with time, in the number of new things appearing each day. For example, let’s say that on day 0 you have 10 cases. At a doubling time of 20 days, then 20 days later you will have 20 cases. On average, you added 0.5 cases per day during the first 20 days. On day 40, you expect to have 40 cases – twice what you had on day 20. From day 20 to day 40, you are now adding 1 case per day, on average – a higher rate of cases per day than in the first twenty days. On day 60, you expect to have 80 cases total … that is an average new-case rate of 2 cases per day during the third 20-day period.

In each new period of 20 days, you add more cases per day to maintain the 20-day doubling time. That’s the demand of exponential growth.
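The arithmetic above can be checked with a few lines of Python. This is only a sketch of the worked example (10 cases on day 0, a fixed 20-day doubling time); the function names are made up for illustration.

```python
def cases_after(days, initial_cases=10, doubling_time=20):
    """Total cases after `days`, assuming a constant doubling time."""
    return initial_cases * 2 ** (days / doubling_time)


def average_new_cases_per_day(start_day, end_day):
    """Average number of cases added per day between two days."""
    added = cases_after(end_day) - cases_after(start_day)
    return added / (end_day - start_day)


# Walk through the three 20-day periods from the example in the text.
for start in (0, 20, 40):
    end = start + 20
    rate = average_new_cases_per_day(start, end)
    print(f"days {start:2d}-{end:2d}: total {cases_after(end):.0f}, "
          f"{rate:.1f} new cases/day on average")
```

Each pass through the loop covers one doubling period, and the average number of new cases per day doubles along with the total.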

So if you are dealing with a disease, and you add 25 cases per day for the first 20 days, then also again during the next 20 days, you are out of the realm of exponential growth. You won’t double the number of cases every 20 days; at a constant rate of adding new cases, it will take longer, and longer, and longer to double the total number of cases. That would be more of a sign that you have established a measure of control over the spread.
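For contrast, here is a small sketch of the constant-rate case. It assumes 25 new cases every day and starts from the 500 cases accumulated over the first 20 days; the function name and the choice of four doublings are illustrative only.

```python
def doubling_times_linear(daily_new_cases=25, initial_total=500, doublings=4):
    """Days needed for each successive doubling of the total
    when a fixed number of cases is added every day."""
    times = []
    total = initial_total
    for _ in range(doublings):
        # To double, we must add as many cases as we already have.
        times.append(total / daily_new_cases)
        total *= 2
    return times


print(doubling_times_linear())
```

With a fixed number of new cases per day, every doubling takes twice as long as the one before it, which is the "longer, and longer, and longer" behavior described above.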

Which brings me to a bit of data analysis fun. I decided to employ a very basic strategy to understand whether or not certain choices made the spread of SARS-CoV-2 worse or better in Collin County. These are NOT the conclusions of an epidemiologist, and should not be substituted for input from actual experts in disease modeling (UT Southwestern, a premier medical institution in Dallas, has an EXCELLENT model you should all pay attention to [1]). However, I think there is some useful information in this very primitive analysis.

Here is the idea. Pick an event, like the limited shelter-in-place order put in place in Collin County on March 24, 2020. Put a window in time around that event, and fit the number of new cases per day using a simple linear model, y = mx + b (where y is the number of cases per day, x is the day, m is the slope, and b is the y-intercept). I used a window of 3 days on either side, centered on the day of the event. This provides a measure of how the disease was spreading (the slope of the new cases per day) at the time of the event.

Then slide 14-21 days later. By this time, the effects of the event from earlier would have begun to show up in the case counts, according to the current understanding of the infection rate, symptom emergence, and testing time. In that time window 14-21 days later, again fit a straight line and get the slope.

Most importantly, get the uncertainty on the slopes from the two linear fits, which is determined from the fitting procedure.

An example of the simple new-case linear fits from Collin County. On the left side is the new case growth rate in a window in time around the limited shelter-in-place order issued in Collin County. On the right is the 14-day window starting 14 days after the event showing case counts in a window of time when the effects of the earlier decision should show up. The slope, plus its uncertainty, is reported in each fit.
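As a rough illustration of the windowed fit, the sketch below does an ordinary least-squares fit of y = mx + b and takes the slope uncertainty from the scatter of the points about the line. That uncertainty estimate is an assumption on my part, since the text only says it comes from the fitting procedure. The function names and the daily counts are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b.

    Returns (m, b, sigma_m), where sigma_m is the standard error of the
    slope estimated from the residuals (an assumed error model).
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points for a slope uncertainty")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_mean - m * x_mean
    residual_ss = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    sigma_m = math.sqrt(residual_ss / (n - 2) / sxx)
    return m, b, sigma_m


def fit_window(daily_cases, center, half_width=3):
    """Fit days center-half_width .. center+half_width (inclusive)."""
    first, last = center - half_width, center + half_width
    if first < 0 or last >= len(daily_cases):
        raise ValueError("window falls outside the data")
    days = list(range(first, last + 1))
    return fit_line(days, [daily_cases[d] for d in days])


# Hypothetical daily new-case counts indexed by day number (not real data).
daily_cases = [2, 3, 5, 4, 7, 8, 9, 11, 10, 12]
m, b, sigma_m = fit_window(daily_cases, center=3)
print(f"slope = {m:.2f} +/- {sigma_m:.2f} cases/day per day")
```

The later window can be fitted the same way by calling `fit_line` directly on the days that fall 14-21 days after the event.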

I then use a simple numerical measure of the impact:

  • If the original slope was positive, and the later slope is negative, then this appears to be a “positive reversal” of the original trend (things were getting worse and now they are getting better).
  • If the original slope was negative, and the later slope is positive, then this appears to be a “negative reversal” of the original trend (things were going in the right direction, then got worse later, reversing the original improvement).
  • If the later slope is greater (less) than the original one, then things got worse (better).

The difference in the two slopes, subtracting the later slope from the original, is a measure of their similarity. The closer to zero the difference of the two numbers, the more similar the two periods are. The total uncertainty on the difference is the sum, in quadrature, of the two distinct uncertainties. The uncertainty on the slope is important; if two slopes agree to within a certain range of uncertainty, then it’s hard to say that they are different from one another; the more the slopes move apart outside of their range of uncertainty, the more certain we can be. I opted to designate the confidence in a given conclusion as follows:

  • If the difference in the slopes is compatible with zero within 1 unit of uncertainty, then we cannot really be certain about the conclusion; for all intents and purposes, the event had no discernible effect. I refer to this as “Unclear Impact.”
  • If the difference is incompatible with zero at the level of 1-2 units of uncertainty, then the conclusion of the impact is drawn at “low confidence.”
  • If the difference is incompatible with zero at the level of 2-3 units of uncertainty, then the conclusion is drawn at “moderate confidence.”
  • If the level of incompatibility is 3 or more units of uncertainty, then the conclusion is drawn at “high confidence.”
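The rules above can be collected into one small function. This is a sketch rather than the code actually used for the results that follow. The function name is made up, the handling of the exact boundaries (1, 2 and 3 units) is an assumption, and the slopes and uncertainties in the example calls are invented purely to exercise the logic.

```python
import math


def classify_impact(slope_before, sigma_before, slope_after, sigma_after):
    """Label the change between two fitted slopes and the confidence in it."""
    difference = slope_before - slope_after  # original minus later
    sigma = math.hypot(sigma_before, sigma_after)  # sum in quadrature
    if sigma == 0:
        raise ValueError("uncertainties must not both be zero")
    significance = abs(difference) / sigma  # "units of uncertainty"

    if slope_before > 0 and slope_after < 0:
        impact = "positive reversal"
    elif slope_before < 0 and slope_after > 0:
        impact = "negative reversal"
    elif slope_after > slope_before:
        impact = "worse"
    else:
        impact = "better"

    if significance < 1:
        confidence = "unclear impact"
    elif significance < 2:
        confidence = "low confidence"
    elif significance < 3:
        confidence = "moderate confidence"
    else:
        confidence = "high confidence"
    return impact, confidence, significance


# Invented numbers for illustration only (not taken from the fits).
print(classify_impact(-0.1, 0.2, 0.9, 0.2))
print(classify_impact(1.0, 0.5, 0.8, 0.5))
```

The first call is a trend that flips from falling to rising with a large separation; the second is a small change that is swamped by its uncertainty.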

You would naively expect the first shelter-in-place orders to be a good test of this approach; they should have had a meaningful impact on the spread of the virus once people were forced to stay home (unless they were deemed “essential”).

  • March 24, 2020: the limited shelter-in-place order appears to have indeed had a strong effect; it reversed the trend at the time (from a slope of +1.78 cases/day to -1.4 cases/day) and at high confidence (the difference in the slopes deviated from zero by 10.2 units of uncertainty)
  • March 31, 2020: the County Judge in charge of Collin County ordered a full shelter-in-place, reducing the number of businesses considered “essential”. This had a positive impact (further driving the slope down from 0.81 cases/day at the time of the new order to 0.26 cases/day 14-21 days later), but at moderate confidence (2.2 units of uncertainty)

So that seems reasonable, given what we know about pandemic control for a communicable disease like COVID-19. What about what happened next?

  • April 3, 2020: CDC recommends face coverings be worn in public. At low confidence, this might have had a negative impact on the spread of disease (1.0 units of uncertainty from zero in the difference of slopes). It’s hard to draw a strong conclusion from this, but some reporting at the time suggested that people might be gaining a false sense of security from the mask recommendation and may have been more brash in their public comings and goings as a result.
  • April 20, 2020: Texas reopens its state parks. The impact is unclear, compatible with no impact at all. This appears to neither have been good nor bad for case counts each day.
  • April 24, 2020: Texas retail stores reopen for pickup and delivery. The spread of the disease seems to have weakened 14-21 days later, and this effect appeared at high confidence in the data (1.32 cases/day at the time this occurred, -0.18 cases/day later, difference from zero by 8.2 units of uncertainty).
  • May 1, 2020: Texas businesses reopen at 25% capacity (C-Day). This itself actually had an unclear impact on case rates, and was consistent with no effect.
  • May 8, 2020: Texas reopens salons. This caused a reversal of progress at high confidence. The trend was -0.06 cases/day at the time of this reopening, and 14-21 days later it was 0.81 cases/day (4.4 units of uncertainty from zero difference in slope). In fact, this seems to have been the first decision that clearly led to the beginning of the end for Texas, putting us on the path we are on now. However salon owners handled this, it did not go well for people in terms of the spread of SARS-CoV-2.
  • May 18, 2020: Texas gyms, offices, and manufacturing reopen at 25% capacity. This had an unclear impact, compatible with no impact at all.
  • May 22, 2020: Texas bars are reopened at 25% and restaurants at 50%. This had a negative effect at high confidence, from a slope of 0.17 cases/day to 2.51 cases per day, with a difference separated from zero by 3.3 units of uncertainty.
  • May 25, 2020: Memorial Day. This also seems to be linked to a strong negative impact, coming so close in time to the bar and restaurant decision, and was also a high-confidence effect.
  • June 3, 2020: Texas bars, offices, non-essential plants and factories, gyms, at 50%; theme parks reopen at low capacity. At moderate confidence, this step had a negative impact, increasing case rates from 0.39 to 1.53 cases/day (2.0 units of uncertainty)
  • June 12, 2020: Texas restaurants reopen at 75%. We are now well into the steady march of milestones with no measure to determine whether it was wise to take them. At high confidence (4.3 units of uncertainty), case rates went from 0.57 cases/day to 4.53 cases/day just 14-21 days later.
  • June 19, 2020: Texas reopens theme parks at 50% capacity. This had no clear impact on case rates.

We are now waiting to see what the governor’s pausing of the reopening has done, but it will take more time to see those effects in data (that happened about 2 weeks ago now, so the data is slowly rolling in each day).

It didn’t take much for me to accumulate and think about public data. I am not wholly happy with my treatment of errors in the analysis, and it might affect the confidence of some conclusions, but overall there is nothing particularly surprising in the results from what one might expect from a transmissible disease. Epidemiologists working for public and private health institutions are doing far better work than anything I have said here, and their predictions are grim: cases will continue to grow through July and possibly into mid-August.

I didn’t even get into the testing problem, the other end of the containment equation. The chart at the beginning of this post indicates that my county has managed to increase testing somewhat in the past 4 weeks, but pathetically so. The Harvard Global Health Institute now recommends Collin County perform almost 4500 tests each day to meet the target of controlling the spread. The county managed only to increase testing to the level of 1000 each day. The HGHI has been steadily upping its recommendations as the pandemic unfolds and worsens in the country; the county has increased efforts modestly in comparison. I understand it takes time to deploy this kind of capability, but now is the time for a Manhattan Project for pandemic response. What we have seen in Texas has been timid in comparison, and seemingly totally uninformed.

References

[1] https://www.utsouthwestern.edu/covid-19/about-virus-and-testing/forecasting-model.html

Not a Number

This is the second day in a row that Collin County, TX, has not reported or updated its COVID-19 case or death count (or any other statistics). It is – likely not coincidentally – two days since the State took over the handling of cases from the County. The question is: what is the state doing if not collecting information about new cases, deaths, recoveries, etc.?

I’ve only been able to enter NaN – not a number – into my charting of new cases in Collin County. Once data becomes available again, I will revise these placeholders or let the code interpolate between numbers, as needed. But it’s just not credible that there have been no new cases for two days, since new case counts began climbing late last week. While absence of evidence is not evidence of absence, the timing is terrible; neighboring Dallas County today reported its record high number of new cases and deaths for one day. [1]
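For what it is worth, the interpolation fallback can be as simple as the sketch below, which fills runs of NaN by drawing a straight line between the nearest reported days. The function name and the sample counts are hypothetical, and linear interpolation is only one possible choice.

```python
import math


def interpolate_gaps(values):
    """Return a copy of `values` with interior NaN runs filled by linear
    interpolation. Leading or trailing NaNs are left as they are, since
    there is nothing to interpolate toward."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / gap
            filled[i] = filled[left] + fraction * (filled[right] - filled[left])
    return filled


# Hypothetical daily counts with two unreported days (not real data).
new_cases = [30.0, 36.0, float("nan"), float("nan"), 48.0]
print(interpolate_gaps(new_cases))
```

Interpolated values are placeholders, not measurements, so they should be replaced as soon as the real counts are published.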

[1] https://www.dallasnews.com/news/public-health/2020/06/02/dallas-county-reports-record-number-of-new-covid-19-cases-deaths/

The Muon: 1970

In 1970, Hall, Lind, and Ristenen (Univ. of Colorado at Boulder) published a paper in the American Journal of Physics (AJP, vol. 38, No. 10) on “A Simplified Muon Lifetime Experiment for the Instructional Laboratory.” Basically, it articulates precisely the experiment at the heart of a similar instrument at SMU. Muons are produced when cosmic rays rain down on the atmosphere. Some muons make it all the way to sea level. Some of those are moving slowly enough to be stopped when passing through material. If that material gives off light in response to the slowing, stopping, and then decay of the muon, it is possible to use the light to measure the lifetime of the muon.

An excerpt from the Hall et al. paper, showing their collected data (counts vs. channel, where one channel represents about 100 ns of time) and the results of a least-squares fit to the data to extract the lifetime of the muon.

Hall et al. reported on a run of their experiment of 695 hours (about 29 days!). I’ve had nothing but time on my hands, and after discovering the Hall paper when I started playing around with the SMU instrument I was inspired to repeat their experiment.

Data from the SMU muon detector.

As of today, I have 695 hours of data from the muon detector at SMU. Based on a model fitted to the data (an exponential decay function added to a flat background), I find the lifetime of the muon to be 2170 ± 29 nanoseconds (ns). The accepted lifetime is 2196 ns. The Hall et al. result, using a similar but earlier version of the experiment, was 2106 ± 58 ns. (Note: they quote the half-life, but that is easily converted to the lifetime [average life of the muon] by dividing the half-life by ln(2).)
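A quick sketch of the half-life conversion and of how far apart these numbers are, measured in units of their combined uncertainty. The function names are made up, and treating the accepted value as exact (zero uncertainty) is a simplifying assumption.

```python
import math


def half_life_to_lifetime(half_life):
    """Mean lifetime from a half-life: tau = t_half / ln(2)."""
    return half_life / math.log(2)


def sigma_separation(value_a, err_a, value_b, err_b=0.0):
    """Difference between two values in units of their combined
    uncertainty (the two errors added in quadrature)."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


ACCEPTED_NS = 2196.0   # accepted lifetime quoted above, taken as exact here
SMU = (2170.0, 29.0)   # SMU measurement and uncertainty, in ns
HALL = (2106.0, 58.0)  # Hall et al. (1970) as a lifetime, in ns

print(f"SMU vs accepted:  {sigma_separation(*SMU, ACCEPTED_NS):.2f}")
print(f"Hall vs accepted: {sigma_separation(*HALL, ACCEPTED_NS):.2f}")
print(f"SMU vs Hall:      {sigma_separation(*SMU, *HALL):.2f}")
```

All three comparisons come out below two units of combined uncertainty, so neither measurement is in real tension with the accepted value or with the other.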

Between 1970 and now, the measured lifetime of the muon has not changed within the resolution of two 695-hour data sets, taken independently and 50 years apart. There is a wonder in the power of scientific investigation to reveal those things that are steady and constant in the cosmos.