My mentor, Dr. Nelson, and I discussed potential research projects.
We decided to research resource allocation during the Covid-19 pandemic.
The first step was researching a variety of subjects: medical ethics, various forms of
text regression, and medical ethics in a pandemic. With this information, I started to create various model plans for
my research project and began to create a presentation illustrating my work for my peers.
Week 2: (6.1.2020 - 6.7.2020)
The beginning of week two marked the finalization of my presentation,
as well as the presentation of my week one work to my peers.
This week will encompass the bulk of my data collection, using a variety of tools and resources such
as the Twitter API (application programming interface) and the CDC's comprehensive database.
I am also using Tableau software to understand and visualize the data.
Week 3: (6.8.2020 - 6.14.2020)
Week three continued the data gathering process. Population, state
area, and county area were collected to calculate the population density of each county
and each state. Tableau was used to visualize how the rate of positive test results increased
as time progressed, as well as to visualize mobility patterns for each county. This week also marked the
start of writing code to average weekly data for states and counties (starting on February 15th, 2020).
Weekly Data includes (for States):
Mobility Changes from baseline
  Retail
  Grocery
  Parks
  Transit
  Workplaces
  Residential
Averages of all mobility data
Positive and Negative Test Results
Regulations from State Authorities
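The weekly averaging described above can be sketched roughly as follows. This is only an illustration, not the code written for the project: the function name `weekly_averages`, the `(state, date, value)` record layout, and the sample numbers are all made up. The only detail taken from the log is the February 15th, 2020 start date.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# First day of week 0, taken from the log (February 15th, 2020).
START = date(2020, 2, 15)

def weekly_averages(daily_records):
    """Average daily values into 7-day weeks counted from START.

    daily_records: iterable of (state, day, value) tuples (hypothetical layout).
    Returns a dict mapping (state, week_index) to the mean value for that week.
    """
    buckets = defaultdict(list)
    for state, day, value in daily_records:
        if day < START or value is None:
            continue  # skip days before the start date and missing values
        week_index = (day - START).days // 7
        buckets[(state, week_index)].append(value)
    return {key: mean(values) for key, values in buckets.items()}

# Made-up mobility changes from baseline, for illustration only.
sample = [
    ("State A", date(2020, 2, 14), 5.0),    # before START, ignored
    ("State A", date(2020, 2, 15), -2.0),   # week 0
    ("State A", date(2020, 2, 21), -4.0),   # week 0 (day 6)
    ("State A", date(2020, 2, 22), -10.0),  # week 1 (day 7)
]
result = weekly_averages(sample)
```

The same function could be applied to county records by passing a county identifier in place of the state name.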
Week 4: (6.15.2020 - 6.21.2020)
Data collection for this week was focused mainly on medical facilities and care facilities. I looked into
what qualifies as a medical/care facility, what PPE those facilities need, and when the PPE should be used. Another aspect of the data
collection and project is finding out how to use PPE safely and effectively (optimization of PPE). The need for PPE increases as exposure risks
increase. Information of this nature was collected from CDC and OSHA guidelines.
Week 5: (6.22.2020 - 6.28.2020)
Data cleaning occurred in week five. For counties, weekly entries were made and any missing data was filled in. I was approved as
a Twitter developer and created an app that allowed me to use the Twitter API (application programming interface). I collected data sets of
Tweets relating to Covid-19. These data sets contained Tweet IDs (a unique number assigned to each Tweet that functions as its ID). Twitter does not allow
full JSON data sets containing all of the information from Tweets to be distributed to third parties, but it does allow the distribution of data sets
containing only Tweet IDs. These data sets are said to contain "dehydrated Tweets", and this week I was able to "hydrate" them to look at all
of the information they contained.
These hydrated Tweets now contain:
Date posted and time stamp
Coordinates of where the Tweet was made
The text of the Tweet itself and hashtags
Description of the users (bios)
Screen names of users
Follower count of users
With this information I will be able to perform LDA to help in the resource allocation part of my project.
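As a rough illustration of what pulling these fields out of one hydrated Tweet might look like, here is a minimal sketch. It assumes the Tweets are in the Twitter API v1.1 JSON format, which the log does not state. The function name `extract_fields` and the sample Tweet are made up for the example.

```python
def extract_fields(tweet):
    """Pull the fields listed above out of one hydrated Tweet (a dict).

    Assumes the v1.1 Tweet object layout; this is an assumption, not
    something the log confirms.
    """
    user = tweet.get("user", {})
    coords = tweet.get("coordinates")  # GeoJSON point, or None if not geo-tagged
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return {
        "created_at": tweet.get("created_at"),
        "coordinates": coords["coordinates"] if coords else None,  # [longitude, latitude]
        "text": tweet.get("full_text") or tweet.get("text"),
        "hashtags": [tag["text"] for tag in hashtags],
        "user_description": user.get("description"),
        "screen_name": user.get("screen_name"),
        "followers_count": user.get("followers_count"),
    }

# A made-up Tweet for illustration only.
sample_tweet = {
    "created_at": "Mon Jun 22 15:04:05 +0000 2020",
    "coordinates": {"type": "Point", "coordinates": [-100.0, 40.0]},
    "full_text": "Where can I find a mask? #covid",
    "entities": {"hashtags": [{"text": "covid"}]},
    "user": {
        "description": "Made-up bio",
        "screen_name": "example_user",
        "followers_count": 42,
    },
}
fields = extract_fields(sample_tweet)
```

Tweets without a geo-tag come back with `coordinates` set to `None`, which makes it easy to count only the geo-tagged ones later.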
Week 6: (6.29.20 - 7.5.20)
This week marked the start of statistical modeling with LDA (Latent Dirichlet allocation), which identifies topics in a body of text based on the importance and frequency of words in the
text it is performed on. For the newly hydrated Tweets, I performed LDA on weekly collections. Then, I performed LDA again on the weeks, disregarding "grab" words,
or words that would probably be the most frequent in the text but the least helpful in providing insight into the different needs of the people who sent out the Tweets.
These words included "corona", "coronavirus", "covid", "pandemic", and the different variants seen: "sarscov2", "nCov", "covid-19", "ncov2019", and "2019ncov".
After disregarding these words, different information on word frequency was seen. Words like "mask" and "help" appeared in the LDA topics.
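The log does not say which LDA implementation was used, so the following is only a from-scratch sketch of the idea (a small collapsed Gibbs sampler) with the "grab" words removed first. The function names, the parameter values, the rule that drops words of one or two letters, and the sample Tweets are all assumptions made for illustration. A real analysis would more likely rely on an established topic modeling library.

```python
import random
import re
from collections import Counter

# "Grab" words from the log, lowercased.
GRAB_WORDS = {"corona", "coronavirus", "covid", "pandemic", "sarscov2",
              "ncov", "covid-19", "ncov2019", "2019ncov"}

def tokenize(text):
    """Lowercase, split into words, drop grab words and very short words."""
    tokens = re.findall(r"[a-z0-9][a-z0-9-]*", text.lower())
    # Dropping words of 1-2 letters is an extra assumption, not from the log.
    return [t for t in tokens if t not in GRAB_WORDS and len(t) > 2]

def lda_gibbs(docs, n_topics=2, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. Returns per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    for d, doc in enumerate(docs):  # random initial topic for every word
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(row)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]  # remove the word's current assignment
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k  # record the resampled topic
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word

def top_words(topic_word, n=3):
    """Most frequent words per topic, ignoring words with zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

# Made-up Tweets for one week, for illustration only.
week_tweets = [
    "Need a mask before going out #covid-19",
    "Where can I buy a mask? coronavirus",
    "Please help my family during the pandemic",
    "Food bank needs help this week",
]
docs = [tokenize(t) for t in week_tweets]
model = lda_gibbs(docs, n_topics=2)
topics = top_words(model)
```

Running the same steps once per weekly collection would give one set of topics per week, which matches the weekly structure described above.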
Week 7: (7.6.20 - 7.12.20)
Word clouds
After LDA processing, word frequency was taken into account. This entailed iterating through the text of the Twitter data and calculating word frequency to
create word clouds for each week of the data. Like the previous LDA process, this word frequency calculation disregarded "grab" words.
After plotting the number of geo-tagged Tweets per week against the number of cases in the US, there seemed to be an inverse relationship, meaning
that where there were spikes in cases, there were dips in the number of Tweets. This may indicate a lag in public response to the number of cases.
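The weekly word frequency step might look something like the sketch below. It uses only the Python standard library. The function name `word_frequencies`, the `tweets_by_week` layout, and the sample Tweets are made up for illustration, and the log does not say which tool drew the word clouds themselves.

```python
import re
from collections import Counter

# "Grab" words from the log, lowercased.
GRAB_WORDS = {"corona", "coronavirus", "covid", "pandemic", "sarscov2",
              "ncov", "covid-19", "ncov2019", "2019ncov"}

def word_frequencies(texts):
    """Count how often each word appears across the texts, skipping grab words."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9][a-z0-9-]*", text.lower())
        counts.update(t for t in tokens if t not in GRAB_WORDS)
    return counts

# Made-up Tweets grouped by week, for illustration only.
tweets_by_week = {
    "week 1": ["Need a mask for work #covid", "Please help, no mask left"],
    "week 2": ["Coronavirus testing help line is busy"],
}
frequencies = {week: word_frequencies(texts)
               for week, texts in tweets_by_week.items()}
```

Each weekly `Counter` maps words to counts, which is the kind of input a word cloud tool that accepts word frequencies would need.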
Week 8: (7.13.20 - 7.19.20)
After performing LDA and word frequency analysis, I collected data on state mandates. Data sets were downloaded, along with transcripts from
state governors, to create a large database of mandates and the dates they were issued in each state. These mandates included:
Masks Usage and Enforcement
School Closures
Stay-At-Home orders
Non-Essential Business Closures
Restaurant Closures
Bar Closures
These state mandates were then put through a grading system that I created, based on the risk of person-to-person transmission.
Masks Usage and Enforcement
  No mandates issued
  Mask mandate
  Mask mandate enforced
School Closures
  No mandates issued
  Closed K-12s
  Closed Day cares
  Reopened Day cares
Stay-At-Home orders
  No Mandates issued
  Stay-At-Home mandate
  Ended or Relaxed mandate
Non-Essential Business Closures
  No Mandates issued
  Closure of all non-essential business
  Re-open with no masks, employees or otherwise
  Re-open with masks, employees only
  Re-open with masks, employees and public
  Re-open with masks, employees and public enforced
Restaurant Closures
  No Mandates issued
  Closure of food establishments, except take-out
  Re-open with no masks, employees or otherwise
  Re-open with masks, employees only
  Re-open with masks, employees and public
  Re-open with masks, employees and public enforced
Bar Closures
  No Mandates issued
  Closure of all non-essential business or re-closure
  Re-open with no masks, employees or otherwise
  Re-open with masks, employees only
  Re-open with masks, employees and public
  Re-open with masks, employees and public enforced
States with higher grades had more mandates to combat person-to-person transmission.
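The mechanics of such a grading system could be sketched as a lookup table plus a sum. The log does not record the numeric grade given to each level, so every number below is a placeholder chosen only to show the mechanism, and only two of the six categories are included. The names `GRADES` and `state_grade` are also made up.

```python
# Placeholder grades: the numbers are NOT from the log, only the
# category and level names are.
GRADES = {
    "Masks Usage and Enforcement": {
        "No mandates issued": 0,
        "Mask mandate": 1,
        "Mask mandate enforced": 2,
    },
    "Stay-At-Home orders": {
        "No Mandates issued": 0,
        "Ended or Relaxed mandate": 1,
        "Stay-At-Home mandate": 2,
    },
}

def state_grade(mandates):
    """Sum the grade of the level in effect for each mandate category.

    mandates: dict mapping a category name to the level in effect.
    """
    return sum(GRADES[category][level] for category, level in mandates.items())

# A made-up state, for illustration only.
example_state = {
    "Masks Usage and Enforcement": "Mask mandate enforced",
    "Stay-At-Home orders": "Stay-At-Home mandate",
}
grade = state_grade(example_state)
```

With a table like this, a state with more mandates in effect ends up with a higher total grade.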
Week 9: (7.20.20 - 7.24.20)
Week 9 entailed creating a risk calculator for individuals in counties. Risk was calculated by taking the weekly count of cases seen in each county,
dividing by the population of the county, and multiplying by 100. These calculations were then put into visualization software.
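The calculation above is simple enough to sketch in a few lines. The function name `weekly_risk` and the sample numbers are made up for illustration, and the check for a non-positive population is an added safeguard that the log does not mention.

```python
def weekly_risk(weekly_cases, population):
    """Weekly cases as a percentage of the county population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return weekly_cases / population * 100

# Made-up county: 250 cases in one week, population of 1,000.
risk = weekly_risk(250, 1000)
```

The result is the percentage of the county's population that appeared as a new case in that week.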