We used the following datasets in our analysis: education and employment, county populations, income, presidential election results, cartographic shapefiles, and finally a large corpus of scraped tweets. Aside from the Twitter scrape, provided by Professor Caglar, and the election results, all of the data came from U.S. government agencies. For our data-driven analysis, we sought to paint in broad strokes the tweets worldwide, the U.S. reactions, and how county attributes might be correlated with those reactions.
Caglar's scrape taps into Twitter's stream, keeping only those tweets with hashtags such as #immigration, #ban, and #BuildTheWall. The tweets were selected to gauge reactionary sentiment on immigration. Unfortunately, the data had to be filtered down severely to comply with GitHub's restrictions. As such, this should be considered an end-user application rather than an insight-heavy prototype.
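The hashtag filter described above can be sketched as follows. This is a minimal illustration, not the code used for the original scrape: the function names and the `text` key are hypothetical, and because the source list of hashtags ends in "etc.", only the three tags named in the text are included here.

```python
import re

# Hashtags named in the text. The original list ends with "etc.", so the
# full set used for the scrape is unknown; this set is an assumption.
TRACKED_HASHTAGS = {"immigration", "ban", "buildthewall"}

# A hashtag is '#' followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the lowercased hashtags (without '#') found in a tweet's text."""
    return {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}


def matches_tracked_hashtags(text, tracked=TRACKED_HASHTAGS):
    """Return True if the text contains at least one tracked hashtag."""
    return not extract_hashtags(text).isdisjoint(tracked)


def filter_tweets(tweets, tracked=TRACKED_HASHTAGS):
    """Keep only tweets (dicts with a hypothetical 'text' key) that match."""
    return [t for t in tweets if matches_tracked_hashtags(t.get("text", ""), tracked)]
```

Matching whole hashtags rather than substrings avoids false positives such as "#bananas" or the plain word "banned" being counted as #ban.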
On January 27, 2017, President Trump signed Executive Order 13769. People both at home and abroad were shocked. This map is a 10,000-foot view of how people reacted to the order. Notice the sharp spike?
The geospatial data used for this project was obtained from the United States Census Bureau. The cartographic boundary shapefiles were created in 2016. We used a shapefile of U.S. counties as well as a shapefile of the 50 states.
To establish a baseline understanding of each county's makeup, we gathered general population data about U.S. counties produced by the United States Census Bureau. We investigated population characteristics via the American FactFinder website. To constrain our problem scope, we opted to focus only on population and racial breakdown. The race categories were White, Black, American Indian, Asian, Pacific Islander, and Multi Race, alongside the total county population. All the population data came from the 2015 ACS 1-year estimates.
Both the education and employment data were provided by the United States Department of Agriculture's Economic Research Service on ers.usda.gov. For employment, we obtained the following 2015 county-level attributes: total labor force population, total employed population, total unemployed population, unemployment rate, and median household income (U.S. dollars).
To evaluate counties' education attributes, we gathered the percent of the population with less than a high school diploma, with only a high school diploma, with some college experience, and with a bachelor's degree or higher. The education attributes are reported as averages over 2011-2015.
We obtained the 2016 U.S. Presidential Election data from precompiled sources (thanks @tonmcg!). The following attributes were obtained and used: total Democratic votes, total Republican votes, and total votes by county. We calculated the county-level Democratic and Republican vote shares ourselves. Finally, we obtained the vote differential between the leading party and the runner-up, along with the percentage-point differential. The raw data came from TownHall.com and was then cleaned and made accessible on GitHub.
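As a rough illustration of the county-level arithmetic described above, the sketch below computes the two vote shares and the differentials from raw counts. The function and key names are hypothetical, and it assumes the leading party and the runner-up are always the Democratic and Republican candidates. In our data the differentials were supplied by the source rather than computed, so this only shows how such values could be derived.

```python
def county_vote_summary(votes_dem, votes_gop, total_votes):
    """Compute vote shares and differentials for a single county.

    Assumes the top two finishers are the Democratic and Republican
    candidates; third-party votes only affect the denominator.
    """
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")

    share_dem = votes_dem / total_votes
    share_gop = votes_gop / total_votes

    return {
        "share_dem": share_dem,
        "share_gop": share_gop,
        # Raw vote gap between the leader and the runner-up.
        "vote_diff": abs(votes_dem - votes_gop),
        # Gap between the two shares, expressed in percentage points.
        "pct_point_diff": abs(share_dem - share_gop) * 100,
    }
```

For example, a county with 250 Democratic votes and 750 Republican votes out of 1,000 total has shares of 0.25 and 0.75, a vote differential of 500, and a 50-percentage-point differential.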