Mapping homelessness in England

Introduction

For this blog post, I wanted a dataset covering an issue I feel quite strongly about: homelessness. I found a fairly large one on the Cambridgeshire Insight website.

For a while I’ve wanted to try out R’s mapping potential and hopefully generate a heatmap, so I deliberately looked for a dataset where I could do this. It’s worth saying that this has been by far the most difficult and frustrating project I’ve taken on. It took me six or seven sessions to produce this post, the first of which I spent trying to install gganimate (which I ended up not using) and figuring out where to start with mapping.

Data wrangling

Let’s load the required packages and read the data in:

library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.3.0
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(gifski)
library(sf)
## Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3
data <- read_csv("https://data.cambridgeshireinsight.org.uk/sites/default/files/P1E-national-homelessness-CLG-tab784-to-201718-csv.csv")
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   `ONS code` = col_character(),
##   `Local authority area` = col_character(),
##   Region = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are White` = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are Black or Black British` = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are Asian or Asian British` = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are Mixed` = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are Other ethnic origin` = col_character(),
##   `2009/10: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated` = col_character(),
##   `2009/10: Total decisions where eligible homeless & in priority need but intentionally` = col_character(),
##   `2009/10: Total decisions where eligible & homeless but not in priority need` = col_character(),
##   `2009/10: Total decisions where eligible but not homeless` = col_character(),
##   `2009/10: Total homelessness decisions` = col_character(),
##   `31 March 2010: Total households in B&B (including shared annex)` = col_character(),
##   `31 March 2010: Total households in hostels` = col_character(),
##   `31 March 2010: Total households in LA/HA stock` = col_character(),
##   `31 March 2010: Total households in private sector leased (by LA or HA)` = col_character(),
##   `31 March 2010: Total households in other temp (including private landlord)` = col_character(),
##   `2015/16: Numbers accepted as homeless and in priority need who are White` = col_character(),
##   `2015/16: Numbers accepted as homeless and in priority need who are Black or Black British` = col_character()
##   # ... with 58 more columns
## )
## See spec(...) for full column specifications.
names(data)
##   [1] "ONS code"                                                                                  
##   [2] "Local authority area"                                                                      
##   [3] "Region"                                                                                    
##   [4] "2009/10: Thousands of households 2006 mid-year estimate"                                   
##   [5] "2009/10: Numbers accepted as homeless and in priority need who are White"                  
##   [6] "2009/10: Numbers accepted as homeless and in priority need who are Black or Black British" 
##   [7] "2009/10: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
##   [8] "2009/10: Numbers accepted as homeless and in priority need who are Mixed"                  
##   [9] "2009/10: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
##  [10] "2009/10: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
##  [11] "2009/10: Numbers accepted as homeless and in priority need total"                          
##  [12] "2009/10: Number accepted per 1000 households"                                              
##  [13] "2009/10: Total decisions where eligible homeless & in priority need but intentionally"     
##  [14] "2009/10: Total decisions where eligible & homeless but not in priority need"               
##  [15] "2009/10: Total decisions where eligible but not homeless"                                  
##  [16] "2009/10: Total homelessness decisions"                                                     
##  [17] "31 March 2010: Total households in B&B (including shared annex)"                           
##  [18] "31 March 2010: Total households in hostels"                                                
##  [19] "31 March 2010: Total households in LA/HA stock"                                            
##  [20] "31 March 2010: Total households in private sector leased (by LA or HA)"                    
##  [21] "31 March 2010: Total households in other temp (including private landlord)"                
##  [22] "31 March 2010: Total households in temporary accommodation"                                
##  [23] "31 March 2010: Number in temp per 1000 households"                                         
##  [24] "2009/10: Duty owed but no accommodation has been secured at end of March 2010"             
##  [25] "2010/11: Thousands of households 2008 mid-year estimate"                                   
##  [26] "2010/11: Numbers accepted as homeless and in priority need who are White"                  
##  [27] "2010/11: Numbers accepted as homeless and in priority need who are Black or Black British" 
##  [28] "2010/11: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
##  [29] "2010/11: Numbers accepted as homeless and in priority need who are Mixed"                  
##  [30] "2010/11: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
##  [31] "2010/11: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
##  [32] "2010/11: Numbers accepted as homeless and in priority need total"                          
##  [33] "2010/11: Number accepted per 1000 households"                                              
##  [34] "2010/11: Total decisions where eligible homeless & in priority need but intentionally"     
##  [35] "2010/11: Total decisions where eligible & homeless but not in priority need"               
##  [36] "2010/11: Total decisions where eligible but not homeless"                                  
##  [37] "2010/11: Total homelessness decisions"                                                     
##  [38] "31 March 2011: Total households in B&B (including shared annex)"                           
##  [39] "31 March 2011: Total households in hostels"                                                
##  [40] "31 March 2011: Total households in LA/HA stock"                                            
##  [41] "31 March 2011: Total households in private sector leased (by LA or HA)"                    
##  [42] "31 March 2011: Total households in other temp (including private landlord)"                
##  [43] "31 March 2011: Total households in temporary accommodation"                                
##  [44] "31 March 2011: Number in temp per 1000 households"                                         
##  [45] "2010/11: Duty owed but no accommodation has been secured at end of March 2011"             
##  [46] "2011/12: Thousands of households 2008 mid-year estimate"                                   
##  [47] "2011/12: Numbers accepted as homeless and in priority need who are White"                  
##  [48] "2011/12: Numbers accepted as homeless and in priority need who are Black or Black British" 
##  [49] "2011/12: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
##  [50] "2011/12: Numbers accepted as homeless and in priority need who are Mixed"                  
##  [51] "2011/12: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
##  [52] "2011/12: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
##  [53] "2011/12: Numbers accepted as homeless and in priority need total"                          
##  [54] "2011/12: Number accepted per 1000 households"                                              
##  [55] "2011/12: Total decisions where eligible homeless & in priority need but intentionally"     
##  [56] "2011/12: Total decisions where eligible & homeless but not in priority need"               
##  [57] "2011/12: Total decisions where eligible but not homeless"                                  
##  [58] "2011/12: Total homelessness decisions"                                                     
##  [59] "31 March 2012: Total households in B&B (including shared annex)"                           
##  [60] "31 March 2012: Total households in hostels"                                                
##  [61] "31 March 2012: Total households in LA/HA stock"                                            
##  [62] "31 March 2012: Total households in private sector leased (by LA or HA)"                    
##  [63] "31 March 2012: Total households in other temp (including private landlord)"                
##  [64] "31 March 2012: Total households in temporary accommodation"                                
##  [65] "31 March 2012: Number in temp per 1000 households"                                         
##  [66] "2011/12: Duty owed but no accommodation has been secured at end of March 2012"             
##  [67] "2012/13: Thousands of households 2008-based interim projections for 2012"                  
##  [68] "2012/13: Numbers accepted as homeless and in priority need who are White"                  
##  [69] "2012/13: Numbers accepted as homeless and in priority need who are Black or Black British" 
##  [70] "2012/13: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
##  [71] "2012/13: Numbers accepted as homeless and in priority need who are Mixed"                  
##  [72] "2012/13: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
##  [73] "2012/13: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
##  [74] "2012/13: Numbers accepted as homeless and in priority need total"                          
##  [75] "2012/13: Number accepted per 1000 households"                                              
##  [76] "2012/13: Total decisions where eligible homeless & in priority need but intentionally"     
##  [77] "2012/13: Total decisions where eligible & homeless but not in priority need"               
##  [78] "2012/13: Total decisions where eligible but not homeless"                                  
##  [79] "2012/13: Total homelessness decisions"                                                     
##  [80] "31 March 2013: Total households in B&B (including shared annex)"                           
##  [81] "31 March 2013: Total households in hostels"                                                
##  [82] "31 March 2013: Total households in LA/HA stock"                                            
##  [83] "31 March 2013: Total households in private sector leased (by LA or HA)"                    
##  [84] "31 March 2013: Total households in other temp (including private landlord)"                
##  [85] "31 March 2013: Total households in temporary accommodation"                                
##  [86] "31 March 2013: Number in temp per 1000 households"                                         
##  [87] "2012/13: Duty owed but no accommodation has been secured at end of March 2013"             
##  [88] "2013/14: Thousands of households 2012-based interim projections for 2013"                  
##  [89] "2013/14: Numbers accepted as homeless and in priority need who are White"                  
##  [90] "2013/14: Numbers accepted as homeless and in priority need who are Black or Black British" 
##  [91] "2013/14: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
##  [92] "2013/14: Numbers accepted as homeless and in priority need who are Mixed"                  
##  [93] "2013/14: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
##  [94] "2013/14: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
##  [95] "2013/14: Numbers accepted as homeless and in priority need total"                          
##  [96] "2013/14: Number accepted per 1000 households"                                              
##  [97] "2013/14: Total decisions where eligible homeless & in priority need but intentionally"     
##  [98] "2013/14: Total decisions where eligible & homeless but not in priority need"               
##  [99] "2013/14: Total decisions where eligible but not homeless"                                  
## [100] "2013/14: Total homelessness decisions"                                                     
## [101] "31 March 2014: Total households in B&B (including shared annex)"                           
## [102] "31 March 2014: Total households in hostels"                                                
## [103] "31 March 2014: Total households in LA/HA stock"                                            
## [104] "31 March 2014: Total households in private sector leased (by LA or HA)"                    
## [105] "31 March 2014: Total households in other temp (including private landlord)"                
## [106] "31 March 2014: Total households in temporary accommodation"                                
## [107] "31 March 2014: Number in temp per 1000 households"                                         
## [108] "2013/14: Duty owed but no accommodation has been secured at end of March 2014"             
## [109] "2014/15: Thousands of households 2012-based interim projections for 2014"                  
## [110] "2014/15: Numbers accepted as homeless and in priority need who are White"                  
## [111] "2014/15: Numbers accepted as homeless and in priority need who are Black or Black British" 
## [112] "2014/15: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
## [113] "2014/15: Numbers accepted as homeless and in priority need who are Mixed"                  
## [114] "2014/15: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
## [115] "2014/15: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
## [116] "2014/15: Numbers accepted as homeless and in priority need total"                          
## [117] "2014/15: Number accepted per 1000 households"                                              
## [118] "2014/15: Total decisions where eligible homeless & in priority need but intentionally"     
## [119] "2014/15: Total decisions where eligible & homeless but not in priority need"               
## [120] "2014/15: Total decisions where eligible but not homeless"                                  
## [121] "2014/15: Total homelessness decisions"                                                     
## [122] "31 March 2015: Total households in B&B (including shared annex)"                           
## [123] "31 March 2015: Total households in hostels"                                                
## [124] "31 March 2015: Total households in LA/HA stock"                                            
## [125] "31 March 2015: Total households in private sector leased (by LA or HA)"                    
## [126] "31 March 2015: Total households in other temp (including private landlord)"                
## [127] "31 March 2015: Total households in temporary accommodation"                                
## [128] "31 March 2015: Number in temp per 1000 households"                                         
## [129] "2014/15: Duty owed but no accommodation has been secured at end of March 2015"             
## [130] "2015/16: Thousands of households 2012-based interim projections for 2015"                  
## [131] "2015/16: Numbers accepted as homeless and in priority need who are White"                  
## [132] "2015/16: Numbers accepted as homeless and in priority need who are Black or Black British" 
## [133] "2015/16: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
## [134] "2015/16: Numbers accepted as homeless and in priority need who are Mixed"                  
## [135] "2015/16: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
## [136] "2015/16: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
## [137] "2015/16: Numbers accepted as homeless and in priority need total"                          
## [138] "2015/16: Number accepted per 1000 households"                                              
## [139] "2015/16: Total decisions where eligible homeless & in priority need but intentionally"     
## [140] "2015/16: Total decisions where eligible & homeless but not in priority need"               
## [141] "2015/16: Total decisions where eligible but not homeless"                                  
## [142] "2015/16: Total homelessness decisions"                                                     
## [143] "31 March 2016: Total households in B&B (including shared annex)"                           
## [144] "31 March 2016: Total households in hostels"                                                
## [145] "31 March 2016: Total households in LA/HA stock"                                            
## [146] "31 March 2016: Total households in private sector leased (by LA or HA)"                    
## [147] "31 March 2016: Total households in other temp (including private landlord)"                
## [148] "31 March 2016: Total households in temporary accommodation"                                
## [149] "31 March 2016: Number in temp per 1000 households"                                         
## [150] "2015/16: Duty owed but no accommodation has been secured at end of March 2015"             
## [151] "2016/17: Thousands of households 2012-based interim projections for 2016"                  
## [152] "2016/17: Numbers accepted as homeless and in priority need who are White"                  
## [153] "2016/17: Numbers accepted as homeless and in priority need who are Black or Black British" 
## [154] "2016/17: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
## [155] "2016/17: Numbers accepted as homeless and in priority need who are Mixed"                  
## [156] "2016/17: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
## [157] "2016/17: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
## [158] "2016/17: Numbers accepted as homeless and in priority need total"                          
## [159] "2016/17: Number accepted per 1000 households"                                              
## [160] "2016/17: Total decisions where eligible homeless & in priority need but intentionally"     
## [161] "2016/17: Total decisions where eligible & homeless but not in priority need"               
## [162] "2016/17: Total decisions where eligible but not homeless"                                  
## [163] "2016/17: Total homelessness decisions"                                                     
## [164] "31 March 2017: Total households in B&B (including shared annex)"                           
## [165] "31 March 2017: Total households in hostels"                                                
## [166] "31 March 2017: Total households in LA/HA stock"                                            
## [167] "31 March 2017: Total households in private sector leased (by LA or HA)"                    
## [168] "31 March 2017: Total households in other temp (including private landlord)"                
## [169] "31 March 2017: Total households in temporary accommodation"                                
## [170] "31 March 2017: Number in temp per 1000 households"                                         
## [171] "2016/17: Duty owed but no accommodation has been secured at end of March 2017"             
## [172] "2017/18: Thousands of households 2012-based interim projections for 2017"                  
## [173] "2017/18: Numbers accepted as homeless and in priority need who are White"                  
## [174] "2017/18: Numbers accepted as homeless and in priority need who are Black or Black British" 
## [175] "2017/18: Numbers accepted as homeless and in priority need who are Asian or Asian British" 
## [176] "2017/18: Numbers accepted as homeless and in priority need who are Mixed"                  
## [177] "2017/18: Numbers accepted as homeless and in priority need who are Other ethnic origin"    
## [178] "2017/18: Numbers accepted as homeless and in priority need who are Ethnic Group not Stated"
## [179] "2017/18: Numbers accepted as homeless and in priority need total"                          
## [180] "2017/18: Number accepted per 1000 households"                                              
## [181] "2017/18: Total decisions where eligible homeless & in priority need but intentionally"     
## [182] "2017/18: Total decisions where eligible & homeless but not in priority need"               
## [183] "2017/18: Total decisions where eligible but not homeless"                                  
## [184] "2017/18: Total homelessness decisions"                                                     
## [185] "31 March 2018: Total households in B&B (including shared annex)"                           
## [186] "31 March 2018: Total households in hostels"                                                
## [187] "31 March 2018: Total households in LA/HA stock"                                            
## [188] "31 March 2018: Total households in private sector leased (by LA or HA)"                    
## [189] "31 March 2018: Total households in other temp (including private landlord)"                
## [190] "31 March 2018: Total households in temporary accommodation"                                
## [191] "31 March 2018: Number in temp per 1000 households"                                         
## [192] "2017/18: Duty owed but no accommodation has been secured at end of March 2018"
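
That’s a lot of columns to scan by eye. As a rough sketch (using a toy subset of four names copied from the listing above, not the full data), one way to see which measures repeat across years is to strip the period prefix and count what’s left. The names cols and measures are just illustrative:

```r
library(stringr)

# Four column names copied from the listing above (a toy subset, not the full data).
cols <- c(
  "2009/10: Numbers accepted as homeless and in priority need total",
  "31 March 2010: Total households in hostels",
  "2010/11: Numbers accepted as homeless and in priority need total",
  "31 March 2011: Total households in hostels"
)

# Drop everything up to and including the first ": " to leave just the measure.
measures <- str_remove(cols, "^[^:]+: ")

# Count how many periods each measure appears in.
table(measures)
```

On the real data this would be run on names(data) instead of cols. The household estimate columns are worded differently from year to year, so they wouldn’t group together without extra cleaning.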

The first thing to do is to home in on the data I’d like to use. From a quick scan of the columns, “Local authority area” looks critical, and I’d like to see whether I have yearly data for “Numbers accepted as homeless and in priority need total”:

ind <- str_detect(names(data), "priority need total")
names(data)[ind]
## [1] "2009/10: Numbers accepted as homeless and in priority need total"
## [2] "2010/11: Numbers accepted as homeless and in priority need total"
## [3] "2011/12: Numbers accepted as homeless and in priority need total"
## [4] "2012/13: Numbers accepted as homeless and in priority need total"
## [5] "2013/14: Numbers accepted as homeless and in priority need total"
## [6] "2014/15: Numbers accepted as homeless and in priority need total"
## [7] "2015/16: Numbers accepted as homeless and in priority need total"
## [8] "2016/17: Numbers accepted as homeless and in priority need total"
## [9] "2017/18: Numbers accepted as homeless and in priority need total"
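
As an aside, the same selection could probably be written in one step with the contains() helper inside select(), instead of building a logical index first. This is a minimal sketch on a made-up two-row tibble: the counts for Adur and Allerdale are copied from the data, while the “Total homelessness decisions” values are placeholders I invented:

```r
library(dplyr)

# Toy tibble: the two "priority need total" columns hold real counts for Adur
# and Allerdale; the "Total homelessness decisions" values are placeholders.
toy <- tibble(
  `Local authority area` = c("Adur", "Allerdale"),
  `2009/10: Numbers accepted as homeless and in priority need total` = c(71, 102),
  `2009/10: Total homelessness decisions` = c("100", "200"),
  `2010/11: Numbers accepted as homeless and in priority need total` = c(90, 104)
)

# Keep the area name plus every column whose name contains the phrase.
toy_trim <- toy %>%
  select(`Local authority area`, contains("priority need total"))

names(toy_trim)
```

contains() matches literal text and ignores case by default, so on this dataset it should pick out the same nine columns as the str_detect() index.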

These nine yearly columns fit the bill. Now that I’ve homed in on the columns I need, let’s have a look at the structure and distribution of the data:

data_trim <- data %>% select(2, names(data)[ind])

str(data_trim, give.attr = FALSE)
## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 65447 obs. of  10 variables:
##  $ Local authority area                                            : chr  "ENGLAND" "Adur" "Allerdale" "Amber Valley" ...
##  $ 2009/10: Numbers accepted as homeless and in priority need total: num  40020 71 102 30 52 ...
##  $ 2010/11: Numbers accepted as homeless and in priority need total: num  44160 90 104 46 79 ...
##  $ 2011/12: Numbers accepted as homeless and in priority need total: num  50290 58 63 53 100 ...
##  $ 2012/13: Numbers accepted as homeless and in priority need total: num  53770 37 41 61 129 ...
##  $ 2013/14: Numbers accepted as homeless and in priority need total: num  52290 10 26 64 109 ...
##  $ 2014/15: Numbers accepted as homeless and in priority need total: num  54430 7 30 117 191 ...
##  $ 2015/16: Numbers accepted as homeless and in priority need total: chr  "57740" "16" "32" "101" ...
##  $ 2016/17: Numbers accepted as homeless and in priority need total: chr  "59110" "31" "17" "81" ...
##  $ 2017/18: Numbers accepted as homeless and in priority need total: chr  "56580" "38" "22" "76" ...
summary(data_trim)
##  Local authority area
##  Length:65447        
##  Class :character    
##  Mode  :character    
##                      
##                      
##                      
##                      
##  2009/10: Numbers accepted as homeless and in priority need total
##  Min.   :    1.0                                                 
##  1st Qu.:   30.0                                                 
##  Median :   63.0                                                 
##  Mean   :  244.8                                                 
##  3rd Qu.:  136.0                                                 
##  Max.   :40020.0                                                 
##  NA's   :65120                                                   
##  2010/11: Numbers accepted as homeless and in priority need total
##  Min.   :    1.0                                                 
##  1st Qu.:   36.5                                                 
##  Median :   73.0                                                 
##  Mean   :  270.1                                                 
##  3rd Qu.:  149.0                                                 
##  Max.   :44160.0                                                 
##  NA's   :65120                                                   
##  2011/12: Numbers accepted as homeless and in priority need total
##  Min.   :    0.0                                                 
##  1st Qu.:   41.0                                                 
##  Median :   85.0                                                 
##  Mean   :  307.6                                                 
##  3rd Qu.:  168.0                                                 
##  Max.   :50290.0                                                 
##  NA's   :65120                                                   
##  2012/13: Numbers accepted as homeless and in priority need total
##  Min.   :    0.0                                                 
##  1st Qu.:   38.0                                                 
##  Median :   78.0                                                 
##  Mean   :  326.4                                                 
##  3rd Qu.:  178.5                                                 
##  Max.   :53770.0                                                 
##  NA's   :65120                                                   
##  2013/14: Numbers accepted as homeless and in priority need total
##  Min.   :    0.0                                                 
##  1st Qu.:   38.5                                                 
##  Median :   82.0                                                 
##  Mean   :  319.8                                                 
##  3rd Qu.:  174.5                                                 
##  Max.   :52290.0                                                 
##  NA's   :65120                                                   
##  2014/15: Numbers accepted as homeless and in priority need total
##  Min.   :    0.0                                                 
##  1st Qu.:   39.0                                                 
##  Median :   87.0                                                 
##  Mean   :  332.9                                                 
##  3rd Qu.:  185.0                                                 
##  Max.   :54430.0                                                 
##  NA's   :65120                                                   
##  2015/16: Numbers accepted as homeless and in priority need total
##  Length:65447                                                    
##  Class :character                                                
##  Mode  :character                                                
##                                                                  
##                                                                  
##                                                                  
##                                                                  
##  2016/17: Numbers accepted as homeless and in priority need total
##  Length:65447                                                    
##  Class :character                                                
##  Mode  :character                                                
##                                                                  
##                                                                  
##                                                                  
##                                                                  
##  2017/18: Numbers accepted as homeless and in priority need total
##  Length:65447                                                    
##  Class :character                                                
##  Mode  :character                                                
##                                                                  
##                                                                  
##                                                                  
## 

I can see that apart from the annoyingly long column names, I seem to have the totals for the whole of England in the first row. So let’s fix these issues:

data_trim <- data_trim %>%
                 slice(-1) %>%
                 set_names("LAA", 2009:2017)

head(data_trim, 20)
## # A tibble: 20 x 10
##    LAA       `2009` `2010` `2011` `2012` `2013` `2014` `2015` `2016` `2017`
##    <chr>      <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>  <chr> 
##  1 Adur          71     90     58     37     10      7 16     31     38    
##  2 Allerdale    102    104     63     41     26     30 32     17     22    
##  3 Amber Va…     30     46     53     61     64    117 101    81     76    
##  4 Arun          52     79    100    129    109    191 228    220    204   
##  5 Ashfield      42     25     16     26     85     87 93     100    123   
##  6 Ashford      178    194    161    199    166    152 154    136    160   
##  7 Aylesbur…     93    112    126    133    116    160 177    161    150   
##  8 Babergh       37     46     78    100     86     86 94     61     72    
##  9 Barking …    232    221    199    664    853    764 941    543    512   
## 10 Barnet       232    251    339    595    674    677 422    640    444   
## 11 Barnsley      95     56     38     23     14     13 14     15     41    
## 12 Barrow-i…     40     26     29     29     19     17 18     11     13    
## 13 Basildon     191    232    255    282    302    351 208    180    202   
## 14 Basingst…      1      1      2     11     22     54 46     107    81    
## 15 Bassetlaw     18     27     48     75     41     91 65     84     77    
## 16 Bath and…     68    100     86     86     65     48 68     86     84    
## 17 Bedford …    141    107    211    242    174    164 287    252    224   
## 18 Bexley       128    204    346    349    420    498 483    508    500   
## 19 Birmingh…   3371   4207   3929   3957   3160   3140 3524   3479   3386  
## 20 Blaby          2      7      2      1      0      6 11     17     32

That’s looking a bit better. I notice that some LAAs have a stray “UA” at the end. From the output of the summary() function above, I can also see that the 2015-2017 columns have been parsed as character, so there’s probably a non-numeric value in there somewhere. Let’s see how many places these issues affect:

data_trim %>% filter(str_detect(LAA, " UA")) %>% select(LAA)
## # A tibble: 56 x 1
##    LAA                            
##    <chr>                          
##  1 Bath and North East Somerset UA
##  2 Bedford UA                     
##  3 Blackburn with Darwen UA       
##  4 Blackpool UA                   
##  5 Bournemouth UA                 
##  6 Bracknell Forest UA            
##  7 Brighton and Hove UA           
##  8 Bristol City of UA             
##  9 Central Bedfordshire UA        
## 10 Cheshire East UA               
## # … with 46 more rows
data_trim %>% filter(str_detect(`2015`, "[^0-9]+")) %>% select(LAA, `2015`)
## # A tibble: 5 x 2
##   LAA                `2015`
##   <chr>              <chr> 
## 1 Chorley            -     
## 2 Eden               -     
## 3 Hyndburn           -     
## 4 Isles of Scilly UA -     
## 5 Waverley           -

56 place names ending in “UA” and five places without data in 2015! Let’s update our trimmed data to fix these issues, and make the data tidy by gathering the year headers into their own column:

data_tidy <- data_trim %>%
                mutate(LAA = str_replace(LAA, " UA", "")) %>%
                mutate(`2015` = str_replace(`2015`, "-", NA_character_) %>% as.integer()) %>%
                mutate(`2016` = str_replace(`2016`, "-", NA_character_) %>% as.integer()) %>%
                mutate(`2017` = str_replace(`2017`, "-", NA_character_) %>% as.integer()) %>%
                gather(year, num_homeless, -LAA) %>%
                mutate(year = as.integer(year))

str(data_tidy)
## Classes 'tbl_df', 'tbl' and 'data.frame':    589014 obs. of  3 variables:
##  $ LAA         : chr  "Adur" "Allerdale" "Amber Valley" "Arun" ...
##  $ year        : int  2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 ...
##  $ num_homeless: num  71 102 30 52 42 178 93 37 232 232 ...
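
The three near-identical mutate() calls above could probably be collapsed into one. Below is a minimal sketch of that idea, assuming a recent tidyverse (dplyr 1.0 or later for across(), tidyr 1.0 or later for pivot_longer()), which is newer than the versions loaded at the top of this post. The toy_trim table and its figures are made up for illustration and are not taken from the real dataset:

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy stand-in for data_trim: hypothetical areas and made-up figures
toy_trim <- tibble(
  LAA    = c("Alpha", "Beta UA", "Gamma"),
  `2014` = c(10, 20, 30),
  `2015` = c("12", "-", "33"),
  `2016` = c("-", "25", "35")
)

toy_tidy <- toy_trim %>%
  # Strip the suffix only when it sits at the very end of the name
  mutate(LAA = str_remove(LAA, " UA$")) %>%
  # Turn the "-" placeholder into NA, then convert to integer, in one pass
  mutate(across(c(`2015`, `2016`), ~ as.integer(na_if(.x, "-")))) %>%
  # Gather the year headers into their own column
  pivot_longer(-LAA, names_to = "year", values_to = "num_homeless") %>%
  mutate(year = as.integer(year))

print(toy_tidy)
```

On the real data the column selection would be the 2015 to 2017 columns; gather() would work in place of pivot_longer() on older versions of tidyr.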

Initial analysis

Now I have the data in a more manageable format, let’s quickly plot the top 6 homelessness figures in each year:

data_tidy %>%
  group_by(year) %>%
  arrange(year, desc(num_homeless)) %>% 
  top_n(6) %>%
  ggplot(aes(x = LAA, y = num_homeless)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    facet_wrap(~ year, ncol=2, scales="free_y")
## Selecting by num_homeless

We can see that Birmingham is by far the worst offender. I’m not sure how accurate these figures are, but if they are true they are truly horrifying, and the situation doesn’t seem to have got any better up to 2017. Which areas have seen the most drastic improvement or deterioration over the 8 years?

extremes <- data_tidy %>%
                  drop_na() %>%
                  filter(year %in% c(2009, 2017)) %>%
                  group_by(LAA) %>%
                  mutate(homeless2009 = lag(num_homeless),
                         change = num_homeless - homeless2009) %>% 
                  ungroup() %>%
                  drop_na() %>%
                  arrange(change)

bind_rows(head(extremes, 8), tail(extremes, 8))
## # A tibble: 16 x 5
##    LAA                      year num_homeless homeless2009 change
##    <chr>                   <int>        <dbl>        <dbl>  <dbl>
##  1 Sheffield                2017          481          946   -465
##  2 North Tyneside           2017          179          502   -323
##  3 Tower Hamlets            2017          437          690   -253
##  4 Hillingdon               2017          264          452   -188
##  5 Lambeth                  2017          467          625   -158
##  6 Herefordshire County of  2017           53          201   -148
##  7 Gateshead                2017          219          365   -146
##  8 Leeds                    2017          281          427   -146
##  9 Bexley                   2017          500          128    372
## 10 Wandsworth               2017          822          426    396
## 11 Bristol City of          2017          721          285    436
## 12 Kensington and Chelsea   2017          709          255    454
## 13 Enfield                  2017          786          241    545
## 14 Milton Keynes            2017          679           84    595
## 15 Manchester               2017         1222          482    740
## 16 Newham                   2017         1143           97   1046

Sheffield improved the most, with a reduction of 465, while Newham saw a massive increase of over 1,000.
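
The lag() approach above relies on each area’s 2009 row coming directly before its 2017 row. Here is a sketch of an alternative that doesn’t depend on row order, assuming tidyr 1.0 or later for pivot_wider(). The toy_years table and its figures are made up for illustration:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for data_tidy: hypothetical areas and made-up figures
toy_years <- tibble(
  LAA          = rep(c("Alpha", "Beta", "Gamma"), each = 2),
  year         = rep(c(2009L, 2017L), times = 3),
  num_homeless = c(100, 40, 20, 90, 50, NA)
)

toy_change <- toy_years %>%
  filter(year %in% c(2009, 2017)) %>%
  # One row per area, with one column per year (homeless2009, homeless2017)
  pivot_wider(names_from = year, values_from = num_homeless,
              names_prefix = "homeless") %>%
  # Drop areas that are missing either year
  drop_na() %>%
  mutate(change = homeless2017 - homeless2009) %>%
  arrange(change)

print(toy_change)
```

Because each year becomes its own column, the subtraction is explicit and an area with only one of the two years is dropped rather than silently mis-paired.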

The painful part

Having never done any geospatial analysis or mapping before, I did some Google searches to see if I could find any code I could use. I quickly discovered that to map UK regions I would need some shapefiles.

I managed to download some from the UK Data Service website. I also had enormous trouble getting st_read() to read the data from within this blog post, but I got it working using the here package, which I’ve since heard good things about on Twitter.

shapes <- st_read(dsn = here::here("content/post/data/homelessness/BoundaryData"),
                  layer = "infuse_dist_lyr_2011") %>%
            arrange(name)
## Reading layer `infuse_dist_lyr_2011' from data source `/home/jamie/Documents/R/r-house/content/post/data/homelessness/BoundaryData' using driver `ESRI Shapefile'
## Simple feature collection with 324 features and 5 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 82643.6 ymin: 5333.602 xmax: 655989 ymax: 657599.5
## epsg (SRID):    NA
## proj4string:    +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs
str(shapes)
## Classes 'sf' and 'data.frame':   324 obs. of  6 variables:
##  $ name      : Factor w/ 324 levels "Adur","Allerdale",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ label     : Factor w/ 324 levels "E92000001E06000001",..: 243 64 70 244 195 136 55 220 292 293 ...
##  $ geo_labelw: Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA ...
##  $ geo_label : Factor w/ 324 levels "Adur","Allerdale",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ geo_code  : Factor w/ 324 levels "E06000001","E06000002",..: 243 64 70 244 195 136 55 220 292 293 ...
##  $ geometry  :sfc_MULTIPOLYGON of length 324; first list element: List of 1
##   ..$ :List of 1
##   .. ..$ : num [1:2718, 1:2] 515970 515950 515901 515901 515855 ...
##   ..- attr(*, "class")= chr  "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA
##   ..- attr(*, "names")= chr  "name" "label" "geo_labelw" "geo_label" ...

With the intent of joining my dataframes together, I identified an inconsistency in the areas given in each table (setdiff() is a very handy function!):

n_distinct(data_tidy$LAA)
## [1] 327
n_distinct(shapes$name)
## [1] 324
data_diff <- setdiff(data_tidy$LAA, shapes$name)
shapes_diff <- setdiff(shapes$name, data_tidy$LAA)

tibble(data = data_diff,
       shapes = c(shapes_diff,"","",""))
## # A tibble: 10 x 2
##    data                       shapes                     
##    <chr>                      <chr>                      
##  1 Bristol City of            Bristol, City of           
##  2 City of London             City of London,Westminster 
##  3 Cornwall                   Cornwall,Isles of Scilly   
##  4 Durham                     County Durham              
##  5 Herefordshire County of    Herefordshire, County of   
##  6 Isles of Scilly            Kingston upon Hull, City of
##  7 Kingston upon Hull City of St. Helens                 
##  8 St Helens                  ""                         
##  9 Westminster                ""                         
## 10 <NA>                       ""

You can see from the output above that my homelessness data has split out Westminster from the City of London, and the Isles of Scilly from Cornwall. (The two columns are each listed alphabetically, so the rows don’t line up with each other.) There are also some inconsistencies in punctuation and naming that need to be sorted out. Let’s clean it up by renaming areas and combining rows:

data_final <- data_tidy %>%
              #mutate_at(vars("year", "num_homeless"), as.numeric) %>% 
              mutate(LAA = ifelse(LAA %in% c("City of London","Westminster"),
                                   "City of London,Westminster",
                                   LAA)) %>%
              mutate(LAA = ifelse(LAA %in% c("Cornwall","Isles of Scilly"),
                                   "Cornwall,Isles of Scilly",
                                   LAA)) %>%
              mutate(LAA = ifelse(LAA == "Bristol City of","Bristol, City of",LAA)) %>% 
              mutate(LAA = ifelse(LAA == "Durham","County Durham",LAA)) %>%
              mutate(LAA = ifelse(LAA == "Herefordshire County of","Herefordshire, County of",LAA)) %>%
              mutate(LAA = ifelse(LAA == "Kingston upon Hull City of","Kingston upon Hull, City of",LAA)) %>%
              mutate(LAA = ifelse(LAA == "St Helens","St. Helens",LAA)) %>%
              mutate(LAA = ifelse(LAA == "St. Albans","St Albans",LAA)) %>%
              mutate(LAA = ifelse(LAA == "St. Edmundsbury","St Edmundsbury",LAA)) %>%
              mutate(LAA = as.factor(LAA)) %>%
              group_by(LAA, year) %>% 
              summarise(total_homeless = sum(num_homeless)) %>%
              ungroup()
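
The chain of ifelse() calls above could also be written as a single lookup. This is a minimal sketch of that idea using dplyr’s recode() with a named vector, assuming dplyr 1.0 or later for the .groups argument. The name pairs are a subset of those in the comparison table above, while toy_names and its figures are made up for illustration:

```r
library(dplyr)

# Names as they appear in the homelessness data (left) mapped to the
# names used in the shapefile (right)
name_lookup <- c(
  "City of London"  = "City of London,Westminster",
  "Westminster"     = "City of London,Westminster",
  "Cornwall"        = "Cornwall,Isles of Scilly",
  "Isles of Scilly" = "Cornwall,Isles of Scilly",
  "Durham"          = "County Durham",
  "St Helens"       = "St. Helens"
)

# Toy stand-in for data_tidy: made-up figures
toy_names <- tibble(
  LAA          = c("City of London", "Westminster", "Durham", "Adur"),
  year         = 2017L,
  num_homeless = c(5, 100, 60, 38)
)

toy_final <- toy_names %>%
  # Names missing from the lookup are left unchanged
  mutate(LAA = recode(LAA, !!!name_lookup)) %>%
  group_by(LAA, year) %>%
  # Merged areas now share a name, so their counts are added together
  summarise(total_homeless = sum(num_homeless), .groups = "drop")

print(toy_final)
```

Keeping the pairs in one named vector makes it easier to see at a glance which areas get merged and which are only renamed.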

Next, I created a function to take a year and a set of regions and generate a heatmap. This function filters the homelessness data, joins it with the shape data, and then plots the data. I’ve included regions as an argument so that Birmingham can be filtered out, as it dominates the heatmap.

heatmap <- function(inp_year, regions) {

  # Keep the chosen year and regions, then attach the boundary shapes
  data_joined <- data_final %>%
    filter(year == inp_year) %>%
    filter(LAA %in% regions) %>%
    right_join(shapes, by = c("LAA" = "name"))

  # Use one colour scale across all years so the frames are comparable
  max_scale <- max(data_final %>%
                     filter(LAA %in% regions) %>%
                     select(total_homeless), na.rm = TRUE)

  p <- ggplot() +
    geom_sf(data = data_joined, aes(fill = total_homeless), col = "black") +
    theme_void() + coord_sf(datum = NA) +
    scale_fill_viridis_c(name = NULL, option = "magma",
                         limits = c(0, max_scale),
                         breaks = c(0, max_scale/2, max_scale)) +
    labs(title = paste0("Total number of people accepted as homeless and in priority need in England in ", inp_year),
         caption = "Data obtained from http://opendata.cambridgeshireinsight.org.uk/dataset/homelessness-england")
  print(p)
}

regions_to_include <- unique(setdiff(data_final$LAA, "Birmingham"))

save_gif(walk(min(data_final$year):max(data_final$year), heatmap, regions = regions_to_include), 
         delay = 0.7, gif_file = "animation.gif")
Homelessness heatmap
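
One thing that may be worth knowing for a future attempt: as far as I understand, starting the join from the shapes side keeps the result as an sf object and keeps every boundary on the map, even where there is no matching homelessness figure. Below is a minimal sketch under that assumption, using two made-up unit squares rather than the real boundary data. The square() helper, toy_shapes and toy_counts are all hypothetical names:

```r
library(dplyr)
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0)
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Toy stand-in for shapes: two made-up areas
toy_shapes <- st_sf(
  name     = c("Alpha", "Beta"),
  geometry = st_sfc(square(0), square(1))
)

# Toy stand-in for data_final, with no figure for "Beta"
toy_counts <- tibble(LAA = "Alpha", total_homeless = 42)

# The sf object is on the left, so the result stays an sf object
toy_joined <- left_join(toy_shapes, toy_counts, by = c("name" = "LAA"))

print(class(toy_joined))
print(toy_joined)
```

Areas without a figure end up with NA in total_homeless, which geom_sf() draws in the scale’s missing-value colour.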


I certainly feel this project has been a bit of a hack job. It’s taken me over a month to write because it’s been so challenging and I’ve had to leave it and come back to it so many times. I’m not proud of it, mainly because I rushed the end just to get it done.

I’ve since used Tableau, which seems to make heatmaps a bit easier. If I were to do it again in R, however, I think I’d take the courses on DataCamp first!
