Watch Out What's About?

Project Info

Project Description


  • Our project is a service for Australians that draws on the stories they tell through social posts to make people more aware of the issues they face. It aims to improve the safety and peace of mind of people in and around our great, vibrant urban, regional and open spaces.

  • By joining BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned, and processing these posts for NLP semantic and entity understanding, we aim to predict and offer insight to users of our app. Predicted crime trends for the locations around them help users understand the risks they might face. This is particularly important given the #MeToo movement and the tragic circumstances experienced by the likes of Jill Meagher and Eurydice Dixon, as well as the alcohol-related “king-hit” punch crimes such as the one that ended in the tragic death of Daniel Christie.

  • With our passion for data at the heart of decision making, we hope to make Australian society a better place to live, love and enjoy.

  • This would also help direct police resources to patrol areas of predicted high crime, based on our modelling calculations, through a simple-to-use, easy-to-access smartphone app. Our geolocation mapping layers also blend in the locations of pubs and restaurants to weight crime incidence for factors such as alcohol consumption. The AIHW data adds the ability to incorporate trends and predictions in alcohol consumption (particularly the increase in wine consumption in pubs and restaurants). In future, we aim to use neural network image recognition on social images (e.g. Instagram) to dynamically model risk factors around the app user.

  • The data we have brought together is difficult to join and blend in a meaningful, easy-to-use way. By starting from the audience benefits of safety information, spatial awareness and the ability to make data-aware choices, the app improves people's ability to enjoy their lives, with knowledge of past and predicted future safety levels in and around where they live, socialise or commute.

  • The Data @ Heart team has particularly enjoyed using our knowledge to make this blended, weighted decision tool for all members of our community.
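
The venue weighting described in the list above could be sketched roughly as follows. This is a minimal illustration in base R, not the team's actual calculation: the function names `haversine_km` and `venue_weight`, the exponential decay, the `scale_km` parameter, the venue coordinates and the incident count are all assumptions made for the example.

```r
# Great-circle distance in kilometres between points given in decimal degrees.
haversine_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Hypothetical weighting: each venue adds exp(-distance / scale_km) to a
# multiplier, so nearby venues count for more than distant ones.
venue_weight <- function(user_lat, user_lon, venues, scale_km = 0.5) {
  d <- haversine_km(user_lat, user_lon, venues$lat, venues$lon)
  1 + sum(exp(-d / scale_km))
}

# Made-up venue coordinates, for illustration only.
venues <- data.frame(
  name = c("Venue A", "Venue B"),
  lat  = c(-33.8700, -33.9000),
  lon  = c(151.2070, 151.2500)
)

base_incidents <- 30  # illustrative incident count for the user's postcode
risk <- base_incidents * venue_weight(-33.8700, 151.2070, venues)
```

The same shape of function could take train stations or transport hubs instead of venues; the decay scale would need tuning against real incident data.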

https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/CrimeByPostcode1995-2017WithSocialPostsIncidences?publish=yes

https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/CrimeByPostcode1995-2017Categories?publish=yes

https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/SocialPostsofCommunityReportsPotentialWarnings?publish=yes


Data Story


  • By joining BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned, and processing these posts for NLP semantic and entity understanding, we aim to predict and offer insight to users of our app. The crime data is split into 17 major categories, which would work as user personalisation filters to tailor the experience to individual concerns.

  • We first gathered NSW police crime reports into a text corpus to train topic vectors. These vectors are then matched by closeness value to identify relevant, geolocation-enabled social posts mapped to the area you are currently in.

  • We identified recent growth in transport-related crimes in the 22 years of BOCSAR data. We therefore plan to build into the distance weighting of the risk value at the user's current location a further factor for key train stations and transport hubs where crime incidents have occurred.

  • The Data @ Heart team has particularly enjoyed using our knowledge to make this blended, weighted decision tool for all members of our community.
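
The closeness matching between police reports and social posts mentioned in the list above can be illustrated with a much simpler stand-in. The sketch below swaps the trained topic vectors for plain term-frequency vectors and cosine similarity, so it shows only the general idea, not the team's model. The helper names (`tokenise`, `term_vector`, `cosine`), the example texts and the 0.5 cut-off are all assumptions.

```r
# Split lower-cased text into word tokens.
tokenise <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

# Term-frequency vector of a text over a fixed vocabulary;
# words outside the vocabulary are ignored.
term_vector <- function(text, vocab) {
  as.numeric(table(factor(tokenise(text), levels = vocab)))
}

# Cosine similarity; returns 0 when either vector is all zeros.
cosine <- function(a, b) {
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) 0 else sum(a * b) / denom
}

# Invented example texts, not real reports or posts.
reports <- c("assault reported near the train station late at night",
             "vehicle stolen from car park near station")
posts <- c("someone was assaulted near the station last night",
           "great coffee this morning")

# Build the vocabulary and one reference vector from the report corpus.
vocab <- unique(unlist(lapply(reports, tokenise)))
reference <- term_vector(paste(reports, collapse = " "), vocab)

# Score every post against the reference and keep the close ones.
closeness <- vapply(posts, function(p) cosine(term_vector(p, vocab), reference),
                    numeric(1), USE.NAMES = FALSE)
relevant <- posts[closeness > 0.5]  # the 0.5 cut-off is arbitrary
```

A topic model would replace `term_vector` with per-document topic proportions, but the scoring and filtering steps would keep the same form.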

Structural Topic Model Project - Crime Data

Sarah Fawcett + Tony Nguy

07/09/2018

library(stm)
library(igraph)
library(stmCorrViz)
library(tidyverse)
library(dplyr)
library(stringr)
library(tidytext)
library(car)
library(reshape2)
library(lubridate)
library(ggpmisc)

Set working directory

setwd("~/Documents/DATA-SCIENCE/GOVHACK/DATA")

Clean Out Old Objects

# Remove every object from the workspace, including Crime9517_wide from any earlier run
rm(list = ls())

1: INGEST (PROTOTYPE)

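Crime9517_wide <- read.csv("CrimePostcodeData1995_2017.csv", header = TRUE)

Convert to Long Format

# One row per postcode, offence and period; the period label lands in "variable", the count in "value"
Crime9517_long <- melt(Crime9517_wide, id.vars = c("Postcode", "Offence.category", "Subcategory"))

# Second long-format copy, kept unmodified for the CSV export at the end
Crime9517_long_date <- melt(Crime9517_wide, id.vars = c("Postcode", "Offence.category", "Subcategory"))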
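
To show what the wide-to-long step produces, here is a small sketch on a made-up data frame. The column names `Jan.1995` and `Feb.1995` and all values are assumptions for illustration; the real BOCSAR export may label its period columns differently.

```r
library(reshape2)

# Toy wide-format table: one row per postcode and offence, one column per month.
wide <- data.frame(
  Postcode = c(2000, 2010),
  Offence.category = c("Theft", "Assault"),
  Subcategory = c("Motor vehicle theft", "Non-domestic assault"),
  Jan.1995 = c(12, 3),
  Feb.1995 = c(9, 5)
)

# melt() stacks the month columns: their names go into "variable" (a factor)
# and their counts into "value".
long <- melt(wide, id.vars = c("Postcode", "Offence.category", "Subcategory"))
```

The two rows and two month columns become four rows, each carrying the three identifier columns plus `variable` and `value`.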

2: PREDICT

Test of factor

class(Crime9517_long$variable)

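Convert Data to Date If Necessary

# mdy() expects month-day-year text; assign the result so the conversion is kept
Crime9517_long$variable <- mdy(as.character(Crime9517_long$variable))

# Store the date as days since 1970-01-01 so the model and predict() can take a numeric value
Crime9517_long$variable <- as.numeric(Crime9517_long$variable)

Convert Postcode, Subcategory and Value to Numeric

class(Crime9517_long$Postcode)

# Encode Subcategory as integer codes so it can be supplied as a number at prediction time
Crime9517_long$Subcategory <- as.numeric(as.factor(Crime9517_long$Subcategory))

# Go through character first so a factor yields the postcode itself, not its level index
Crime9517_long$Postcode <- as.numeric(as.character(Crime9517_long$Postcode))

Crime9517_long$value <- as.numeric(Crime9517_long$value)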
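
If the period labels are month-and-year names rather than full dates, a small parser may be needed before the date conversion. The sketch below assumes labels of the form `Jan.1995`, which is a guess about how `read.csv` renders the column names and not something confirmed by the data. `parse_month_label` is a hypothetical helper written in base R.

```r
# Convert labels such as "Jan.1995" to the first day of that month.
# The label format is an assumption; adjust the split for the real column names.
parse_month_label <- function(label) {
  parts <- strsplit(as.character(label), ".", fixed = TRUE)
  month <- match(vapply(parts, function(p) p[1], character(1)), month.abb)
  year <- as.integer(vapply(parts, function(p) p[2], character(1)))
  as.Date(ISOdate(year, month, 1))
}

labels <- factor(c("Jan.1995", "Feb.1995", "Dec.2017"))
dates <- parse_month_label(labels)

# Days since 1970-01-01, usable as a numeric predictor in a model.
days <- as.numeric(dates)
```

Matching against the built-in `month.abb` constant keeps the parsing independent of the session locale.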

Generalised Linear Regression Model

# Gaussian GLM (the glm() default family) with Postcode as the response;
# passing data = lets predict() accept new observations
model.glm.crime <- glm(Postcode ~ value + Subcategory + variable, data = Crime9517_long)

Summary

summary(model.glm.crime)

Predict

# predict.glm() has no interval argument; se.fit = TRUE returns standard errors instead
predict(model.glm.crime, newdata = data.frame(value = 30, Subcategory = 71, variable = 21017), type = "response", se.fit = TRUE)
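
Since the stated goal is to predict crime trends by location, another way to frame the model is to treat the incident count as the response. The sketch below is an assumption about how that could look, not the team's method: it fits a Poisson regression on invented data, and the names `toy`, `month_index` and `new_obs` are made up for the example.

```r
# Toy long-format data; all values are invented for illustration.
toy <- data.frame(
  Postcode = factor(rep(c("2000", "2010"), each = 6)),
  month_index = rep(1:6, times = 2),
  value = c(10, 12, 11, 14, 15, 17, 3, 4, 4, 5, 6, 6)
)

# Poisson regression of the incident count on postcode and a linear time trend.
fit <- glm(value ~ Postcode + month_index, family = poisson(link = "log"), data = toy)

# Expected count for postcode 2000 one month past the end of the data.
new_obs <- data.frame(
  Postcode = factor("2000", levels = levels(toy$Postcode)),
  month_index = 7
)
pred <- predict(fit, newdata = new_obs, type = "response", se.fit = TRUE)

# Approximate 95% interval on the response scale.
interval <- pred$fit + c(-1.96, 1.96) * pred$se.fit
```

A count model keeps predictions non-negative, which a Gaussian fit does not guarantee.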

WRITE

write.csv(Crime9517_long_date, file = "~/Documents/DATA-SCIENCE/GOVHACK/DATA/Crime9517longdate.csv", row.names = FALSE)


Evidence of Work

Video

Homepage

Team DataSets

NSW State Police Crime Reports

Description of Use: We used a corpus of text from police crime reports to train topic vectors, then applied a closeness-matching index to publicly available social posts in order to overlay the suitable, relevant posts onto our risk assessment mapping app.

Data Set

Social Data Scraped from Twitter, Facebook and Instagram

Description of Use: We join BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned, and process the "Tweet_Body" field for NLP semantic and entity understanding in order to predict and offer insight to users of our app.

Data Set

Australian Institute of Health and Welfare Consumption of Alcohol

Description of Use: The AIHW data adds the ability to incorporate trends and predictions in alcohol consumption (particularly the increase in wine consumption in pubs and restaurants).

Data Set

Australian Bureau of Statistics (ABS) Counts of Australian Businesses

Description of Use: Our geolocation mapping layers blend in the locations of pubs and restaurants to weight crime incidence for factors such as alcohol consumption.

Data Set

Bureau of Crime Statistics and Research (BOCSAR) Monthly data on all criminal incidents recorded by police

Description of Use: We join BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned, and process these posts for NLP semantic and entity understanding in order to predict and offer insight to users of our app. Predicted crime trends for the locations around them help users understand the risks they might face.

Data Set

Challenges

Bounty: Mix and Mashup

How can we combine the uncombinable?

61 teams have entered this challenge.

Spatial data challenge

How can spatial data be leveraged to provide the best community outcome? How can this mapping data be used to deliver value to the people of NSW?

14 teams have entered this challenge.

Australians' stories

What meaningful ways can we tell the story about what it's like to be an Australian, and in what ways some Australians live very different lives than others? How can we make people more aware of the issues facing themselves and others as they go through life?

34 teams have entered this challenge.

Data4Good

How can open data be used to make a social impact, contributing to the betterment of society? How can we improve prospects for children, and education, using open data? What sort of impact can be made on homelessness, mental health outcomes, or the environment, using open data?

19 teams have entered this challenge.