Open Data Quality Dashboard – NHTG14
Dan Palmer
March 09, 2014
Presentation of our (@ElliotJH, @danpalmer) hack at National Hack the Government 2014.
Transcript
Government Open Data Quality Dashboard National Hack the Government 2014
Elliot Hughes @ElliotJH
Dan Palmer @danpalmer
“That Southampton Lot”
You may remember us from such hacks as Greedy MPs, Hillsborough Unlocked, Rate Your Member, Medical Now, Insulate Me, One Nation Under CCTV
Why do we need to do this?
Climate_change_and_transport_choices.sav
Climate_change_and_transport_choices.wtf
“SPSS is used for statistical analysis, initially released in 1968”
“…can only be used on the platform that created the file…”
viewfile.ashx
CloudStore - May 2012 cat export (comma delimited - text string comma escaped with backslash - three header rows).csv
coreaccessindicators2008.6
Possible Uses
• Government File Name or Geocities website game
• Flashcard or Top-Trumps “my data is better than your data” game
• This has probably already been done.
• Twitter bot
• Open Data Quality Metrics
1. Get all the data from data.gov.uk for all Ministerial Departments.
2. Get all the data from gov.uk for all Ministerial Departments. They don’t have an API for that.
3. Validate the data
4. LEADERBOARDS!
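The last of the steps above can be sketched in Python. This is a minimal illustration, not the code from the hack: it assumes validation has already produced one (department, passed) pair per dataset, and both that input shape and the pass-rate scoring are hypothetical.

```python
from collections import defaultdict


def leaderboard(results):
    """Rank departments by the share of their datasets that passed validation.

    `results` is an iterable of (department, passed) pairs. This input shape
    is an assumption made for illustration.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for department, passed in results:
        totals[department] += 1
        if passed:
            passes[department] += 1
    scores = {dept: passes[dept] / totals[dept] for dept in totals}
    # Highest pass rate first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Calling `leaderboard` with placeholder names such as `[("A", True), ("A", False), ("B", True)]` ranks "B" above "A", because "B" passed all of its datasets and "A" only half.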
*.csv: 35%
https: 40%
Not hosted on *.gov.uk: 9%
index.htm, index.php, http://somewhere.gov.uk/: 16%
Unreachable: 8.5%
Total indirectly linked or unreachable: 25%
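Most of the categories above can be read straight off a resource URL without fetching it. The following Python sketch shows one way to do that; the label names and the rule for spotting an indirect link (an empty file name, `index.htm` or `index.php`) are assumptions based on the slide text, not the checks the hack actually ran. "Unreachable" needs a network request and is left out.

```python
from urllib.parse import urlparse

# File names treated as "links to a page, not to the data" (assumed from the slide).
INDEX_NAMES = {"", "index.htm", "index.php"}


def classify(url):
    """Return the set of labels that apply to one resource URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    filename = parts.path.rsplit("/", 1)[-1].lower()
    labels = set()
    if filename.endswith(".csv"):
        labels.add("csv")
    if parts.scheme == "https":
        labels.add("https")
    if not (host == "gov.uk" or host.endswith(".gov.uk")):
        labels.add("not hosted on *.gov.uk")
    if filename in INDEX_NAMES:
        labels.add("indirect link")
    return labels
```

Counting each label over every resource URL and dividing by the total gives percentages of the kind shown above.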
test_results_2010.txt.gz “Anonymised MOT tests and results” 750MB 3.3GB uncompressed
1044521|362808|2010-07-12|2|N|P|34414|TN|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1200|1948-01-01
1044532|362815|2010-05-15|4|N|P|61618|LE|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|0|1998-08-01
1044546|362824|2010-09-03|4|PL|P|54070|SO|UNCLASSIFIED|UNCLASSIFIED|RED|D|1870|1997-04-01
1044547|362824|2010-09-01|4|N|F|54065|SO|UNCLASSIFIED|UNCLASSIFIED|RED|D|1870|1997-04-01
1044580|362842|2010-01-13|4|N|F|91834|SA|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2380|1998-11-30
1044591|362851|2010-02-27|1|N|F|0|PE|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|125|1970-01-01
1044592|362852|2010-09-03|4|N|PRS|53897|SL|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1597|1998-03-01
1044595|362853|2010-09-03|2|N|F|7764|ST|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|1200|2005-09-04
1044620|362872|2010-09-08|4|N|F|82997|GL|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|1275|1997-01-01
1044653|297603|2010-03-10|1|N|PRS|779|WF|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|150|1962-01-01
1044684|362918|2010-03-15|4|N|PRS|34169|CV|UNCLASSIFIED|UNCLASSIFIED|BLUE|D|1248|2004-07-02
1044692|362923|2010-09-06|4|N|F|70286|S|UNCLASSIFIED|UNCLASSIFIED|GREY|D|2490|1997-04-01
1044770|362974|2010-10-14|2|N|PRS|18440|LA|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|350|1996-11-28
1044797|287632|2010-06-29|1|N|F|23426|BH|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|149|1958-01-01
1044819|363009|2010-03-04|4|N|F|145955|RH|UNCLASSIFIED|UNCLASSIFIED|RED|P|1215|1998-10-27
1044822|363011|2010-08-05|4|N|F|123795|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|0|1987-06-18
1044828|363015|2010-04-15|4|N|P|119891|LE|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2000|1996-05-01
1044897|363063|2010-12-16|4|N|F|97105|RG|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|1998|1988-01-01
1044911|363072|2010-05-12|4|N|F|111694|CT|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2500|1999-01-01
1044937|362747|2010-01-21|5|N|F|84748|DE|UNCLASSIFIED|UNCLASSIFIED|WHITE|D|2402|2001-09-01
1044951|363101|2010-03-30|4|N|F|98311|NP|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|2000|1999-11-30
1044954|363103|2010-05-01|2|N|F|0|BH|UNCLASSIFIED|UNCLASSIFIED|RED|P|25|1999-11-11
1044955|363104|2010-03-02|4|N|P|185807|N|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|1596|2004-04-30
1044966|281511|2010-02-06|1|N|F|194|LL|UNCLASSIFIED|UNCLASSIFIED|CREAM|P|125|1959-06-01
1045025|363159|2010-07-20|4|N|F|61718|WA|UNCLASSIFIED|UNCLASSIFIED|SILVER|D|3200|1999-01-01
1045042|363170|2010-03-04|4|N|F|90210|BS|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|0|1991-07-01
1045053|363179|2010-07-30|1|N|P|1|CT|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|125|1964-01-01
1045113|363222|2010-02-06|1|N|P|8079|S|UNCLASSIFIED|UNCLASSIFIED|CREAM|P|150|1968-01-01
1045122|363228|2010-06-07|4|N|ABR|0|GU|UNCLASSIFIED|UNCLASSIFIED|BLUE|D|1995|2006-05-29
1045172|363259|2010-09-27|4|N|PRS|77777|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|2495|1996-04-04
1045197|363277|2010-04-19|4|F|P|178744|NW|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1799|1999-02-12
1045198|363277|2010-03-22|4|N|F|178741|NW|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1799|1999-02-12
1045253|363319|2010-06-03|4|N|F|1365|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|0|1986-06-26
1045281|363339|2010-08-27|4|N|F|26477|NG|UNCLASSIFIED|UNCLASSIFIED|RED|P|4600|2006-03-01
1045315|363367|2010-02-23|4|N|F|56355|WA|UNCLASSIFIED|UNCLASSIFIED|SILVER|D|2200|1997-08-01
WHAT FORMAT IS THIS?
wtf? wtf? wtf? wtf? wtf? wtf?
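A file like this can at least be split into fields once the delimiter is known. The Python sketch below guesses the delimiter by looking for a candidate character that occurs the same non-zero number of times on every line, then parses two records from the sample with the standard `csv` module. The candidate list and the `guess_delimiter` helper are assumptions for illustration; what each column means is not stated in the slides, so the code does not name any of them.

```python
import csv
import io

# Delimiters worth trying; this list is an assumption, not an exhaustive one.
CANDIDATES = [",", "\t", "|", ";"]


def guess_delimiter(sample):
    """Pick the candidate that appears the same non-zero number of times per line."""
    lines = [line for line in sample.splitlines() if line]
    best, best_count = None, 0
    for delimiter in CANDIDATES:
        counts = {line.count(delimiter) for line in lines}
        if len(counts) == 1:
            (count,) = counts
            if count > best_count:
                best, best_count = delimiter, count
    return best


sample = (
    "1044521|362808|2010-07-12|2|N|P|34414|TN|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1200|1948-01-01\n"
    "1044532|362815|2010-05-15|4|N|P|61618|LE|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|0|1998-08-01\n"
)
delimiter = guess_delimiter(sample)
rows = list(csv.reader(io.StringIO(sample), delimiter=delimiter))
```

Here `delimiter` comes out as `"|"` and each row has 14 fields. That answers how to read the file, though not what the fields are.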
HEAD /file.csv
405 Method Not Supported
500 Internal Server Error
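When a server answers HEAD with an error like the ones above, a checker can fall back to GET before giving up on the resource. A minimal Python sketch of that fallback using `urllib`; the `probe` function and its `opener` parameter (there so the network call can be swapped out in tests) are hypothetical, and this is not necessarily how the hack handled it.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Return (status, content_type) for url, trying HEAD first and then GET.

    GET is only attempted when HEAD is answered with an HTTP error status,
    such as 405 or 500. Connection failures (urllib.error.URLError) are not
    caught here and propagate to the caller.
    """
    for method in ("HEAD", "GET"):
        request = urllib.request.Request(url, method=method)
        try:
            with opener(request, timeout=timeout) as response:
                return response.status, response.headers.get("Content-Type")
        except urllib.error.HTTPError as error:
            last_status = error.code
    # Both methods failed with an HTTP error; report the status from GET.
    return last_status, None
```

The GET response body is never read, so the fallback costs one extra request but not a full download.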
How do we want data?
• Atom feeds - 1
• CSV - 35%
• JSON - 1
• RDF - 34
• Valid
• UTF-8 or ASCII
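The last two wishes in the list above, "Valid" and "UTF-8 or ASCII", can be checked mechanically for CSV. The slides do not define "valid", so the Python sketch below takes it to mean that the file parses under the `csv` module's strict mode and that every row has the same number of columns. That definition and the `check_csv` helper are assumptions.

```python
import csv
import io


def check_csv(raw):
    """Return a list of problems found in raw CSV bytes (empty list means none)."""
    try:
        text = raw.decode("utf-8")  # ASCII is a subset of UTF-8
    except UnicodeDecodeError:
        return ["not UTF-8 or ASCII"]
    try:
        rows = list(csv.reader(io.StringIO(text, newline=""), strict=True))
    except csv.Error as error:
        return [f"malformed CSV: {error}"]
    problems = []
    if not rows:
        problems.append("empty file")
    elif len({len(row) for row in rows}) > 1:
        problems.append("rows have differing column counts")
    return problems
```

An empty list from `check_csv` means the file passed both checks; anything else is a human-readable reason it failed.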
Demo
Better than it’s ever been. Still a long way to go…