Open Data Quality Dashboard – NHTG14
Dan Palmer
March 09, 2014
Presentation of our (@ElliotJH, @danpalmer) hack at National Hack the Government 2014.
Transcript
Government Open Data Quality Dashboard
National Hack the Government 2014
Elliot Hughes @ElliotJH
Dan Palmer @danpalmer
“That Southampton Lot”
You may remember us from such hacks as Greedy MPs, Hillsborough Unlocked, Rate Your Member, Medical Now, Insulate Me, One Nation Under CCTV
Why do we need to do this?
Climate_change_and_transport_choices.sav
Climate_change_and_transport_choices.wtf
“SPSS is used for statistical analysis, initially released in 1968”
“…can only be used on the platform that created the file…”
viewfile.ashx
CloudStore - May 2012 cat export (comma delimited - text string comma escaped with backslash - three header rows).csv
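The file name above doubles as a format description: comma delimited, commas inside text escaped with a backslash, three header rows. A minimal sketch of reading such a file with Python's standard `csv` module follows. The function name and the sample rows are made up for illustration, and the real export may differ in ways the file name does not mention.

```python
import csv
import io
from itertools import islice


def read_backslash_escaped_csv(handle, header_rows=3):
    """Return the data rows of a comma-delimited file whose commas
    inside text are escaped with a backslash instead of being quoted."""
    reader = csv.reader(
        handle,
        delimiter=",",
        quoting=csv.QUOTE_NONE,  # no quote characters are expected
        escapechar="\\",         # a backslash protects the next character
    )
    # Skip the header rows the file name warns about.
    return list(islice(reader, header_rows, None))


# Invented sample content, shaped like the description in the file name.
sample = io.StringIO(
    "header one\n"
    "header two\n"
    "header three\n"
    "Acme\\, Ltd,42\n"
)
rows = read_backslash_escaped_csv(sample)
```

The point of the sketch is that a consumer has to know all three quirks before the file parses correctly, and here that knowledge only exists in the file name.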
coreaccessindicators2008.6
Possible Uses
• Government File Name or Geocities website game
• Flashcard or Top-Trumps “my data is better than your data” game
• This has probably already been done.
• Twitter bot
• Open Data Quality Metrics
1. Get all the data from data.gov.uk for all Ministerial Departments.
2. Get all the data from gov.uk for all Ministerial Departments. They don’t have an API for that.
3. Validate the data
4. LEADERBOARDS!
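The last step can be sketched as a small aggregation: given one pass/fail validation result per dataset, rank departments by the share that passed. This is an illustrative sketch only. The scoring rule, the function name and the department labels are assumptions, not the scoring used in the hack.

```python
from collections import defaultdict


def leaderboard(results):
    """Rank departments by the fraction of their datasets that passed.

    `results` is an iterable of (department, passed) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # department -> [passed, checked]
    for department, passed in results:
        totals[department][1] += 1
        if passed:
            totals[department][0] += 1
    scores = [(dept, ok / checked) for dept, (ok, checked) in totals.items()]
    # Best score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical input: placeholder department names and results.
board = leaderboard([
    ("Dept A", True),
    ("Dept A", False),
    ("Dept B", True),
])
```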
*.csv 35%
https 40%
Not hosted on *.gov.uk 9%
index.htm index.php http://somewhere.gov.uk/ 16%
Unreachable 8.5%
Total indirectly linked or unreachable 25%
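Most of the figures above can be derived from the dataset URL alone. The sketch below shows one way to compute such flags with `urllib.parse`. The flag names and the exact rules (for example, treating a bare directory URL as an indirect link) are assumptions chosen to mirror the slide, not the authors' code.

```python
from urllib.parse import urlparse

# Landing pages rather than data files, as listed on the slide.
INDEX_NAMES = {"", "index.htm", "index.php"}


def link_flags(url):
    """Classify one dataset URL using only the URL itself."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    last_segment = parts.path.rsplit("/", 1)[-1].lower()
    return {
        "csv": last_segment.endswith(".csv"),
        "https": parts.scheme == "https",
        "off_gov_uk": not (host == "gov.uk" or host.endswith(".gov.uk")),
        "indirect": last_segment in INDEX_NAMES,
    }


def percentages(urls):
    """Return the percentage of URLs for which each flag is set."""
    flags = [link_flags(url) for url in urls]
    if not flags:
        return {}
    return {
        name: 100 * sum(f[name] for f in flags) / len(flags)
        for name in flags[0]
    }


# Made-up URLs for illustration.
summary = percentages([
    "http://somewhere.gov.uk/",
    "https://example.com/data/file.csv",
])
```

The "Unreachable" figure cannot be computed this way, since it needs an actual request to each URL.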
test_results_2010.txt.gz
“Anonymised MOT tests and results”
750MB
3.3GB uncompressed
1044521|362808|2010-07-12|2|N|P|34414|TN|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1200|1948-01-01
1044532|362815|2010-05-15|4|N|P|61618|LE|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|0|1998-08-01
1044546|362824|2010-09-03|4|PL|P|54070|SO|UNCLASSIFIED|UNCLASSIFIED|RED|D|1870|1997-04-01
1044547|362824|2010-09-01|4|N|F|54065|SO|UNCLASSIFIED|UNCLASSIFIED|RED|D|1870|1997-04-01
1044580|362842|2010-01-13|4|N|F|91834|SA|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2380|1998-11-30
1044591|362851|2010-02-27|1|N|F|0|PE|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|125|1970-01-01
1044592|362852|2010-09-03|4|N|PRS|53897|SL|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1597|1998-03-01
1044595|362853|2010-09-03|2|N|F|7764|ST|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|1200|2005-09-04
1044620|362872|2010-09-08|4|N|F|82997|GL|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|1275|1997-01-01
1044653|297603|2010-03-10|1|N|PRS|779|WF|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|150|1962-01-01
1044684|362918|2010-03-15|4|N|PRS|34169|CV|UNCLASSIFIED|UNCLASSIFIED|BLUE|D|1248|2004-07-02
1044692|362923|2010-09-06|4|N|F|70286|S|UNCLASSIFIED|UNCLASSIFIED|GREY|D|2490|1997-04-01
1044770|362974|2010-10-14|2|N|PRS|18440|LA|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|350|1996-11-28
1044797|287632|2010-06-29|1|N|F|23426|BH|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|149|1958-01-01
1044819|363009|2010-03-04|4|N|F|145955|RH|UNCLASSIFIED|UNCLASSIFIED|RED|P|1215|1998-10-27
1044822|363011|2010-08-05|4|N|F|123795|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|0|1987-06-18
1044828|363015|2010-04-15|4|N|P|119891|LE|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2000|1996-05-01
1044897|363063|2010-12-16|4|N|F|97105|RG|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|1998|1988-01-01
1044911|363072|2010-05-12|4|N|F|111694|CT|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|2500|1999-01-01
1044937|362747|2010-01-21|5|N|F|84748|DE|UNCLASSIFIED|UNCLASSIFIED|WHITE|D|2402|2001-09-01
1044951|363101|2010-03-30|4|N|F|98311|NP|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|2000|1999-11-30
1044954|363103|2010-05-01|2|N|F|0|BH|UNCLASSIFIED|UNCLASSIFIED|RED|P|25|1999-11-11
1044955|363104|2010-03-02|4|N|P|185807|N|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|1596|2004-04-30
1044966|281511|2010-02-06|1|N|F|194|LL|UNCLASSIFIED|UNCLASSIFIED|CREAM|P|125|1959-06-01
1045025|363159|2010-07-20|4|N|F|61718|WA|UNCLASSIFIED|UNCLASSIFIED|SILVER|D|3200|1999-01-01
1045042|363170|2010-03-04|4|N|F|90210|BS|UNCLASSIFIED|UNCLASSIFIED|GREEN|P|0|1991-07-01
1045053|363179|2010-07-30|1|N|P|1|CT|UNCLASSIFIED|UNCLASSIFIED|WHITE|P|125|1964-01-01
1045113|363222|2010-02-06|1|N|P|8079|S|UNCLASSIFIED|UNCLASSIFIED|CREAM|P|150|1968-01-01
1045122|363228|2010-06-07|4|N|ABR|0|GU|UNCLASSIFIED|UNCLASSIFIED|BLUE|D|1995|2006-05-29
1045172|363259|2010-09-27|4|N|PRS|77777|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|2495|1996-04-04
1045197|363277|2010-04-19|4|F|P|178744|NW|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1799|1999-02-12
1045198|363277|2010-03-22|4|N|F|178741|NW|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1799|1999-02-12
1045253|363319|2010-06-03|4|N|F|1365|NG|UNCLASSIFIED|UNCLASSIFIED|GREEN|D|0|1986-06-26
1045281|363339|2010-08-27|4|N|F|26477|NG|UNCLASSIFIED|UNCLASSIFIED|RED|P|4600|2006-03-01
1045315|363367|2010-02-23|4|N|F|56355|WA|UNCLASSIFIED|UNCLASSIFIED|SILVER|D|2200|1997-08-01
WHAT FORMAT IS THIS?
wtf? wtf? wtf? wtf? wtf? wtf?
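Mechanically, the rows shown are pipe-delimited text with no header, so they can be split with the standard `csv` module. What each column means is another matter. The sketch below only splits fields and checks that every row has the same width. It uses two rows copied from the slide, and the function name is made up for illustration.

```python
import csv
import io

# Two rows copied from the slide; the real file has no header row in this excerpt.
sample = (
    "1044521|362808|2010-07-12|2|N|P|34414|TN|UNCLASSIFIED|UNCLASSIFIED|BLACK|P|1200|1948-01-01\n"
    "1044532|362815|2010-05-15|4|N|P|61618|LE|UNCLASSIFIED|UNCLASSIFIED|SILVER|P|0|1998-08-01\n"
)


def iter_pipe_rows(handle):
    """Yield each line as a list of fields, splitting on '|' with no quoting."""
    yield from csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)


rows = list(iter_pipe_rows(io.StringIO(sample)))
widths = {len(row) for row in rows}  # one value means every row has the same width
```

For the full 3.3GB file, the same generator could be fed from `gzip.open("test_results_2010.txt.gz", "rt", newline="")` so that rows stream without decompressing to disk first. The file's text encoding is not stated in the slides, so that would need checking.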
HEAD /file.csv
405 Method Not Supported
500 Internal Server Error
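A HEAD request is the cheap way to check that a file exists without downloading it, which is why these responses hurt. One possible workaround, sketched below, is to retry with GET when HEAD is rejected or crashes the server. This is an assumption about how a checker could cope, not a description of what the hack did. `send_request` and `probe` are hypothetical names, and the sender is injectable so the logic can be exercised without a network.

```python
import urllib.error
import urllib.request


def send_request(method, url, timeout=10):
    """Return the HTTP status for one request, or None if nothing answered."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses still carry a status code
    except OSError:
        return None        # DNS failure, refused connection, timeout


def probe(url, send=send_request):
    """Try HEAD first; fall back to GET if the server rejects or mishandles HEAD.

    Returns (method_that_produced_the_status, status).
    """
    status = send("HEAD", url)
    if status is not None and (status == 405 or status >= 500):
        return "GET", send("GET", url)
    return "HEAD", status
```

A status of `None` after the probe corresponds to the "unreachable" case.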
How do we want data?
• Atom feeds - 1
• CSV - 35%
• JSON - 1
• RDF - 34
• Valid
• UTF-8 or ASCII
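The last two bullets are checkable by machine. Below is a minimal sketch of such a check for CSV. It assumes "valid" means the file parses and every non-empty row has the same number of columns, which is an interpretation and not a definition given in the slides. The function name is hypothetical.

```python
import csv
import io


def check_csv_bytes(raw):
    """Return (is_utf8, is_rectangular) for the raw bytes of a would-be CSV file."""
    try:
        text = raw.decode("utf-8")  # ASCII is a subset of UTF-8, so it passes too
    except UnicodeDecodeError:
        return False, False
    try:
        # newline="" lets the csv module handle line endings inside quoted fields.
        rows = [row for row in csv.reader(io.StringIO(text, newline="")) if row]
    except csv.Error:
        return True, False
    widths = {len(row) for row in rows}
    # Exactly one distinct width means every row has the same column count;
    # an empty file yields no widths and is reported as not rectangular.
    return True, len(widths) == 1
```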
Demo
Better than it’s ever been.
Still a long way to go…