OCRFeeder: OCR Made Easy on GNOME
Joaquim Rocha
July 27, 2012
A presentation of what OCRFeeder is and what it does.
Transcript
Joaquim Rocha
[email protected]
OCRFeeder: OCR Made Easy on GNOME · July 27, 2012
Joaquim Rocha (Igalia) · OCRFeeder · GUADEC 2012
What is it?
Document Analysis and Optical Character Recognition for GNOME
Why?
* Paper has a number of problems
* No applications for GNU/Linux that do a fair job
Paper problems:
* Security (CC photo by: http://www.flickr.com/photos/badwsky/)
* Preservation (CC photo by: http://www.flickr.com/photos/98469445@N00/)
* Data processing (CC photo by: http://www.flickr.com/photos/hugovk/)
* Ecology (CC photo by: http://www.flickr.com/photos/pranavsingh/)
* Accessibility (CC photo by: http://www.flickr.com/photos/illustrator/)
No fair conversion apps for GNU/Linux apart from OCR engines, but...
OCR != Document Conversion
* It only deals with characters
* It does not consider the layout
* It does not distinguish contents
What's needed is Document Analysis and Recognition
* Conversion of documents to an electronic format
* First projects in the 80s
How it works
So many layouts... (CC photo by: http://www.flickr.com/photos/uber-tuber/)
Layouts vary with the type of document
What works for detecting one layout won't work on others
OCRFeeder focuses on contents, not on layouts!
Key concept:
If a document image can be divided into windows of 1 (content) or 0 (not content), then it is possible to group all the 1s and outline the contents.
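The key concept can be illustrated with a minimal sketch. This is not OCRFeeder's actual code: the function names, the fixed square window, the darkness threshold, and the use of 4-connected grouping are all assumptions made for illustration. The image is taken to be a list of rows of grayscale values (0 = black, 255 = white).

```python
from collections import deque


def window_mask(image, win, dark_below=128):
    """Mark each win x win window 1 if it holds any dark pixel, else 0.

    Assumption: "content" simply means "contains at least one dark pixel".
    """
    rows, cols = len(image), len(image[0])
    mask = []
    for top in range(0, rows, win):
        mask_row = []
        for left in range(0, cols, win):
            has_ink = any(
                image[y][x] < dark_below
                for y in range(top, min(top + win, rows))
                for x in range(left, min(left + win, cols))
            )
            mask_row.append(1 if has_ink else 0)
        mask.append(mask_row)
    return mask


def group_windows(mask, win):
    """Group 4-connected 1-windows and return one pixel box per group.

    Boxes are (left, top, right, bottom) with right and bottom exclusive.
    They are window-aligned, so they can extend past the image edge when
    the image size is not a multiple of win.
    """
    seen = set()
    boxes = []
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value == 0 or (r, c) in seen:
                continue
            # Breadth-first flood fill over neighbouring content windows.
            queue = deque([(r, c)])
            seen.add((r, c))
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                cr, cc = queue.popleft()
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < len(mask) and 0 <= nc < len(mask[0])
                    if inside and mask[nr][nc] == 1 and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            boxes.append((min_c * win, min_r * win, (max_c + 1) * win, (max_r + 1) * win))
    return boxes
```

On a blank 8x8 page with dark pixels at (x=0, y=0), (x=3, y=1) and (x=6, y=6), a window size of 2 gives two outlined areas: the first two pixels fall in neighbouring windows and are merged, the third stays on its own. Each outlined area could then be handed to an OCR engine or kept as an image.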
Recognition:
* System-wide OCR engines are used
* Engines are configured from the GUI or XML files
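As a rough illustration of configuring an engine from an XML file, the sketch below parses a small engine description and builds the command line for one image. The XML element names, the `$IMAGE` placeholder, the `example-ocr` program and its options are all hypothetical; the slides do not show OCRFeeder's real file format.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical engine description; not OCRFeeder's actual schema.
EXAMPLE_CONFIG = """<engine>
  <name>Example engine</name>
  <path>/usr/bin/example-ocr</path>
  <arguments>--input $IMAGE --lang eng</arguments>
</engine>"""


def load_engine(xml_text):
    """Read the engine name, executable path and argument template."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "path": root.findtext("path"),
        "arguments": root.findtext("arguments"),
    }


def build_command(engine, image_path):
    """Fill the $IMAGE placeholder and return an argv list for subprocess."""
    # Quote the path first so that spaces survive the shlex.split below.
    filled = Template(engine["arguments"]).substitute(IMAGE=shlex.quote(image_path))
    return [engine["path"]] + shlex.split(filled)
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)` to run a system-wide engine and collect what it prints.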
The best-known free OCR engines are detected and configured automatically:
* Tesseract
* GOCR
* OCRAD
* Cuneiform
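One plausible way to detect these engines is to look for their executables on the `PATH`. The sketch below assumes that approach; the slides do not say how OCRFeeder actually performs detection, and `detect_engines` is a made-up name.

```python
import shutil

# Engine name -> usual executable name on GNU/Linux.
KNOWN_ENGINES = {
    "Tesseract": "tesseract",
    "GOCR": "gocr",
    "OCRAD": "ocrad",
    "Cuneiform": "cuneiform",
}


def detect_engines(candidates=KNOWN_ENGINES, which=shutil.which):
    """Return {engine name: executable path} for every engine found.

    `which` is injectable so the lookup can be replaced in tests.
    """
    found = {}
    for name, executable in candidates.items():
        path = which(executable)
        if path is not None:
            found[name] = path
    return found
```

Calling `detect_engines()` on a machine with only Tesseract installed would return a single entry mapping "Tesseract" to its path.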
Export formats:
* ODT
* HTML
* Plain text
* PDF
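To give an idea of what exporting recognized contents involves, here is a small sketch for two of the formats. It assumes each recognized area is a dictionary with `left`, `top` and `text` keys, which is an illustrative data model and not OCRFeeder's own; the function names are likewise invented.

```python
from html import escape


def export_plain_text(blocks):
    """Join block texts in reading order (top to bottom, then left to right)."""
    ordered = sorted(blocks, key=lambda b: (b["top"], b["left"]))
    return "\n\n".join(b["text"] for b in ordered)


def export_html(blocks, title="Recognized document"):
    """Place each block at its page position using absolute positioning."""
    divs = [
        '<div style="position:absolute; left:{}px; top:{}px">{}</div>'.format(
            b["left"], b["top"], escape(b["text"])
        )
        for b in blocks
    ]
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>{}</title></head>\n<body>\n{}\n</body></html>\n"
    ).format(escape(title), "\n".join(divs))
```

Plain text keeps only the reading order, while the HTML version also keeps where each area sat on the page, which is the difference between exporting characters and exporting a document.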
User interaction:
* Users can edit everything and review the algorithm's results
* So the UI can work in attended and unattended ways
* The CLI only works in unattended mode
Demo time!
Other features:
* PDF import
* Unpaper preprocessor
* Font style editing
* Image deskewing
* OCR results cleaning
* Project saving/loading
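As an example of what cleaning OCR results can mean in practice, the sketch below joins words hyphenated across line breaks, unwraps lines inside a paragraph, and collapses repeated spaces. These particular rules are assumptions chosen for illustration; the slides do not describe which clean-ups OCRFeeder applies.

```python
import re


def clean_ocr_text(text):
    """Tidy raw OCR output while keeping paragraph breaks."""
    # Join words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks into spaces; blank lines stay as they are.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each paragraph and separate paragraphs with one blank line.
    paragraphs = [p.strip() for p in re.split(r"\n{2,}", text)]
    return "\n\n".join(p for p in paragraphs if p)
```

Note that the hyphen rule also joins genuinely hyphenated words that happen to end a line, so a reviewing step in the UI remains useful.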
Future:
* More export formats: hOCR, etc.
* Make managing OCR engines easier
Webpage: http://live.gnome.org/OCRFeeder
git: http://git.gnome.org/ocrfeeder
Bugzilla: http://bugzilla.gnome.org (product: OCRFeeder)
Thank you!