Upgrade to Pro — share decks privately, control downloads, hide ads and more …

OCR with Go

OCR with Go

Lightning Talk feita na GopherCon Brasil 2018 falando um pouco sobre um side project para extrair informações da Nota Fiscal Paulistana usando OCR e expondo os dados em uma API escrita em Go.

Images:
- Tesseract [ http://marvelcinematicuniverse.wikia.com ]
- Thanos [ http://hdqwalls.com ]
- NF de exemplo [ https://br.pinterest.com/pin/161496336618265839/?lp=true ]

Marco Antônio Singer

September 29, 2018
Tweet

More Decks by Marco Antônio Singer

Other Decks in Technology

Transcript

  1. Optical character recognition Optical character recognition (also optical character reader,

    OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast) - Wikipedia
  2. package main import ( "fmt" "github.com/otiai10/gosseract" ) func main() {

    client := gosseract.NewClient() defer client.Close() client.SetImage("path/to/image.png") text, _ := client.Text() fmt.Println(text) // Hello, World! }
  3. { "source": "path/to/document", "labels": [ { "name": "Número da Nota",

    "x": 1900, "y": 260, "h": 30, "w": 450 }, { "name": "Razão Social", "x": 250, "y": 590, "h": 45, "w": 1040 }, ] } payload.json
  4. func DetectDocumentText(source string) (*vtypes.TextAnnotation, error) { ctx := context.Background() client,

    _ := vision.NewImageAnnotatorClient(ctx) defer client.Close() return client.DetectDocumentText( ctx, vision.NewImageFromURI(source), nil, ) }
  5. func DetectDocumentText(source string) (*vtypes.TextAnnotation, error) { ctx := context.Background() client,

    _ := vision.NewImageAnnotatorClient(ctx) defer client.Close() return client.DetectDocumentText( ctx, vision.NewImageFromURI(source), nil, ) }
  6. for _, page := range ta.Pages { for _, block

    := range page.Blocks { for _, label := range labels { for _, paragraph := range block.Paragraphs { for _, words := range paragraph.Words { for _, symbols := range w.Symbols {
  7. for _, page := range ta.Pages { for _, block

    := range page.Blocks { for _, label := range labels { for _, paragraph := range block.Paragraphs { for _, words := range paragraph.Words { for _, symbols := range w.Symbols {
  8. { "source": "path/to/document", "labels": [ { "name": "Número da Nota",

    "x": 1900, "y": 260, "h": 30, "w": 450, "text": "00000086", "confidence": 0.99, }, { "name": "Razão Social", "x": 250, "y": 590, "h": 45, "w": 1040, "text": "PayPal do Brasil Serviços de Pagamento LTDA", "confidence": 0.97, }, ] }
  9. { "source": "path/to/document", "labels": [ { "name": "Número da Nota",

    "x": 1900, "y": 260, "h": 30, "w": 450, "text": "00000086", "confidence": 0.99, }, { "name": "Razão Social", "x": 250, "y": 590, "h": 45, "w": 1040, "text": "PayPal do Brasil Serviços de Pagamento LTDA", "confidence": 0.97, }, ] }
  10. { "source": "path/to/document", "labels": [ { "name": "Número da Nota",

    "x": 1900, "y": 260, "h": 30, "w": 450 }, { "name": "Razão Social", "x": 250, "y": 590, "h": 45, "w": 1040 }, ] } payload.json
  11. { "source": "path/to/document", "labels": [ { "name": "Número da Nota",

    "x": 1900, "y": 260, "h": 30, "w": 450, "text": "00000086", "confidence": 0.99, }, { "name": "Razão Social", "x": 250, "y": 590, "h": 45, "w": 1040, "text": "PayPal do Brasil Serviços de Pagamento LTDA", "confidence": 0.97, }, ] } curl -d @payload.json http://localhost/detect
  12. Goals • Accept other formats (e.g. PDFs) • Create new

    templates • Scale support • Extract data as table (e.g. product's list)