#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    279
    Rep Power
    9

    How to develop a OCR application that can scan documents online?


    How to develop a OCR application that can scan documents online?
  2. #2
  3. Periodically energetic Perler
    Devshed Regular (2000 - 2499 posts)

    Join Date
    May 2005
    Location
    Dublin, Ireland
    Posts
    2,265
    Rep Power
    538
    Originally Posted by ron.ron
    How to develop a OCR application that can scan documents online?
    Do you mean something like this

    Displeaser
    Vi Veri Veniversum Vivus Vici.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    279
    Rep Power
    9
    yes i mean something like this
    how should i go about working on something like this?
  6. #4
  7. Periodically energetic Perler
    Devshed Regular (2000 - 2499 posts)

    Join Date
    May 2005
    Location
    Dublin, Ireland
    Posts
    2,265
    Rep Power
    538
    Originally Posted by ron.ron
    yes i mean something like this
    how should i go about working on something like this?
    Afraid I've never used it, so I wouldn't know.

    The SDK help will go through the technical explanations of each of the API's functions, and there should be some example code either in one of the downloads or on the site. If you can't find any, then drop their help desk a mail and they should be able to give you more assistance.

    Displeaser
    Vi Veri Veniversum Vivus Vici.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    279
    Rep Power
    9
    so u mean the sdk ..i can just use the php or watever language to call on the packages?
  10. #6
  11. Periodically energetic Perler
    Devshed Regular (2000 - 2499 posts)

    Join Date
    May 2005
    Location
    Dublin, Ireland
    Posts
    2,265
    Rep Power
    538
    Originally Posted by ron.ron
    so u mean the sdk ..i can just use the php or watever language to call on the packages?
    I had a look into it and it seems it's only for turning paper documents or TIFFs into electronic format. Is that what you are planning on doing?

    For programming and usage it seems to provide only a C++ wrapper and an ActiveX wrapper.

    Displeaser
    Vi Veri Veniversum Vivus Vici.
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    279
    Rep Power
    9
    hmm i c..so is there anyway i can use php to call it..anyone have any idea?
  14. #8
  15. Periodically energetic Perler
    Devshed Regular (2000 - 2499 posts)

    Join Date
    May 2005
    Location
    Dublin, Ireland
    Posts
    2,265
    Rep Power
    538
    Originally Posted by ron.ron
    hmm i c..so is there anyway i can use php to call it..anyone have any idea?
    What exactly are you trying to do?

    If it's parsing HTML/XML pages, then Perl or another similar language will do that easily for you. If you're trying to extract the text from GIFs, JPEGs, etc., then you can use the above SDK (after you convert them to TIFFs).

    Give us a bit more info to work with.

    Displeaser
    Vi Veri Veniversum Vivus Vici.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Posts
    1
    Rep Power
    0

    You can use ocrsdk.com for ocr development of your application. See documentation section for details.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Location
    London
    Posts
    40
    Rep Power
    15
    Originally Posted by ron.ron
    How to develop a OCR application that can scan documents online?
    There are a few parts:

    * Web application that allows users to upload image (create a cookie with unique ID)
    * Images are stored in a "processing" folder (possibly marked by a unique ID)
    * A separate service application on the server that takes batches of Images from the "processing" folder.
    * The service application OCRs each image file and then places the data into a database (with the ID). The original file is deleted.

    * Web application polls database with id from Cookie.
    * When data becomes available, result is rendered as HTML. Cookie is deleted, data result record is deleted.
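The steps above can be sketched roughly as follows. This is only a minimal illustration in Python, not a finished design: `fake_ocr` is a hypothetical placeholder for whatever OCR engine or library ends up being used, and the function names, the `.img` suffix, and the SQLite `results` table are all assumptions made for the example. The cookie handling and HTML rendering of the web application are left out.

```python
import sqlite3
import uuid
from pathlib import Path


def fake_ocr(image_path: Path) -> str:
    """Hypothetical stand-in for a real OCR engine call."""
    return f"text recognised in {image_path.name}"


def init_db(conn):
    """Create the table that holds finished OCR results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (job_id TEXT PRIMARY KEY, text TEXT)"
    )


def submit(processing_dir: Path, image_bytes: bytes) -> str:
    """Web side: store an upload under a unique ID (the value kept in the cookie)."""
    job_id = uuid.uuid4().hex
    (processing_dir / f"{job_id}.img").write_bytes(image_bytes)
    return job_id


def process_batch(processing_dir: Path, conn, ocr=fake_ocr) -> int:
    """Service side: OCR every pending file, save the text, delete the original."""
    done = 0
    for path in sorted(processing_dir.glob("*.img")):
        conn.execute(
            "INSERT OR REPLACE INTO results (job_id, text) VALUES (?, ?)",
            (path.stem, ocr(path)),
        )
        conn.commit()
        path.unlink()  # original file is deleted once its text is stored
        done += 1
    return done


def poll(conn, job_id: str):
    """Web side: return the text once ready (and delete the record), else None."""
    row = conn.execute(
        "SELECT text FROM results WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM results WHERE job_id = ?", (job_id,))
    conn.commit()
    return row[0]
```

In a real deployment `process_batch` would run in a separate service process on a timer or queue, and `poll` would be called from the page the browser refreshes while it waits.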

    In order to create the service application you will have two options:

    * Use an existing OCR library.
    * Create your own OCR engine.

    I hope that helps.
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Location
    Haifa, Israel
    Posts
    17
    Rep Power
    0
    ron.ron asked:
    How to develop a OCR application that can scan documents online?

    I came across this very old discussion on a search, and Darknite has given a good list of the elements in the solution. However, one point that wasn't discussed, and I think is becoming more and more relevant today, is about which types of documents are being scanned - structured or unstructured sources. Here's what I mean:
    - "Unstructured" documents are unique and random, for example hand-written faxes to a support center.
    - "Structured" documents have a standard layout, e.g. forms filled out by customers (where the fields are always in the same place) and various cards (such as business cards and ID cards, with predefined data such as name, ID number, etc.).

    I have seen many cases of web applications that are required to process structured documents, for example, a banking application that receives mortgage forms and supporting documents, and car rental companies that allow customers to scan and upload driver's licenses online or through mobile devices.

    Since this original discussion there's been quite an advancement in technology that can process structured documents. These tools can detect and OCR only the specific text fields in the image, clean out "noise", and output the text in a structured manner (e.g. Name = John Smith, ID Number = 123456). Here are a few solutions I'm familiar with that do this in the context of web applications:

    - Kofax web-capture - a web-based OCR app that can take in any type of structured form and process it. As far as I know, it requires some "training" on the data in the forms. They provide not only the OCR engine but also a full web interface for uploading and manipulating the scanned files - this interface can be embedded into a web app using Java or .NET.

    - CSSN - they provide an engine that can take card-type documents like ID cards, driver's licenses, medical cards, bank checks, credit cards, etc, and extract the data to a text file. It's possible to integrate the engine with a web application using the SDK.

    - Top Image Systems - their eFLOW product does data capture from healthcare forms, driver's license applications, tax forms, multiple-choice examinations, insurance claims, or any other structured forms. It also enables integration with existing systems, including browser-based interfaces.
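To make the idea of structured output (e.g. Name = John Smith, ID Number = 123456) a bit more concrete, here is a rough Python sketch of zone-based field extraction. It is not how any of the products above work internally. It simply assumes that some OCR step has already produced words with page coordinates; the `Word` class, the `TEMPLATE` zones, and the sample coordinates are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


# Hypothetical template: field name -> (left, top, right, bottom) zone.
TEMPLATE = {
    "Name": (100, 50, 400, 80),
    "ID Number": (100, 100, 400, 130),
}


def extract_fields(words, template):
    """Collect the words falling inside each zone, in left-to-right order."""
    fields = {}
    for name, (left, top, right, bottom) in template.items():
        inside = [w for w in words if left <= w.x < right and top <= w.y < bottom]
        inside.sort(key=lambda w: w.x)
        fields[name] = " ".join(w.text for w in inside)
    return fields


words = [
    Word("Smith", 170, 55),
    Word("John", 110, 55),
    Word("123456", 110, 105),
    Word("Signature:", 20, 300),  # outside every zone, so ignored as noise
]
print(extract_fields(words, TEMPLATE))
# prints {'Name': 'John Smith', 'ID Number': '123456'}
```

Because the layout of a structured document is fixed, one template per form or card type is enough; unstructured documents have no such zones, which is why they are harder to process.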

    HTH
