flat assembler
Message board for the users of flat assembler.

grab screen data, ocr parse them and store into database

sleepsleep 25 May 2016, 22:19
hi all,

under windows, using the 32/64-bit windows api or directx,

what are the recommended methods to grab screen data and parse it (assume it is images) continuously for 6 hours a day?

assume we need to process about 80% of the full 1920 x 1080 screen every second or every two seconds.
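For a sense of scale, here is a rough estimate of the raw data volume those numbers imply. It is only a sketch: 4 bytes per pixel (32-bit BGRA, a common desktop format) is an assumption, not something stated in the post.

```python
# Rough estimate of the raw pixel data implied by the numbers in the post.
WIDTH, HEIGHT = 1920, 1080      # screen size from the post
COVERAGE_PERCENT = 80           # share of the screen to process, from the post
HOURS_PER_DAY = 6               # capture duration, from the post
BYTES_PER_PIXEL = 4             # ASSUMPTION: 32-bit BGRA pixels


def bytes_per_frame():
    """Raw size of one captured region in bytes."""
    pixels = WIDTH * HEIGHT * COVERAGE_PERCENT // 100
    return pixels * BYTES_PER_PIXEL


def bytes_per_day(seconds_between_frames):
    """Raw bytes handled in one daily session at the given capture interval."""
    frames = HOURS_PER_DAY * 3600 // seconds_between_frames
    return frames * bytes_per_frame()


if __name__ == "__main__":
    print("per frame:", bytes_per_frame())
    print("per day, 1 s interval:", bytes_per_day(1))
    print("per day, 2 s interval:", bytes_per_day(2))
```

Under that assumption each frame is about 6.6 MB and a 6-hour session at one frame per second touches roughly 143 GB of raw pixels, which suggests keeping frames in memory rather than writing an image file per frame.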

any ideas?

* something cool, https://github.com/tesseract-ocr/tesseract
might need to test how many small rectangles and how many png files it can process per second.

* now, how to grab the screen into memory and pass it to tesseract?

* using MODI (Microsoft Office Document Imaging, installable from the free SharePoint Designer 2007) to ocr from tiff
https://community.spiceworks.com/how_to/30155-how-to-deploy-modi-automatically-without-ms-office-licenses-or-media
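One possible way to hand an in-memory frame to tesseract without writing a file is sketched below. It assumes the capture step (not shown) already produced a top-down 32-bit BGRA buffer, for example from GDI `BitBlt` plus `GetDIBits`, and that the `tesseract` command-line tool is on PATH and accepts `stdin` / `stdout` in place of file names. The helper names `bgra_to_pgm` and `ocr_pgm` are made up for this example.

```python
import shutil
import subprocess


def bgra_to_pgm(bgra: bytes, width: int, height: int) -> bytes:
    """Convert a top-down 32-bit BGRA buffer to a binary PGM (P5) image.

    ASSUMPTION: the buffer layout is B, G, R, A per pixel, rows top to bottom.
    The per-pixel Python loop is for illustration only; it is far too slow
    for a full frame and would be a tight native loop in practice.
    """
    if len(bgra) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    gray = bytearray(width * height)
    for i in range(width * height):
        blue, green, red = bgra[4 * i], bgra[4 * i + 1], bgra[4 * i + 2]
        # Integer approximation of luma: 0.299 R + 0.587 G + 0.114 B
        gray[i] = (299 * red + 587 * green + 114 * blue) // 1000
    return b"P5\n%d %d\n255\n" % (width, height) + bytes(gray)


def ocr_pgm(pgm: bytes) -> str:
    """Pipe an in-memory image to the tesseract CLI and return its text."""
    exe = shutil.which("tesseract")
    if exe is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        [exe, "stdin", "stdout"],   # read image from stdin, write text to stdout
        input=pgm,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Starting one process per rectangle has a noticeable cost, so for many small rectangles per second linking against the tesseract library directly is probably the better fit; the pipe version is mainly useful for the throughput test mentioned above.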

revolution 26 May 2016, 08:43
Why are you grabbing screen data? Rather than going through all the steps to transfer the images to screen and back again into main memory, why not just do it directly without the intermediate stage?

Why are you OCRing? Rather than rendering the characters to pixels and back again into characters, why not just take the data directly without the intermediate stage?

Why are you storing in a database? Rather than taking data from a database, rendering to pixels, and sending to screen memory, and then doing the whole process in reverse, why not just take the database directly without the intermediate stages?

sleepsleep 26 May 2016, 15:46
hi revolution,

there are 2 sets of data: the first is output by an activex control in the ie browser, the second is output by a java applet.

i am thinking of using ocr so that one method can grab the data from both places. i am not really sure how to turn rendered pixels back into characters, and i don't know how to deal with anti-aliased pixels or how to reverse them.
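On the anti-aliasing worry: OCR engines do not try to reverse the rendering. They normally reduce the image to black and white first, so the gray edge pixels simply fall on one side of a threshold. Tesseract does its own thresholding internally, so the step below is optional. It is a minimal sketch of a fixed-threshold binarization over 8-bit grayscale pixels, and the cutoff of 128 is an arbitrary assumption.

```python
def binarize(gray: bytes, threshold: int = 128) -> bytes:
    """Map 8-bit grayscale pixels to pure black (0) or white (255).

    ASSUMPTION: a fixed cutoff is good enough; real screen text may need a
    different or adaptive threshold.
    """
    # Build a 256-entry lookup table once, then apply it to every pixel.
    table = bytes(255 if value >= threshold else 0 for value in range(256))
    return gray.translate(table)
```

For example, `binarize(bytes([0, 100, 128, 200, 255]))` gives the byte values 0, 0, 255, 255, 255.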

i need to store the ocr output in a database and send it out through the telegram bot api.
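The storage and notification step could look roughly like this. It is a sketch under several assumptions: SQLite is used as the database (the post does not name one), the table name `ocr_result` and its columns are invented for the example, and the bot token and chat id are placeholders. The only Telegram detail relied on is the `sendMessage` method with its `chat_id` and `text` parameters.

```python
import sqlite3
import time
import urllib.parse
import urllib.request


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) result table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ocr_result ("
        "id INTEGER PRIMARY KEY, captured_at REAL NOT NULL, "
        "source TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def store_result(conn, source, text, captured_at=None):
    """Insert one OCR result and return its row id."""
    if captured_at is None:
        captured_at = time.time()
    cur = conn.execute(
        "INSERT INTO ocr_result (captured_at, source, text) VALUES (?, ?, ?)",
        (captured_at, source, text),
    )
    conn.commit()
    return cur.lastrowid


def build_send_message(token, chat_id, text):
    """Build (but do not send) a Telegram Bot API sendMessage POST request."""
    url = "https://api.telegram.org/bot%s/sendMessage" % token
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text})
    return urllib.request.Request(url, data=body.encode("ascii"))


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Typical use would be `store_result(conn, "activex", text)` followed by `send(build_send_message(token, chat_id, text))`; keeping the build and send steps separate makes it easy to check the request without touching the network.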

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
