Screenshot to text recognition OCR

Dimlos

Well-known member
I found out how to do OCR for Portuguese.
In the example, the results are displayed on the screen, but you can also export the results to a file or write a script to run it from MacroDroid via Termux:Tasker.
If your screenshots are stored under DCIM, or the folder name is in another language, adjust the paths below accordingly.

1. Install Termux from F-Droid
2. apt update
3. apt upgrade
4. pkg install tesseract
5. termux-setup-storage
6. Download language files from https://github.com/tesseract-ocr/tessdata_best (por.traineddata)
7. cd ~/storage/downloads
8. cp por.traineddata /data/data/com.termux/files/usr/share/tessdata
9. cd
10. Take a screenshot
11. tesseract ~/storage/pictures/Screenshots/Screenshot_20230614-193424.png stdout -l por+eng
(if the file name is Screenshot_20230614-193424.png)
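
For exporting the result to a file instead of showing it on the screen, a minimal sketch could look like the following. The function name ocr_to_file, the default language list, and the example output path are assumptions made for illustration; only the tesseract invocation itself is taken from step 11.

```shell
# Sketch: run tesseract on one image and save the recognized text to a file.
# ocr_to_file is a hypothetical helper name, not part of Termux or tesseract.
ocr_to_file() {
    image=$1                 # path to the screenshot
    out=$2                   # text file that receives the OCR result
    langs=${3:-por+eng}      # languages, same format as step 11

    if [ ! -r "$image" ]; then
        echo "cannot read image: $image" >&2
        return 1
    fi

    # "stdout" makes tesseract print the text instead of creating its own file
    tesseract "$image" stdout -l "$langs" > "$out"
}

# Example call (paths are assumptions, adjust them to your device):
# ocr_to_file ~/storage/pictures/Screenshots/Screenshot_20230614-193424.png ~/ocr_result.txt
```

Saved as a script, the same function could be called from MacroDroid via Termux:Tasker, which is the route mentioned at the top of this post.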
 

KingKwab

New member
Hi All!

I followed the steps Dimlos posted, noticed that my images are saved to DCIM, and changed the paths accordingly. However, I still cannot get the text to appear in the local variable.

I believe it has something to do with the OCR patching, but I have no idea what I would need to change to correct this. I've attached as many screenshots as I can to show my setup.

Thank you in advance!
 

Attachments

  • Screenshot_2023-11-24-04-16-09-527_com.arlosoft.macrodroid.jpg
  • Screenshot_2023-11-24-04-22-54-241_com.arlosoft.macrodroid.jpg
  • Screenshot_2023-11-24-04-23-03-828_com.arlosoft.macrodroid.helper.jpg
  • Screenshot_2023-11-24-04-23-39-358_com.arlosoft.macrodroid.jpg
  • Screenshot_2023-11-24-04-23-45-604_com.balda.touchtask.jpg

Dimlos

Well-known member
For Xiaomi, the path is different, so please refer to the screenshot to set it up.
You need to grant access to the DCIM directory in Grant access to primary storage in TouchTask.
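
On the Termux side, the same path difference can be handled by checking a few candidate folders and using the first one that exists. This is only a sketch: find_screenshot_dir is a made-up helper name, and the two candidate folders in the example call are assumptions based on the Pictures and DCIM locations mentioned in this thread.

```shell
# Sketch: print the first directory from the argument list that exists.
# find_screenshot_dir is a hypothetical helper name.
find_screenshot_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "no screenshot directory found" >&2
    return 1
}

# Example call (candidate paths are assumptions, adjust them to your device):
# find_screenshot_dir ~/storage/pictures/Screenshots ~/storage/dcim/Screenshots
```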
 

Attachments

  • TouchTask.jpg

KingKwab

New member
This was 100% it, along with completing the full setup of my Helper (I had done most of the steps but failed to realize that all of them needed to be completed; it was very late, LOL).

I really appreciate your assistance on this. I've been able to create follow-up tasks because this now works! I do, however, have a follow-up question that I have yet to solve.

I understand that TouchTask will not directly read graphics unless they are screenshotted and passed through the OCR action in your shell script, and this works perfectly for the graphics I need it to read. My question is: is there a way to create a trigger in MacroDroid that scans the page, checks for a change, and, if a change is found, runs the set of actions below?

  1. Scan for a trigger parameter (a pixel, text, image, or popup) in a specific location <- This needs to be a constant
  2. Run Actions
  3. When Actions complete, End
Step 1 above is what I cannot seem to solve correctly. I've tried using the Screen Update feature of TouchTask, but either this is not the correct choice or I am entering the wrong values. Any assistance would be greatly appreciated!
 

sampleuserhere

Active member
TouchTask has a Query Image action, which you can use to crop an image, extract a pixel color, and compare two images. So a couple of simple comparisons don't seem totally impossible.

Here are some attempts made by a Redditor in r/AutomateUser (read the comments):

https://www.reddit.com/r/AutomateUser/comments/115hlw7
Anyway, it may be better if you "disclose" what your actual end goal is. It helps the users here give you a hand once they know exactly what you're trying to do.

 

KingKwab

New member
Fair point! I was aiming to create a script to automate an idle game that I play. The sequences are pretty straightforward, and the only variation I need is manual input when changing characters (something I am not trying to automate at all, lol).

Since the game is an idle/tower climb, I was hoping to use OCR to find any of the triggers that would begin a new battle. After hours of reading and testing, I figured it's probably best if I just use a Wait Action at the end of my macro, then loop the macro to take another screenshot and control which path is taken through If statements.
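
The branching part of such a loop could be sketched in shell like this, assuming the screenshot itself is still taken by MacroDroid and its OCR text is piped in. The function name classify_screen, the trigger words "battle" and "victory", and the branch names are all placeholders invented for the example, not anything from the game or from MacroDroid.

```shell
# Sketch: read OCR text on stdin and print which branch the macro should take.
# classify_screen, the trigger words and the branch names are hypothetical.
classify_screen() {
    text=$(cat)
    if printf '%s\n' "$text" | grep -qi 'battle'; then
        echo start_battle
    elif printf '%s\n' "$text" | grep -qi 'victory'; then
        echo collect_reward
    else
        echo wait
    fi
}

# Example use inside a polling loop (the path in $shot is an assumption):
# branch=$(tesseract "$shot" stdout -l eng 2>/dev/null | classify_screen)
# echo "$branch"
```

The printed branch name can then be read back into a MacroDroid variable and checked with If conditions, with the Wait action providing the delay between rounds.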
 

cza93

New member
Great! Worked perfectly for me. Tks!

I had already given up. hehe
 

cza93

New member
I have a problem similar to KingKwab's trigger question. I need to get the coordinates of an icon on the screen (reading the text on the screen returns nothing for it) and then click on it.

First, something like an image recognition trigger;
then identify the x/y coordinates;
then click on the result.
 

HCC

Member
Before using the Shell Script command
ls -rt /storage/emulated/0/Pictures/Screenshots | tail -n 1
go to:
(Android 13) Settings / [search for: permissions] / App permissions manager / photos and videos / MacroDroid
(Android 7) Settings / [search for: permissions] / Storage / MacroDroid

and allow the permission for MacroDroid.
Without the permission, the Shell Script result variable will always be an empty string.
🤔

ls -rt | tail -n 1: how to list just the last (most recently modified) file
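
Because a missing permission only shows up as an empty result, it can help to make the script say so explicitly. A minimal sketch, where newest_file is a made-up helper name wrapped around the ls -rt | tail -n 1 command above:

```shell
# Sketch: print the full path of the most recently modified file in a folder,
# or an error message when the folder is unreadable or empty.
# newest_file is a hypothetical helper name.
newest_file() {
    dir=$1
    if [ ! -d "$dir" ] || [ ! -r "$dir" ]; then
        echo "cannot read $dir (check the storage permission)" >&2
        return 1
    fi

    # ls -rt sorts oldest first, so the last line is the newest file
    name=$(ls -rt "$dir" | tail -n 1)
    if [ -z "$name" ]; then
        echo "no files in $dir" >&2
        return 1
    fi
    printf '%s/%s\n' "$dir" "$name"
}

# Example call:
# newest_file /storage/emulated/0/Pictures/Screenshots
```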
 

HCC

Member
Even forcing the language to English (mine is Portuguese), it doesn't work.

I think the string is truncated (post #9, 2nd image). Instead of
content://com.android.externalst
it must be:
content://com.android.externalstorage.documents/document/primary%3APictures%2FScreenshots%2F%image

Stack Overflow: "what is com android externalstorage"

Edit: If the path to your OCR image file is:
/storage/emulated/0/Pictures/Screenshots/test.png
then the TouchTask path would be:
content://com.android.externalstorage.documents/document/primary%3APictures%2FScreenshots%2Ftest.png

Here %image stands for a variable, and test.png for a fixed file name.
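
The mapping in the edit above can be written as a small shell helper. This is a sketch only: to_document_uri is a made-up name, it assumes the file lives under /storage/emulated/0/, and it encodes nothing but the slashes, so file names containing spaces or other special characters would need extra handling.

```shell
# Sketch: turn a primary-storage file path into the content:// form shown above.
# to_document_uri is a hypothetical helper name.
to_document_uri() {
    path=$1
    root=/storage/emulated/0/

    case $path in
        "$root"*) rel=${path#"$root"} ;;   # e.g. Pictures/Screenshots/test.png
        *) echo "not under $root: $path" >&2; return 1 ;;
    esac

    # Only "/" is encoded here (as %2F); ":" after "primary" becomes %3A
    encoded=$(printf '%s' "$rel" | sed 's|/|%2F|g')
    printf 'content://com.android.externalstorage.documents/document/primary%%3A%s\n' "$encoded"
}

# Example call:
# to_document_uri /storage/emulated/0/Pictures/Screenshots/test.png
```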
 