You should be comfortable with the command line to use this open source OCR software.
First, we are going to install Tesseract on Ubuntu.
Open your terminal and run the following command:
root@nur-HP:~# apt-get install tesseract-ocr
This installs Tesseract OCR on your Ubuntu system. Next, install the language packages you need. You do not need to install the English language package, because it is installed along with Tesseract.
Here, I am going to install the Bangla language package. The general form is:
apt-get install tesseract-ocr-[lang]
root@nur-HP:~# apt-get install tesseract-ocr-ben (This command installs the Bangla language package)
If you would like to install all language packages, use the following command:
root@nur-HP:~# apt-get install tesseract-ocr-all
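Before moving on, it can help to confirm that a language package was actually picked up. The sketch below assumes a Tesseract version that supports the --list-langs option; the helper name lang_installed is only illustrative and is not part of Tesseract.

```shell
# Hypothetical helper (not part of Tesseract): reads a language listing
# on stdin and succeeds if the given language code appears on a line by itself.
lang_installed() {
    grep -qx -- "$1"
}

# Example use (requires Tesseract to be installed). 2>&1 is added because
# some versions print the listing to stderr:
#   tesseract --list-langs 2>&1 | lang_installed ben && echo "Bangla pack found"
```

If the check prints nothing, the package for that language code is probably not installed yet.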
The installation is complete. Now we are going to use it. The general form is:
tesseract [image_path] [file_name]
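Sample command (Tesseract adds the .txt extension, so the recognized text is saved to /home/nurahammad/Desktop/test.txt):
root@nur-HP:~# tesseract /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg /home/nurahammad/Desktop/test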
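The sample command uses the default language, English. To recognize Bangla text with the package installed earlier, the language code can be passed with the -l option. The wrapper below is a minimal sketch; the function name ocr_to_file and its argument order are assumptions made for illustration.

```shell
# Hypothetical wrapper: OCR one image into <output_base>.txt.
#   $1 = image path
#   $2 = output base name (Tesseract appends .txt)
#   $3 = language code, defaults to "eng" when omitted
ocr_to_file() {
    tesseract "$1" "$2" -l "${3:-eng}"
}
```

For example, ocr_to_file /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg /home/nurahammad/Desktop/test ben would run the same sample command with the Bangla language data.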
If you would like to see the result in the terminal instead, use the command below:
tesseract [image_path] stdout
root@nur-HP:~# tesseract /home/nurahammad/Dropbox/ForOCR/IMG_20171201_161244.jpg stdout
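For a whole folder of scanned images, the single-file command can be wrapped in a loop. This is only a sketch under stated assumptions: the function name ocr_folder is hypothetical, only files ending in .jpg are processed, and the folder paths are supplied by the caller.

```shell
# Hypothetical helper: OCR every .jpg in a folder into a separate output folder.
#   $1 = input folder, $2 = output folder, $3 = language code (defaults to "eng")
ocr_folder() {
    mkdir -p "$2" || return 1
    for image in "$1"/*.jpg; do
        # Skip the unexpanded pattern when the folder has no .jpg files.
        [ -e "$image" ] || continue
        name=$(basename "$image" .jpg)
        # Tesseract writes the recognized text to "$2/$name.txt".
        tesseract "$image" "$2/$name" -l "${3:-eng}"
    done
}
```

Each image gets its own text file named after the image, which keeps the OCR output easy to match back to the original scan.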
I hope this helps you process your repository or digital library files.