Use Bluebeam OCR to make scanned text selectable and searchable

This post is part of a tutorial on how to turn scanned papers into navigable PDF documents.

After you've scanned your paper documents into PDF, you will want to make the text selectable and searchable. The good news is you can do this with the click of a button using Bluebeam Revu's OCR (optical character recognition) feature. OCR analyzes the pixels of each scanned page to identify any text on it. You can run OCR on individual PDFs, or on an entire folder of PDFs at once through the Batch menu. OCR improved significantly with the release of Revu 12, so it now runs faster and more accurately than in earlier versions.

Before we get started, keep in mind that you will need Revu eXtreme to use this feature. You can see a detailed feature comparison of the Revu versions here.

Another important thing to note is that while OCR is very good at identifying most fonts, it may have problems with some unusual or decorative typefaces (e.g., cursive script or Old English fonts). If your scanned PDFs are very low resolution and grainy, that can reduce its accuracy as well. You may not always have control over what paper documents you get to work with, but whenever possible try to stick to common fonts and scan at a moderate resolution.

With those disclaimers out of the way, let's begin:

1. To run OCR on a single PDF, first open it up. Go to the Document menu, where you will see the OCR button. Click on that.

2. You will then be taken to a window where you can adjust the OCR settings to your liking, such as running OCR on a specific page range or the entire document. A handy setting to take note of is Max Vector Size. Revu will automatically disregard anything over that size when running OCR, which makes OCR on drawings go much faster.

3. If you would like to run OCR on an entire folder of PDFs, you can run it as a batch process. Go to the File menu, click the Batch icon and the first option will be OCR. 

4. After selecting batch OCR, you will be taken to the next window where you can select your desired files. You can simply Add Open Files, or click Add to add other groups of files or entire folders to the batch OCR process. You can also adjust settings to run OCR on specific page ranges, odd/even pages, or only pages of a certain orientation (landscape or portrait).

5. After running standard or batch OCR, all of that scanned text is selectable and searchable, making it far more useful for us. If you would like to search a PDF for a word or phrase, you can jump quickly to the Search tab by using the Tab Access menu. The Tab Access menu is accessible by clicking on the orange down arrow found in the top left corner of Revu's interface. This is also a great way to quickly access any other tab you may need.

6. In the Search tab, input your desired text into the search field and click Search. Your results will be displayed at the bottom, and you can select them either one at a time or many at once.

7. Once again, you can do this with an entire folder of PDFs if you'd like. The process is the same, except this time change the Search In field to Folder instead of Current Document. After selecting Folder, you will be prompted to select the folder you want to search.

In addition to simply finding instances of a certain phrase, you can also apply various settings to one, many, or all of the found instances. Examples would be hyperlinking every instance, highlighting them, or redacting all of them (such as removing social security numbers).
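
To make the redaction example a little more concrete, here is a minimal Python sketch of the kind of pattern matching involved in finding social security numbers in text that OCR has made searchable. This is not how Revu works internally and it does not use any Bluebeam API. The names SSN_PATTERN, find_ssn_like, and mask_ssn_like are made up for illustration, and the pattern assumes the common hyphenated 3-2-4 digit layout.

```python
import re

# Assumption: SSNs appear in the hyphenated form 123-45-6789.
# Other layouts (spaces, no separators) would need a different pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text):
    """Return (start, end, matched_text) for every SSN-like string in text."""
    return [(m.start(), m.end(), m.group()) for m in SSN_PATTERN.finditer(text)]


def mask_ssn_like(text, mask="XXX-XX-XXXX"):
    """Return a copy of text with every SSN-like string replaced by mask."""
    return SSN_PATTERN.sub(mask, text)


# Hypothetical line of text, as it might read after OCR.
sample = "Applicant SSN: 123-45-6789. Phone: 555-123-4567."

print(find_ssn_like(sample))
print(mask_ssn_like(sample))
```

Note that the phone number in the sample is left alone because it does not fit the 3-2-4 layout. Masking a plain string like this only illustrates the matching step. To actually remove sensitive text from a PDF, use Revu's redaction feature as described above.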

 


Bohdee Staff

Author