ExtracText OCR For Confluence

Looking for the Confluence Cloud documentation? Visit here

ExtracText OCR For Confluence is an Atlassian Confluence app that brings image analysis and OCR (Optical Character Recognition) to Confluence attachments. More specifically, ExtracText extracts the text inside images and documents in formats such as png, jpg, jpeg, bmp, and pdf, and adds it to the Confluence search index. This lets users search for and find their attachments more easily. It also works for scanned documents. ExtracText for Confluence now also has an image editor that makes it easy to add a title and annotations to images. All your annotations are indexed as well, so you can search for them later.


  • Most of the text is identified and extracted from images unless the image is noisy

  • Support for up to 11 languages (more to come)

  • Text can be in light or dark colors

  • Text is extracted from any kind of screenshot, architecture diagram, flowchart, etc.

  • Extracted text is added to the Confluence search index, which makes the content very easy to find

  • Provides a way to view and copy all the text in any image with just one click

  • Comes with an image editor that allows for adding annotations easily


To see it in action on this Confluence page, hover over any of the images below and click the "ExtracText" button, or use the Confluence search bar at the top to search for any text that you can see inside the images. You can also visit the attachments page and click the "ExtracText" button to view the text.


Image 1 (Some standard text):

Image 2 (A web page with different font sizes):

Image 3 (A web page with dark and light text and graphics):


Image 4 (Some text with dark and light colors):

Image 5 (Scanned PDF document):
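
Because the extracted text is added to the Confluence search index, the same search can in principle be scripted as well as typed into the search bar. The sketch below only builds a search URL for the standard Confluence REST content search endpoint using CQL. It is an illustration, not part of ExtracText: the function name and the example host are hypothetical, and it assumes that text indexed by the app is matched by the CQL `text` field for attachments, which this page does not confirm.

```python
from urllib.parse import urlencode


def build_attachment_search_url(base_url: str, phrase: str, limit: int = 10) -> str:
    """Return a Confluence REST search URL for attachments matching `phrase`.

    Hypothetical helper: it only constructs the URL and does not call Confluence.
    """
    # Escape backslashes and double quotes so the phrase stays a single
    # CQL string literal.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')

    # Assumption: text extracted by the app is searchable through the
    # CQL `text` field on attachments.
    cql = f'type = attachment AND text ~ "{escaped}"'

    query = urlencode({"cql": cql, "limit": limit})
    return f"{base_url.rstrip('/')}/rest/api/content/search?{query}"


if __name__ == "__main__":
    # Example host is a placeholder; replace it with your own Confluence base URL.
    print(build_attachment_search_url("https://confluence.example.com", "quarterly revenue"))
```

The resulting URL can be opened by an authenticated client; how authentication is handled depends on your Confluence setup and is left out here.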