What is Docling?

Docling is an open source project and tool that converts documents into structured data a large language model (LLM) can use and learn from. 

Most organizational data exists in file formats that are hard for AI to read and extract data from. To use these documents in workflows like retrieval-augmented generation (RAG), fine-tuning, or agentic AI, they need to be converted into a format like Markdown, plain text, or JSON.

Simply put, Docling is a document-processing companion for generative AI. It bridges the gap between standard document formats and machine-centric data requirements. 

YouTube video: How Docling turns documents into usable AI data (12:22)

Docling helps ensure that AI can use PDFs, presentations, spreadsheets, audio and video files, and others to get more accurate and transparent results. Docling is more than simple text extraction—it understands layout, hierarchy, and complex elements like multipage tables, images, bullet points, and headers. 

Docling’s ability to transform files simplifies data preparation by turning chaotic, multiformat documents into clean, LLM-ready knowledge pipelines. It solves the “garbage in, garbage out” dilemma of modern AI workflows—where the quality of an input determines the quality of the output—by retaining structure with help from deep learning algorithms. 

Docling converts files like PDF (Portable Document Format), DOCX (Microsoft Word documents), XLSX (Excel spreadsheets), PPTX (PowerPoint presentations), HTML, and more into a format that’s more AI-friendly.
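As a minimal sketch of what that conversion can look like in Python, assuming the docling package is installed: DocumentConverter, convert, and export_to_markdown come from Docling's Python API, while SUPPORTED_SUFFIXES, convertible, and to_markdown are illustrative names invented for this example. The suffix set only mirrors the formats named above and is not Docling's full list.

```python
from pathlib import Path

# Illustrative subset of input types, taken from the formats named above.
# This is an assumption for the sketch, not Docling's full supported list.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".xlsx", ".pptx", ".html"}


def convertible(paths):
    """Keep only the paths whose extension is in the illustrative set."""
    return [p for p in map(Path, paths) if p.suffix.lower() in SUPPORTED_SUFFIXES]


def to_markdown(path):
    """Convert one document to Markdown with Docling (requires `pip install docling`)."""
    # Imported lazily so the helper above works without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(str(path))
    return result.document.export_to_markdown()
```

Calling to_markdown on each path returned by convertible gives one Markdown string per document, which can then be stored or passed to a model.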

Docling and RAG

For a RAG system to be accurate, it needs access to data and the context surrounding that data. Before Docling, it was difficult to retain a document’s layout when pulling data from it: columns and rows might get jumbled, or a footnote could get separated from its paragraph. Docling preserves the structure of documents, which helps reduce hallucinations and keeps a RAG system’s answers relevant.
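To illustrate why preserved structure matters for retrieval, here is a small sketch that splits Markdown output into chunks and keeps each chunk attached to the heading it sits under. It uses only the Python standard library and is not Docling's own chunking code; the function name chunk_by_heading is an assumption made for this example.

```python
import re


def chunk_by_heading(markdown):
    """Split Markdown into chunks, each carrying the heading it sits under."""
    chunks = []
    heading = ""
    body = []

    def flush():
        # Store the text collected so far, if any, under the current heading.
        text = "\n".join(body).strip()
        if text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(1).strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks
```

Because every chunk carries its heading, a retriever can return a passage together with the section it belongs to, instead of an isolated fragment.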

Docling and fine-tuning

Fine-tuning requires clean and consistently formatted examples of text. Random page numbers, repetitive headers, and broken line wraps can create confusion in data ingestion and errors in output. Think of Docling as an automated data-cleaning engine: It strips out the visual noise and creates clean, organized data for the model to ingest. 
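The kind of noise described here can be shown with a toy cleaner. This sketch is not how Docling works internally (Docling relies on layout models rather than text rules); it only demonstrates, with the standard library, what removing page numbers, running headers, and broken line wraps means. The function name clean_pages and its rules are assumptions for illustration.

```python
import re
from collections import Counter


def clean_pages(pages):
    """Remove page numbers and repeated headers, then rejoin wrapped lines."""
    # A line that appears on every page is treated as a running header or footer.
    counts = Counter()
    for page in pages:
        counts.update({line.strip() for line in page.splitlines()})
    repeated = {
        line for line, n in counts.items()
        if line and len(pages) > 1 and n == len(pages)
    }

    kept = []
    for page in pages:
        for line in page.splitlines():
            text = line.strip()
            # Skip blanks, running headers, and bare page numbers like "7" or "Page 7".
            if not text or text in repeated or re.fullmatch(r"(Page\s+)?\d+", text):
                continue
            kept.append(text)

    # Joining with spaces undoes hard line wraps inside sentences.
    return " ".join(kept)
```

Rules like these are brittle: a table cell that contains only a number would be dropped as a page number, which is one reason layout-aware parsing is preferable to text heuristics.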

Docling and MCP

Model Context Protocol (MCP) is an open source protocol that enables 2-way connection and standardized communication between AI applications and external services. It works like a universal translator between LLMs and tools. 

Docling’s official MCP server (the component that converts the document) can plug into your favorite MCP client (the AI application that requests access to external data or resources, like Claude Desktop, LM Studio, or Cursor). This connection lets your application use Docling’s capabilities, allowing you to extract information from your documents and convert data. No matter what LLM or agent you use, if it supports tool calling, you can use the Docling MCP server. 
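For a sense of what travels between client and server, MCP messages are JSON-RPC 2.0, and a client invokes a server tool with a tools/call request. The sketch below builds such a request in Python. The tool name convert_document and the source argument are hypothetical placeholders, not confirmed names from the Docling MCP server; a real client would discover the available tools and their argument schemas with a tools/list request.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build the JSON-RPC 2.0 message an MCP client sends to invoke a server tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# "convert_document" and "source" are hypothetical names used only for this sketch.
message = tool_call_request(1, "convert_document", {"source": "report.pdf"})
```

In practice the MCP client library handles this framing, so application code rarely builds these messages by hand.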

Docling processes each document step by step, moving it through a single pipeline:

  1. You feed your original PDF file to Docling.
  2. Docling parses the document, dissecting it to figure out what the different parts are. This is when Docling recognizes large, bold text as a header, a block of words as a paragraph, and a grid of numbers as a table. During this layout analysis, each part is labeled with a DocTag—a specialized, AI-readable code. 
  3. The document moves through a pipeline that applies Optical Character Recognition (OCR), table structure recognition, and further layout analysis. This portion of the pipeline is customizable, so you can add or replace models and introduce additional configuration parameters. 
  4. Docling gathers everything it learned about the document and generates a digital folder. A post-processing model polishes the document, organizing the pieces together in a clean Markdown or JSON file. This final file is known as a Docling Document. 
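The customizable part of the pipeline can be sketched in Python as follows, assuming the docling package is installed. PdfPipelineOptions, PdfFormatOption, InputFormat, and the do_ocr and do_table_structure switches come from Docling's Python API; the wrapper functions build_pdf_converter and convert_pdf are names invented for this example.

```python
def build_pdf_converter(do_ocr=True, do_table_structure=True):
    """Return a Docling converter with OCR and table recognition toggled for PDFs."""
    # Imported lazily; these modules require `pip install docling`.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_ocr = do_ocr
    options.do_table_structure = do_table_structure

    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )


def convert_pdf(path, **toggles):
    """Run the pipeline on one PDF and return Markdown plus a JSON-ready dict."""
    document = build_pdf_converter(**toggles).convert(str(path)).document
    return document.export_to_markdown(), document.export_to_dict()
```

For example, convert_pdf("report.pdf", do_ocr=False) would skip OCR for a PDF that already contains a text layer, which typically shortens processing time.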

Does Docling support OCR and scanned documents?

Yes, Docling fully supports OCR, a technology that converts images of text into editable, searchable, and machine-readable data. OCR analyzes the light and dark areas of an image to identify letters and numbers. For example, if you take a picture of a receipt to upload for an expense report, OCR can read the numbers.

Docling vs. OCR 

OCR walked so Docling could run. Docling uses OCR as a foundation but has more advanced capabilities. 

OCR focuses on turning pixels into text. It can recognize numbers and letters but has little understanding of layout, hierarchy, and formatting. For example, it sometimes mixes up the columns and rows of a table.

Docling takes OCR’s capabilities a step further by dissecting documents with a high understanding of layout, hierarchy, and formatting. Docling can recognize headers, paragraphs, charts, tables, equations, and lists. It keeps this organization in place when it creates an output as a Markdown or JSON file. 

Docling uses OCR in instances where raw text extraction is needed, like if the document is a blurry photocopy or a handwritten note. 
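The difference between plain text recognition and layout-aware reading can be shown with a toy example. The word boxes and coordinates below are made up to mimic a two-column page, and column_split stands in for what a real layout model would detect on its own; none of this is Docling code.

```python
# Each word box is (x, y, text): x is the left edge, y is the distance from the page top.
# The coordinates are invented to mimic a two-column page.
BOXES = [
    (0, 0, "Docling"), (300, 0, "Tables"),
    (0, 20, "keeps"), (300, 20, "stay"),
    (0, 40, "columns."), (300, 40, "intact."),
]


def naive_order(boxes):
    """Read strictly top to bottom, left to right, ignoring columns."""
    return " ".join(text for _, _, text in sorted(boxes, key=lambda b: (b[1], b[0])))


def column_order(boxes, column_split):
    """Read the left column fully, then the right column."""
    left = [b for b in boxes if b[0] < column_split]
    right = [b for b in boxes if b[0] >= column_split]
    ordered = sorted(left, key=lambda b: b[1]) + sorted(right, key=lambda b: b[1])
    return " ".join(text for _, _, text in ordered)
```

Here naive_order(BOXES) interleaves the two columns into "Docling Tables keeps stay columns. intact.", while column_order(BOXES, 150) yields "Docling keeps columns. Tables stay intact."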

Can Docling run locally?

Yes, Docling runs entirely locally on your own machine by default. Because the data never leaves your machine, you can use Docling to parse sensitive documents. Docling’s ability to run locally is particularly helpful for keeping costs down. Because it uses your computer hardware rather than cloud-processing tools, you can feed Docling thousands of pages without running up a monthly bill.

Is Docling suitable for air-gapped environments?

Yes, Docling works well in highly secure, air-gapped environments. However, you must download Docling’s AI models while connected to the internet, transfer them to your secure network, and then set Docling to run in “offline mode.” Once that’s set up, you have an enterprise-grade document parser completely locked down from the outside world.
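A minimal sketch of the offline setup, assuming the docling package is installed and the model files have already been copied to a local folder (Docling's documentation describes a docling-tools models download command for fetching them on a connected machine). The artifacts_path option belongs to Docling's pipeline options; the function name offline_pdf_converter and the folder check are assumptions added for this example.

```python
from pathlib import Path


def offline_pdf_converter(artifacts_dir):
    """Build a converter that loads models from a local folder instead of the network."""
    artifacts = Path(artifacts_dir)
    # Fail early with a clear message if the transferred model folder is missing.
    if not artifacts.is_dir():
        raise FileNotFoundError(f"Model folder not found: {artifacts}")

    # Imported lazily; these modules require `pip install docling`.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions(artifacts_path=str(artifacts))
    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
```

Checking the folder before building the converter surfaces a missing transfer immediately, rather than as a failed download attempt inside the secure network.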

IBM Research originally created Docling as an internal tool. In April of 2025, IBM donated Docling to the Linux® Foundation AI & Data Foundation. This means Docling is free to use, vendor-neutral, and will evolve as the global open source community contributes to it.

Is Docling safe to use?

Docling is part of the Linux Foundation, a governing organization that provides legal protections and deploys cybersecurity tools to continuously scan code for vulnerabilities. This dual layer of legal and technical security provides peace of mind even for regulated environments like healthcare and financial industries.

Any organization with scattered documentation can benefit from using Docling. Consider these use cases:

Financial services
Many quarterly earnings reports, regulatory filings, and forms are packed with multicolumn text and sprawling tables. Docling can automate financial data extraction to isolate tables and convert them into Markdown grids. Instead of analysts spending hours manually copying and pasting numbers into spreadsheets, an AI agent can read the Docling Document output and instantly calculate metrics.

Legal document analysis and discovery
Law firms and legal departments manage thousands of contracts, files, and documents that often include scanned images, footnotes, and sidenotes that can confuse standard text readers. Docling meticulously tracks document hierarchy, making sure footnotes stay tied to their original sentences and remain contextually sound. This better prepares data for RAG, so lawyers can use an AI chat interface to ask complex questions and get accurate answers. 

Scientific research
Pharmaceutical companies and research institutions need to stay up-to-date with thousands of scientific journals, medical whitepapers, and patent filings to accelerate their research and development. Docling can ingest complex multicolumn academic literature, analyze mathematical formulas, and match figures to specific captions. This creates more accurate training data for fine-tuning domain-specific LLMs. As a result, medical AI tools can scan millions of pages of research simultaneously to spot new patterns in drug interactions or chemical structures.

Government and defense agencies
Government departments handle highly classified manuals, contracts, and documentation. This means they can’t upload these files to public cloud AI services like ChatGPT for analysis. Docling runs locally and supports air-gapped environments, allowing government agencies to build security-focused, in-house AI tools that scan and extract key technical information while safeguarding classified data. 
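To make the financial services example concrete, the sketch below reads a Markdown table of the kind Docling can produce and computes a simple metric from it. The figures are invented, the parser handles only simple pipe tables, and parse_markdown_table is a name made up for this example rather than part of Docling.

```python
def parse_markdown_table(table):
    """Turn a simple Markdown pipe table into a list of row dictionaries."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator line
    return [dict(zip(header, row)) for row in body]


# Made-up figures for illustration only.
TABLE = """
| Quarter | Revenue | Costs |
|---------|---------|-------|
| Q1      | 120     | 90    |
| Q2      | 150     | 100   |
"""

rows = parse_markdown_table(TABLE)
margins = {r["Quarter"]: int(r["Revenue"]) - int(r["Costs"]) for r in rows}
```

Once the table survives extraction with its rows and columns intact, downstream calculations like the margins dictionary become a few lines of code instead of manual copying.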

Before you download Docling, make sure you have installed Python on your machine and have enough disk space to store Docling’s AI layout models and PyTorch, an open source deep learning framework.

Once that’s in place, installing Docling takes only 1 terminal command (for example, pip install docling). From there, you can link it to your favorite AI chat interface and start dragging and dropping documents into your chat window for Docling to get to work.

To really appreciate the simplicity of Docling, it’s worth noting how time-consuming it used to be to extract data from a multiformat document. The process would include writing hundreds of lines of code to direct the processing logic, e.g., “If file ends in .pdf, do this. If it has an image, do that.” Thanks to Docling and MCP, this process has become much easier.

Red Hat® AI is built for fast, flexible, and efficient inference through its vLLM-powered server. It reliably connects models to your data to unify the customization and development of specialized agents on a single platform. Together, Red Hat AI and Docling provide the tools needed to turn complex documents into useful intelligence. Since both are built on an open source foundation, you get full control of AI workflows end to end at any scale.

The Red Hat AI portfolio includes Red Hat AI Enterprise, a platform for deploying, managing, and scaling AI inference, agentic AI workflows, and AI-powered applications on any infrastructure.
