Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.verlon.ai/llms.txt

Use this file to discover all available pages before exploring further.

The ocr() method extracts text, tables, and structured data from images and PDFs using optical character recognition.

Basic OCR

Extract text from an image URL:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/document.jpg'
  }
});

console.log('Extracted text:', response.ocr.pages[0].markdown);
console.log('Pages:', response.ocr.pages.length);
console.log('Cost:', response.cost);

Input Methods

Image URL

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/receipt.jpg'
  }
});

Document URL (PDFs)

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/invoice.pdf'
  }
});

// Multi-page PDF
response.ocr.pages.forEach((page, i) => {
  console.log(`Page ${i + 1}:`, page.markdown);
});

Base64 Images

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    base64: 'data:image/png;base64,iVBORw0KGgoAAAANS...',
    mimeType: 'image/png'
  }
});

Table Extraction

Extract tables in different formats:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/financial-report.pdf',
    tableFormat: 'markdown'  // 'markdown' or 'html'
  }
});

// Access tables
response.ocr.pages.forEach(page => {
  if (page.tables) {
    page.tables.forEach(table => {
      console.log('Table (HTML):', table.html);
    });
  }
});

Table Format Options

FormatUse Case
markdownSimple display, documentation
htmlWeb rendering, styling
Use markdown format for quick display or LLM processing. Use html for web applications where you need custom styling.

Image Extraction

Extract embedded images from documents:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/presentation.pdf',
    includeImageBase64: true
  }
});

// Access embedded images
response.ocr.pages.forEach(page => {
  if (page.images) {
    page.images.forEach(img => {
      console.log('Image ID:', img.id);
      console.log('Image data:', img.base64);
      console.log('Position:', img.topLeftX, img.topLeftY);
    });
  }
});

Headers and Footers

Extract headers and footers:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/contract.pdf',
    extractHeader: true,
    extractFooter: true
  }
});

response.ocr.pages.forEach(page => {
  console.log('Header:', page.header);
  console.log('Footer:', page.footer);
  console.log('Main content:', page.markdown);
});

Parameters

Request Parameters

ParameterTypeDefaultDescription
documentUrlstring-PDF document URL
imageUrlstring-Image URL
base64string-Base64-encoded image/document
mimeTypestring-MIME type for base64
tableFormat'markdown' | 'html'markdownTable output format
includeImageBase64booleanfalseExtract embedded images
extractHeaderbooleanfalseExtract page headers
extractFooterbooleanfalseExtract page footers

Response

interface OCRResponse {
  id: string;                    // Request ID
  model: string;                 // Model used
  ocr: OCROutput;                // Extracted data
  cost: number;                  // Request cost in USD
  latency: number;               // Response time in ms
}

interface OCROutput {
  pages: OCRPage[];              // Extracted pages
  model: string;                 // OCR model used
  documentAnnotation?: object;   // Document metadata
  usageInfo?: {
    pagesProcessed?: number;
    docSizeBytes?: number;
  };
}

interface OCRPage {
  index: number;                 // Page number
  markdown: string;              // Extracted text
  images?: Array<{
    id: string;
    base64?: string;
    topLeftX?: number;
    topLeftY?: number;
    bottomRightX?: number;
    bottomRightY?: number;
  }>;
  tables?: Array<{
    id: string;
    html?: string;
  }>;
  hyperlinks?: Array<{
    text: string;
    url: string;
  }>;
  header?: string | null;
  footer?: string | null;
  dimensions?: {
    width: number;
    height: number;
    dpi?: number;
  };
}

Best Practices

1. Use High-Quality Images

// Good ✅ - Clear, high-resolution scan
{
  imageUrl: 'https://example.com/high-res-scan.jpg'
}

// Less effective - Low resolution, blurry
{
  imageUrl: 'https://example.com/phone-photo.jpg'
}

2. Choose Appropriate Input Method

// Good ✅ - PDF for multi-page documents
{
  documentUrl: 'https://example.com/report.pdf'
}

// Good ✅ - Image URL for single pages
{
  imageUrl: 'https://example.com/receipt.jpg'
}

// Good ✅ - Base64 for uploaded files
{
  base64: uploadedImageData,
  mimeType: 'image/jpeg'
}

3. Extract Only What You Need

// Good ✅ - Minimal extraction for better performance
{
  documentUrl: 'url',
  tableFormat: 'markdown'
}

// Less efficient - Extracting everything when not needed
{
  documentUrl: 'url',
  tableFormat: 'html',
  includeImageBase64: true,
  extractHeader: true,
  extractFooter: true
}

4. Always Handle Errors

try {
  const response = await verlon.ocr({
    gateId: 'your-gate-id',
    data: {
      documentUrl: 'https://example.com/doc.pdf'
    }
  });

  return response.ocr.pages[0].markdown;
} catch (error) {
  console.error('OCR failed:', error);
  return null;
}

5. Process Pages Efficiently

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/large-doc.pdf'
  }
});

// Process pages in parallel if needed
const processedPages = await Promise.all(
  response.ocr.pages.map(async (page) => {
    // Process each page
    return processPage(page.markdown);
  })
);

Examples

Invoice Processing

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  metadata: { type: 'invoice' },
  data: {
    documentUrl: 'https://example.com/invoice.pdf',
    tableFormat: 'markdown',
    extractHeader: true
  }
});

// Extract invoice data
const invoiceText = response.ocr.pages[0].markdown;

// Use LLM to structure the data
const structured = await verlon.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract invoice number, date, total, and line items as JSON'
      },
      {
        role: 'user',
        content: invoiceText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

const invoiceData = JSON.parse(structured.content);
console.log('Invoice #:', invoiceData.invoiceNumber);
console.log('Total:', invoiceData.total);

Receipt Scanning

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/receipt.jpg'
  }
});

const receiptText = response.ocr.pages[0].markdown;

// Extract key information
const items = receiptText.match(/\$\d+\.\d{2}/g);
console.log('Prices found:', items);

Document Digitization

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/archive-doc.pdf',
    includeImageBase64: true,
    extractHeader: true,
    extractFooter: true
  }
});

// Save to database
for (const page of response.ocr.pages) {
  await db.pages.create({
    pageNumber: page.index,
    content: page.markdown,
    header: page.header,
    footer: page.footer,
    images: page.images
  });
}

Form Processing

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/application-form.jpg'
  }
});

const formText = response.ocr.pages[0].markdown;

// Extract form fields with LLM
const formData = await verlon.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract form fields: name, email, phone, address as JSON'
      },
      {
        role: 'user',
        content: formText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

Table Extraction for Analysis

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/financial-report.pdf',
    tableFormat: 'html'
  }
});

// Extract all tables
const allTables = response.ocr.pages.flatMap(page =>
  page.tables || []
);

console.log(`Found ${allTables.length} tables`);

// Process tables
allTables.forEach(table => {
  // Parse HTML table and analyze
  parseHTMLTable(table.html);
});

ID Card Scanning

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/drivers-license.jpg'
  }
});

const idText = response.ocr.pages[0].markdown;

// Extract structured data
const idInfo = await verlon.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract: name, license number, date of birth, expiration date as JSON'
      },
      {
        role: 'user',
        content: idText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

Advanced Usage

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/article.pdf'
  }
});

response.ocr.pages.forEach(page => {
  if (page.hyperlinks) {
    console.log('Links found:');
    page.hyperlinks.forEach(link => {
      console.log(`${link.text} -> ${link.url}`);
    });
  }
});

Page Dimensions

const response = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/doc.pdf'
  }
});

response.ocr.pages.forEach(page => {
  if (page.dimensions) {
    console.log(`Page ${page.index + 1}:`);
    console.log(`  Size: ${page.dimensions.width}x${page.dimensions.height}`);
    console.log(`  DPI: ${page.dimensions.dpi}`);
  }
});

Override Model

Use a specific OCR model:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  model: 'textract-analyze',
  data: {
    documentUrl: 'https://example.com/complex-doc.pdf'
  }
});

Custom Metadata

Track OCR processing:
const response = await verlon.ocr({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    documentType: 'invoice',
    department: 'accounting'
  },
  data: {
    documentUrl: 'https://example.com/invoice.pdf'
  }
});

Use Cases

Document Management Systems

Digitize and index documents:
const documents = await getDocuments();

for (const doc of documents) {
  const response = await verlon.ocr({
    gateId: 'your-gate-id',
    data: {
      documentUrl: doc.url,
      tableFormat: 'markdown'
    }
  });

  // Index for search
  await searchIndex.add({
    id: doc.id,
    content: response.ocr.pages.map(p => p.markdown).join('\n')
  });
}

Expense Management

Automate expense report processing:
const receiptOCR = await verlon.ocr({
  gateId: 'your-gate-id',
  data: { imageUrl: receiptUrl }
});

const expense = await verlon.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [{
      role: 'user',
      content: `Extract merchant, date, total, category from: ${receiptOCR.ocr.pages[0].markdown}`
    }],
    responseFormat: { type: 'json_object' }
  }
});
Extract and verify contract terms:
const contractOCR = await verlon.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: contractPdfUrl,
    extractHeader: true,
    extractFooter: true
  }
});

// Search for specific clauses
const hasArbitrationClause = contractOCR.ocr.pages.some(page =>
  page.markdown.toLowerCase().includes('arbitration')
);

Next Steps

Chat

Process OCR output with LLMs

Embeddings

Create searchable document index

Gates & Routing

How Verlon routes OCR requests

Cost Tracking

Monitor OCR costs