Forge Evals runs multiple parsing configurations against the same set of documents and lets you compare the results. Use it to determine which settings work best for your document types and use cases.

What is Forge Evals?

Forge Evals allows you to:
  • Upload up to 10 documents at once
  • Test up to 5 different parsing configurations simultaneously
  • Compare results side-by-side with visual diff highlighting
  • Identify the optimal parsing settings for your document types
This is particularly useful when you need to:
  • Determine which parsing mode (Fast, Balanced, or Accurate) works best for your documents
  • Evaluate special features like Track Changes or Chart Understanding
  • Compare parsing results across different document types
  • Find the right trade-off between speed and accuracy

Getting started

Access Forge Evals at https://www.datalab.to/app/evals

Step 1: Upload documents

Upload the documents you want to evaluate. You can:
  • Drag and drop files directly into the upload zone
  • Click to browse and select files
  • Upload up to 10 documents per evaluation session
Supported formats: PDF, DOCX, XLSX, PPTX, images, and more. See supported file types for the complete list.
Spreadsheet files (XLS, XLSX, CSV, ODS) are processed automatically without additional configuration options.

Step 2: Select configurations

Choose which parsing configurations to test. You can select from preset configurations or create custom ones.

Preset configurations

  • Fast Mode: Lowest latency, great for real-time use cases
  • Balanced Mode: Balanced accuracy and latency, works well with most documents
  • Accurate Mode: Highest accuracy at the cost of the highest latency, good for complex documents
  • Track Changes: Extract tracked changes (DOCX files only)
  • Chart Understanding: Extract data from charts and graphs

Custom configurations

Create custom configurations to test specific combinations of:
  • Processing mode (Fast, Balanced, or Accurate)
  • Page range selection
  • Special features (Track Changes, Chart Understanding)
  • Output options (pagination, headers, footers)
Track Changes only works with DOCX files. The grid will show “N/A” for incompatible document/configuration combinations.
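The compatibility rule above can be sketched in a few lines. This is an illustration only, not Forge Evals code: the function name `cell_status`, the `"pending"` label, and the configuration labels are assumptions, and the only rule encoded is the DOCX-only restriction on Track Changes stated above.

```python
from pathlib import PurePath

# Assumption: configuration labels mirror the preset names in this guide.
DOCX_ONLY_CONFIGS = {"Track Changes"}


def cell_status(filename: str, config: str) -> str:
    """Return "N/A" for an incompatible pair, otherwise "pending" (hypothetical label)."""
    extension = PurePath(filename).suffix.lower()
    if config in DOCX_ONLY_CONFIGS and extension != ".docx":
        return "N/A"
    return "pending"
```

A check like this can help you predict which grid cells will show “N/A” before you start a session.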

Step 3: Run evaluation

Click “Start Evaluation” to begin processing. The system will:
  1. Process each document with each selected configuration
  2. Display progress in a grid view
  3. Show completion status and processing time for each run
You can:
  • Monitor progress in real-time
  • Cancel all runs if needed
  • Retry failed runs
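Because every document is processed with every selected configuration, the number of runs is the product of the two counts. A minimal sketch of that grid, using hypothetical file names and a hypothetical helper `plan_runs` (not part of any Datalab API):

```python
from itertools import product


def plan_runs(documents, configurations):
    """Pair every document with every configuration, row by row."""
    return list(product(documents, configurations))


# Hypothetical inputs for illustration.
documents = ["invoice.pdf", "contract.docx", "slides.pptx"]
configurations = ["Fast Mode", "Balanced Mode", "Accurate Mode"]

runs = plan_runs(documents, configurations)
print(len(runs))  # 9 runs: 3 documents x 3 configurations
```

At the session maximums of 10 documents and 5 configurations, the grid holds 50 runs.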

Step 4: Compare results

Once runs complete, click any two cells in the grid to compare their results side-by-side. The comparison view shows:
  • Parallel view: Full documents side-by-side with inline diff highlighting
  • Multiple output formats: Switch between Markdown, HTML, and JSON
  • Processing metrics: Duration and configuration details for each run
  • Diff statistics: Lines added, removed, and changed
Use the “Switch Runs” button to select different runs for comparison without leaving the comparison view.
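If you export two outputs and want comparable statistics offline, a line-level diff can be approximated with Python's standard `difflib`. This is a sketch under assumptions: how Forge Evals itself counts added, removed, and changed lines is not documented here, so the numbers may differ from the comparison view. The function name `diff_stats` is hypothetical.

```python
import difflib


def diff_stats(left: str, right: str) -> dict:
    """Count added, removed, and changed lines between two text outputs."""
    a, b = left.splitlines(), right.splitlines()
    stats = {"added": 0, "removed": 0, "changed": 0}
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            stats["added"] += j2 - j1
        elif tag == "delete":
            stats["removed"] += i2 - i1
        elif tag == "replace":
            # Assumption: pair replaced lines one-to-one as "changed";
            # any surplus on either side counts as added or removed.
            old, new = i2 - i1, j2 - j1
            stats["changed"] += min(old, new)
            stats["added"] += max(new - old, 0)
            stats["removed"] += max(old - new, 0)
    return stats
```

Pass the Markdown (or HTML, or JSON) text of two runs as `left` and `right`; identical outputs return zero for all three counts.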

Excluding runs

Right-click any cell in the grid to exclude that specific document/configuration combination from running. This is useful when:
  • You know certain configurations won’t work for specific documents
  • You want to reduce the total number of runs
  • You need to focus on specific comparisons
Excluded cells appear with a yellow background and can be re-included by clicking them again.
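Conceptually, exclusions behave like a set of (document, configuration) pairs that each cell toggles in or out of. The sketch below models that idea; the names `toggle_exclusion` and `runs_to_execute` are hypothetical and do not describe how Forge Evals stores exclusions internally.

```python
def toggle_exclusion(excluded: set, document: str, config: str) -> set:
    """Return a new set with the (document, config) cell flipped in or out."""
    return excluded ^ {(document, config)}


def runs_to_execute(documents, configurations, excluded):
    """List every grid cell that has not been excluded."""
    return [
        (doc, cfg)
        for doc in documents
        for cfg in configurations
        if (doc, cfg) not in excluded
    ]
```

Toggling the same cell twice returns it to the run list, which mirrors re-including an excluded cell.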

Best practices

Choosing configurations

  • Start with the three preset modes (Fast, Balanced, Accurate) to establish a baseline
  • Add Track Changes if you’re working with DOCX files that contain revisions
  • Add Chart Understanding if your documents contain charts or graphs
  • Create custom configurations to test specific parameter combinations

Document selection

  • Include representative samples of your document types
  • Test edge cases (complex layouts, mixed content, etc.)
  • Keep document count manageable (3-5 documents is often sufficient)

Interpreting results

  • Compare processing times to understand speed/accuracy trade-offs
  • Use the diff view to identify where configurations produce different outputs
  • Pay attention to “N/A” cells indicating incompatible combinations
  • Look for patterns across similar document types
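To compare processing times across configurations, it can help to average the durations you note from the grid. The sketch below assumes you have recorded each run as a `(document, config, seconds)` tuple yourself, with `None` for “N/A” cells; that tuple shape and the function name are assumptions, not an export format provided by Forge Evals.

```python
from collections import defaultdict
from statistics import mean


def mean_duration_by_config(runs):
    """Average run duration per configuration, skipping N/A cells.

    `runs` is an iterable of (document, config, seconds) tuples, where
    seconds is None for incompatible ("N/A") combinations.
    """
    durations = defaultdict(list)
    for _document, config, seconds in runs:
        if seconds is not None:
            durations[config].append(seconds)
    return {config: mean(values) for config, values in durations.items()}
```

Read the averages alongside the diff view: a slower configuration is only worth its latency if its output differs in ways that matter for your documents.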

Limitations

  • Maximum 10 documents per evaluation session
  • Maximum 5 run configurations per session
  • Track Changes feature only works with DOCX files
  • Spreadsheet files use automatic configuration (no mode selection)
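If you prepare evaluation sessions from a script or checklist, the two numeric limits above are easy to check in advance. A minimal sketch, with hypothetical names throughout; only the limits of 10 and 5 come from this page:

```python
MAX_DOCUMENTS = 10      # documents per evaluation session
MAX_CONFIGURATIONS = 5  # run configurations per evaluation session


def check_session_limits(documents, configurations):
    """Return a list of limit violations; an empty list means the session fits."""
    problems = []
    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"{len(documents)} documents exceeds the limit of {MAX_DOCUMENTS}"
        )
    if len(configurations) > MAX_CONFIGURATIONS:
        problems.append(
            f"{len(configurations)} configurations exceeds the limit of {MAX_CONFIGURATIONS}"
        )
    return problems
```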

Custom evaluations

For larger document sets or custom evaluation needs, contact us to discuss enterprise evaluation options.