If you work with legal documents, contracts, or any collaborative review process, you know how painful it is to manually track every change, comment, and revision in a Word document.
This guide shows you how to extract all that markup programmatically using the Track Changes API. The extracted markup includes:
Tracked changes: insertions and deletions with author names and timestamps
Comments: all margin comments with author details
This lets you pull the full revision history of your Word documents into clean HTML and Markdown.
track_changes is perfect for legal workflows where you need to:
Generate redline summaries for clients
Identify all changes made by specific parties
Extract action items from comments
Analyze negotiation patterns across contract versions
Create audit trails of document revisions
Submit your Word document to the dedicated Track Changes endpoint. The output will be provided in Markdown and HTML format by default, with all tracked changes and comments preserved in the markup.
Here’s how to submit a Word document and extract its tracked changes using the REST API:
```python
import requests
import time
import os

API_URL = "https://www.datalab.to/api/v1/track-changes"
API_KEY = os.getenv("DATALAB_API_KEY")


def extract_tracked_changes(docx_path, output_format='html,markdown'):
    """
    Extract tracked changes and comments from a Word document.

    Args:
        docx_path: Path to the .docx file
        output_format: 'html' or 'markdown' or 'html,markdown'

    Returns:
        Dictionary with the converted content including tracked changes
    """
    with open(docx_path, 'rb') as f:
        form_data = {
            'file': (docx_path, f, 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'),
            'output_format': (None, output_format),
            'paginate': (None, False)  # Set to True if you want page breaks
        }
        headers = {"X-API-Key": API_KEY}
        # Send the request while the file is still open
        response = requests.post(API_URL, files=form_data, headers=headers)
        data = response.json()

    # Poll for completion
    check_url = data["request_check_url"]
    max_polls = 300  # Set longer if needed
    for i in range(max_polls):
        time.sleep(2)
        response = requests.get(check_url, headers=headers)
        result = response.json()
        if result["status"] == "complete":
            return result
        elif result["status"] == "failed":
            raise Exception(f"Conversion failed: {result.get('error')}")

    raise TimeoutError("Conversion did not complete in time")
```
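The polling step is easier to test and reuse when it is separated from the HTTP calls. The sketch below is one possible way to do that, not part of the API itself: `poll_until_complete` and its `fetch_status` callable are hypothetical names, and the `"processing"` status in the simulated responses is an assumption (the guide only relies on `"complete"` and `"failed"`).

```python
import time


def poll_until_complete(fetch_status, max_polls=300, delay=2.0, sleep=time.sleep):
    """Call fetch_status() until it reports 'complete' or 'failed'.

    fetch_status is a hypothetical zero-argument callable returning a dict
    with a 'status' key. In practice it could wrap
    requests.get(check_url, headers=headers).json().
    """
    for _ in range(max_polls):
        result = fetch_status()
        status = result.get("status")
        if status == "complete":
            return result
        if status == "failed":
            raise RuntimeError(f"Conversion failed: {result.get('error')}")
        sleep(delay)  # wait before asking again
    raise TimeoutError(f"Conversion did not complete after {max_polls} polls")


# Simulated responses standing in for the real check URL.
# The "processing" status name is an assumption for illustration.
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "complete", "html": "<p>done</p>"},
])
final = poll_until_complete(lambda: next(responses), delay=0)
print(final["status"])
```

Passing the fetch function in as an argument means the loop can be exercised with canned responses, as above, before pointing it at the real `request_check_url`.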
The response will contain your document with all tracked changes preserved. Here’s what the markup looks like:
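The exact tag and attribute names should be confirmed against a real response before relying on them. As a rough sketch, assuming insertions and deletions come back as `<ins>` and `<del>` elements carrying a hypothetical `data-author` attribute, the changes could be grouped by author with the standard-library HTML parser:

```python
from collections import defaultdict
from html.parser import HTMLParser


class ChangeCollector(HTMLParser):
    """Collect the text inside <ins>/<del> elements, grouped by author.

    ASSUMPTION: tracked changes appear as <ins>/<del> elements with a
    'data-author' attribute. Both the tags and the attribute name are
    hypothetical; adjust them to match the actual API output.
    """

    def __init__(self):
        super().__init__()
        self.by_author = defaultdict(list)
        self._current = None  # (tag, author, text parts) while inside a change

    def handle_starttag(self, tag, attrs):
        if tag in ("ins", "del") and self._current is None:
            author = dict(attrs).get("data-author", "unknown")
            self._current = (tag, author, [])

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            kind, author, parts = self._current
            self.by_author[author].append((kind, "".join(parts).strip()))
            self._current = None


def changes_by_author(html):
    """Return {author: [(kind, text), ...]} for every tracked change found."""
    collector = ChangeCollector()
    collector.feed(html)
    collector.close()
    return dict(collector.by_author)


# Made-up sample markup, used only to exercise the parser
sample = (
    '<p>Term is <del data-author="Alice">12</del>'
    '<ins data-author="Alice">24</ins> months, '
    '<ins data-author="Bob">renewable once</ins>.</p>'
)
grouped = changes_by_author(sample)
for author, changes in grouped.items():
    print(author, changes)
```

A per-author grouping like this is a starting point for identifying all changes made by a specific party without sending the whole document to an LLM.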
Once you have the extracted markup, you can use an LLM to analyze the changes.
Here’s an example using OpenRouter to generate a redline summary:
```python
import requests
import os

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
OPENROUTER_MODEL = os.getenv("OPENROUTER_MODEL")


def analyze_changes_with_llm(marked_up_content, analysis_type='summary'):
    """
    Use an LLM via OpenRouter to analyze tracked changes.

    Args:
        marked_up_content: The HTML or Markdown with tracked changes
        analysis_type: Type of analysis ('summary', 'risks', 'by_author', etc.)

    Returns:
        LLM analysis of the changes
    """
    prompts = {
        'summary': """Analyze this contract with tracked changes and provide:
1. A concise summary of all changes made
2. Key changes that materially affect the agreement
3. Any changes that shift risk or obligations between parties
4. Recommended action items for legal review

Document with tracked changes:
{content}""",
        'by_author': """Review this document with tracked changes and create a report organized by author:
- List each author's changes
- Categorize changes as substantive vs. stylistic
- Highlight any conflicting changes between authors

Document:
{content}""",
        'risks': """Analyze this contract's tracked changes for potential legal risks:
- Identify changes that increase liability or obligations
- Flag any deletions of protective language
- Note additions that could be problematic
- Assess the overall risk profile of the revisions

Document:
{content}"""
    }

    # Fall back to the summary prompt for unknown analysis types
    prompt = prompts.get(analysis_type, prompts['summary']).format(content=marked_up_content)

    response = requests.post(
        url="https://openrouter.ai/api/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {OPENROUTER_API_KEY}",
            "Content-Type": "application/json"
        },
        json={
            "model": OPENROUTER_MODEL,
            "messages": [
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        }
    )
    return response.json()['choices'][0]['message']['content']


# Example usage
result = extract_tracked_changes('nda_draft_v3.docx', output_format='html')
marked_up_doc = result['html']

# Generate different types of analysis
summary = analyze_changes_with_llm(marked_up_doc, 'summary')
risk_analysis = analyze_changes_with_llm(marked_up_doc, 'risks')
by_author = analyze_changes_with_llm(marked_up_doc, 'by_author')

print("Change Summary:")
print(summary)
print("\n" + "="*80 + "\n")
print("Risk Analysis:")
print(risk_analysis)
```
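The example above indexes straight into `response.json()['choices']`, which raises an unhelpful `KeyError` if the request is rejected. A small guard can surface the underlying problem instead. This is a sketch under assumptions: `extract_llm_text` is a hypothetical helper, and it assumes that failed requests return a JSON body with a top-level `error` field, which should be verified against the provider's documentation.

```python
def extract_llm_text(payload):
    """Return the assistant message text from a chat-completions style payload.

    ASSUMPTION: failed requests carry a top-level 'error' field, either a
    dict with a 'message' key or a plain string. Verify this shape against
    the provider's API reference before relying on it.
    """
    if "error" in payload:
        err = payload["error"]
        message = err.get("message", err) if isinstance(err, dict) else err
        raise RuntimeError(f"LLM request failed: {message}")

    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("LLM response contained no choices")

    return choices[0]["message"]["content"]


# Canned payloads used only to demonstrate the helper
ok_payload = {"choices": [{"message": {"role": "assistant", "content": "Two clauses changed."}}]}
print(extract_llm_text(ok_payload))
```

In `analyze_changes_with_llm`, the final line could then become `return extract_llm_text(response.json())`, so that a missing API key or an invalid model name produces a readable error.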