Skip to the repo and start your own migration: https://github.com/patspace/confluence-to-azure-devops-wiki
Background
When I first started at 27Global in 2015, we used a self-hosted tool, Redmine, to manage our SDLC, tickets, documentation, and testing. A year later, in 2016, we moved to the Atlassian suite as we embraced Agile development. JIRA was a major step up from Redmine, helping us move from waterfall development to sprint and kanban boards. Confluence was quickly adopted as a flexible wiki we could reference, both for projects and for internal documentation.
In 2024, after several years working in different organizations' Azure DevOps tenants, we decided to reduce our tool costs by moving our JIRA projects to ADO. Since we hold multiple service provider designations with Microsoft, our license costs are covered as part of our benefits. Justin Ford did a fantastic job auditing our projects, exporting them from JIRA, and importing them to ADO. While we still work with JIRA if it is a client instance, this consolidation saved the company money ($18k/year!) and better enabled standardization and reporting across the company. The JIRA to ADO migration was pretty straightforward - export work items, set up ADO project, import work items.
The next consolidation to tackle was Confluence to the Azure DevOps wiki. This was not as straightforward - we quickly realized that existing migration tools weren't meeting our specific needs. Open source utilities required a lot of manual editing, and paid tools cost more than we wanted to spend. What started as a simple data transfer project evolved into building a comprehensive, hierarchical migration tool with a web interface. Here's the story of how we developed our tool - a robust migration solution that preserves content structure, handles attachments, and provides a user-friendly experience. We used Claude Code in VS Code to create the tool, and Claude is a co-author on the repo.
The Challenge: More Than Just Moving Content
Copying page text was the easy part. The migration also had to:
- Preserve hierarchical page structure: Our Confluence space had deeply nested pages that needed to maintain their parent-child relationships
- Handle Azure DevOps Wiki naming conventions: Spaces must be replaced with hyphens, special characters sanitized
- Migrate all attachments and images: Including proper path resolution and deduplication
- Batch processing for large datasets: Our space contained hundreds of pages with numerous attachments
- User-friendly interface: Non-technical users needed to be able to run migrations
Key Features We Built
1. Hierarchical Structure Preservation
The core challenge was maintaining the logical organization of our content. Our final solution (`confluence_migration_corrected.py:113-212`) builds a complete wiki structure that respects Azure DevOps Wiki's folder-based hierarchy.
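To illustrate the idea, here is a minimal sketch of turning parent-child page relationships into folder-style paths. It is not the code from the repo: the `Page` record and `build_wiki_paths` function are hypothetical names, and it assumes a child page lives in a folder named after its parent, with spaces in titles already replaced by hyphens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical stand-in for a Confluence page record."""
    page_id: str
    title: str
    parent_id: Optional[str] = None


def build_wiki_paths(pages: list[Page]) -> dict[str, str]:
    """Map each page id to a folder-style wiki path ending in .md."""
    by_id = {page.page_id: page for page in pages}

    def path_for(page: Page) -> str:
        segments = []
        seen = set()  # guards against accidental parent cycles
        current: Optional[Page] = page
        while current is not None and current.page_id not in seen:
            seen.add(current.page_id)
            segments.append(current.title.replace(" ", "-"))
            # A missing or unknown parent is treated as the wiki root.
            current = by_id.get(current.parent_id) if current.parent_id else None
        return "/".join(reversed(segments)) + ".md"

    return {page.page_id: path_for(page) for page in pages}
```

With this layout a page titled "Setup Steps" under "Dev Guide" under "Home" ends up at `Home/Dev-Guide/Setup-Steps.md`, next to its parent's `Home/Dev-Guide.md`.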
2. Azure DevOps Wiki Compliance
One of our biggest early challenges was understanding Azure DevOps Wiki's strict naming requirements. After several failed migrations, we implemented proper filename sanitization (`confluence_migration_corrected.py:44-62`).
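A rough sketch of what this kind of sanitization can look like is below. The function name and the exact set of characters removed are assumptions for illustration, not the rules from the repo; check the Azure DevOps Wiki documentation for the authoritative list of restricted characters.

```python
import re

# Assumed set of characters to drop; not an authoritative list.
_UNSAFE_CHARS = re.compile(r'[\\/:*?"<>|#%]')


def sanitize_page_name(title: str) -> str:
    """Turn a Confluence page title into a wiki-friendly file name stem."""
    cleaned = _UNSAFE_CHARS.sub("", title).strip()
    cleaned = re.sub(r"\s+", "-", cleaned)    # spaces become hyphens
    cleaned = re.sub(r"-{2,}", "-", cleaned)  # collapse runs of hyphens
    # Fall back to a placeholder if nothing usable is left.
    return cleaned.strip("-.") or "untitled"
```

For example, `sanitize_page_name("Release Notes: Q3/Q4")` returns `Release-Notes-Q3Q4`.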
3. Smart Batch Processing
Large migrations were hitting Azure DevOps' 25MB commit limits. We developed a sophisticated size-based batching system (`confluence_migration_corrected.py:654-815`) that:
- Calculates content sizes before committing
- Groups files into optimally-sized batches
- Handles large individual files appropriately
- Provides detailed progress tracking
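The batching steps above can be sketched as a simple greedy grouping. This is an illustrative simplification, not the repo's implementation: `batch_by_size` is a hypothetical name, files are assumed to be held in memory as bytes, and a real version would probably leave headroom below the limit for request encoding overhead.

```python
MAX_COMMIT_BYTES = 25 * 1024 * 1024  # the 25MB limit mentioned above


def batch_by_size(files: dict[str, bytes],
                  limit: int = MAX_COMMIT_BYTES) -> list[list[str]]:
    """Group file paths into batches whose total size stays within `limit`."""
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0

    for path, content in files.items():
        size = len(content)
        if size > limit:
            # An oversized file gets its own batch so the caller can
            # decide how to handle it (skip, compress, upload separately).
            batches.append([path])
            continue
        if current and current_size + size > limit:
            # The running batch is full; start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size

    if current:
        batches.append(current)
    return batches
```

Each returned batch would then become one commit, which also gives a natural unit for progress reporting (batch 3 of 12, and so on).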
4. Web Interface for Accessibility
Recognizing that migration tools need to be accessible to non-developers, we built a Flask-based web UI (`web_ui.py`) that provides:
- Connection validation before migration
- Real-time migration progress tracking
- Error handling and detailed logging
Development Challenges and Solutions
Challenge 1: Content Truncation Issues
Problem: Early versions were accidentally truncating content when processing Confluence's complex image markup.
Solution: We refined our regex patterns to use non-greedy matching and properly handle nested XML structures (`confluence_migration_corrected.py:312-331`).
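To show why non-greedy matching matters, here is a small sketch that converts Confluence `<ac:image>` blocks to Markdown image links. The patterns, the `convert_images` name, and the `/.attachments` target folder are assumptions for illustration rather than the repo's actual code. A greedy `.*` between the opening and closing tags would swallow everything from the first image to the last one on the page, which is one way content can get truncated.

```python
import re

# Non-greedy (.*?) stops at the first closing tag, so each image block
# is matched on its own instead of spanning several images.
IMAGE_BLOCK = re.compile(r"<ac:image\b[^>]*>(.*?)</ac:image>", re.DOTALL)
FILENAME = re.compile(r'ri:filename="([^"]+)"')


def convert_images(storage_html: str, attachment_dir: str = "/.attachments") -> str:
    """Replace Confluence attachment images with Markdown image links."""

    def replace(match: re.Match) -> str:
        found = FILENAME.search(match.group(1))
        if found is None:
            # Not an attachment image (for example a URL image): leave it alone.
            return match.group(0)
        name = found.group(1)
        return f"![{name}]({attachment_dir}/{name})"

    return IMAGE_BLOCK.sub(replace, storage_html)
```

Splitting the work into "find the block" and "find the filename inside the block" keeps each pattern simple and leaves the text between images untouched.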
Challenge 2: Azure DevOps API Limits
Problem: The Azure DevOps Git API has strict limits on commit size (25MB) and timeout constraints.
Solution: Implemented intelligent chunking that pre-calculates sizes and distributes content across multiple commits while keeping each individual commit atomic.
Challenge 3: Image Deduplication and Path Management
Problem: Multiple pages referencing the same images created duplication and broken links.
Solution: Built a comprehensive image tracking system that maps local files to Azure paths and prevents duplicate uploads (`confluence_migration_corrected.py:434-460`).
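One way to sketch such a tracker is to key uploads by a hash of the file content. The `ImageTracker` class below is hypothetical and assumes images are available as bytes and land in a `/.attachments` folder; the repo may track images differently.

```python
import hashlib
from pathlib import PurePosixPath


class ImageTracker:
    """Remember which image contents already have a wiki path."""

    def __init__(self, attachment_dir: str = "/.attachments") -> None:
        self.attachment_dir = attachment_dir
        self._path_by_hash: dict[str, str] = {}
        self._used_paths: set[str] = set()

    def register(self, filename: str, content: bytes) -> tuple[str, bool]:
        """Return (wiki_path, needs_upload) for an image."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._path_by_hash:
            # Same bytes seen before: reuse the existing path, skip the upload.
            return self._path_by_hash[digest], False

        candidate = f"{self.attachment_dir}/{filename}"
        if candidate in self._used_paths:
            # Same name but different content: add a short hash suffix.
            name = PurePosixPath(filename)
            candidate = f"{self.attachment_dir}/{name.stem}-{digest[:8]}{name.suffix}"

        self._path_by_hash[digest] = candidate
        self._used_paths.add(candidate)
        return candidate, True
```

Every page that references the same image then links to one shared path, and two different images that happen to share a filename no longer overwrite each other.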
Challenge 4: Error Recovery and Debugging
Problem: Failed migrations were difficult to debug, especially when processing hundreds of pages.
Solution: Added extensive logging, progress tracking, and utility functions (`migration_utilities.py`) for validation and single-page testing.
Architecture Decisions
Modular Design: We separated concerns into distinct modules:
- `confluence_migration_corrected.py` - Core migration logic
- `migration_utilities.py` - Validation and helper functions
- `web_ui.py` - User interface and API endpoints
Configuration Management: Used dictionary-based configuration to make the tool flexible across different environments.
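As a sketch of what dictionary-based configuration can look like, the example below builds a nested dictionary from environment variables. Every key and variable name here is hypothetical and chosen for illustration; see the repo for the settings the tool actually expects.

```python
import os
from typing import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, dict[str, str]]:
    """Build a nested config dict from environment variables (names are assumed)."""
    return {
        "confluence": {
            "base_url": env.get("CONFLUENCE_BASE_URL", ""),
            "space_key": env.get("CONFLUENCE_SPACE_KEY", ""),
            "api_token": env.get("CONFLUENCE_API_TOKEN", ""),
        },
        "azure_devops": {
            "organization": env.get("AZURE_DEVOPS_ORG", ""),
            "project": env.get("AZURE_DEVOPS_PROJECT", ""),
            "wiki": env.get("AZURE_DEVOPS_WIKI", ""),
            "pat": env.get("AZURE_DEVOPS_PAT", ""),
        },
    }


def missing_settings(config: dict[str, dict[str, str]]) -> list[str]:
    """List dotted names of settings that are empty, for an early sanity check."""
    return [
        f"{section}.{key}"
        for section, values in config.items()
        for key, value in values.items()
        if not value
    ]
```

Keeping tokens in the environment rather than in the dictionary literal avoids committing secrets, and a check like `missing_settings` can run before any API call is made.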
Robust Error Handling: Implemented comprehensive exception handling at every API interaction point, with graceful degradation and detailed error reporting.
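A minimal sketch of the graceful-degradation idea: wrap each page in its own try/except so one failure is logged and recorded without aborting the run. The `migrate_all` function and the `migrate_page` callback are hypothetical names, not the repo's API.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("migration")


def migrate_all(page_ids: Iterable[str],
                migrate_page: Callable[[str], None]) -> dict[str, list]:
    """Run migrate_page for every id and collect successes and failures."""
    report: dict[str, list] = {"succeeded": [], "failed": []}
    for page_id in page_ids:
        try:
            migrate_page(page_id)
        except Exception as exc:  # broad on purpose: keep the run going
            logger.exception("Migration failed for page %s", page_id)
            report["failed"].append((page_id, str(exc)))
        else:
            report["succeeded"].append(page_id)
    return report
```

The returned report gives a list of failed pages that can be retried individually once the underlying problem is fixed.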
Lessons Learned
1. API Documentation vs Reality: Azure DevOps and Confluence documentation sometimes didn't match the actual behavior. We spent considerable time understanding undocumented requirements (like the space-to-hyphen naming convention).
2. Incremental Development is Key: Rather than building a monolithic solution, our modular approach allowed us to test and refine individual components. The utility module was particularly valuable for debugging specific issues.
3. User Experience Matters: Technical tools are only useful if people can actually use them. The web interface transformed this from a developer-only tool into something our content managers could operate independently.
4. Plan for Scale: Our early versions worked fine for small tests but failed on production data. Building size-aware batching from the start would have saved significant debugging time.
Results and Impact
The final tool successfully:
- Migrated 40+ spaces with complex hierarchical relationships
- Processed numerous images and attachments without loss
- Maintained content fidelity through HTML to Markdown conversion
- Provided a reliable, repeatable migration process
- Enabled non-technical users to perform migrations independently
Conclusion
Building this tool taught us that successful migration tools require more than just moving data - they need to understand the nuances of both source and target systems, handle edge cases gracefully, and provide a great user experience. The iterative development approach, extensive testing with real data, and focus on modularity all contributed to creating a tool that not only solved our immediate problem but could be adapted for similar migrations. The key to our success was treating this not as a one-off script, but as a proper software project with proper architecture, error handling, and user interface design.
The process also taught us that Claude Code today is more than capable of solving internal projects with a competent captain at the helm. We've also built slackbots, integrations, functions, scripts, prototypes, and proof-of-concepts for many different internal use cases. The AI tools continue to leapfrog each other on a regular basis, and we encourage readers to try multiple options, figure out what works, and be prepared to pivot to a better option when it appears.
---
The complete source code for this migration tool is available in our repository here: https://github.com/patspace/confluence-to-azure-devops-wiki. Whether you're facing a similar migration challenge or just interested in API integration patterns, we hope our journey helps inform your approach.