How to Migrate from SVN to Git
Diagram: migrating a repository from SVN to Git (preserve history, convert branches and tags, map authors, resolve conflicts, test locally, push to remote, update CI and documentation).
Understanding the Critical Shift in Version Control
The transition from Subversion (SVN) to Git represents more than just switching tools—it fundamentally changes how development teams collaborate, manage code, and maintain project history. Organizations worldwide are making this migration not because SVN has become obsolete, but because Git's distributed architecture offers flexibility, speed, and workflows that align better with modern development practices. Whether you're part of a small team or managing enterprise-level repositories, understanding this migration process can save countless hours and prevent data loss that could set projects back significantly.
Version control migration isn't simply about moving files from one system to another. SVN operates on a centralized model where a single repository serves as the source of truth, while Git embraces a distributed approach where every developer maintains a complete copy of the repository history. This philosophical difference affects everything from branching strategies to collaboration patterns, making the migration both a technical challenge and an organizational transformation that requires careful planning and execution.
Throughout this comprehensive guide, you'll discover the preparation steps needed before migration, multiple approaches to transferring your repository with complete history preservation, strategies for handling complex repository structures, and post-migration best practices. You'll also learn how to address common challenges like preserving author information, managing large binary files, and restructuring repository layouts to leverage Git's strengths. By the end, you'll have actionable knowledge to execute a successful migration that maintains data integrity while positioning your team for improved productivity.
Preparing Your Environment for Migration
Before initiating any migration process, establishing the right environment ensures smooth execution and minimizes potential complications. The preparation phase involves installing necessary tools, analyzing your existing SVN repository structure, and making critical decisions about what to migrate and how to organize it in Git.
Essential Tools and Software Requirements
The migration process requires specific tools that bridge SVN and Git systems. At minimum, you'll need Git installed on your system, along with git-svn, a bidirectional bridge between Subversion and Git repositories. Many Git installations bundle git-svn, but some Linux distributions ship it as a separate package, so verifying its availability prevents mid-migration surprises.
- 🔧 Git (version 2.0 or higher recommended for optimal compatibility)
- 🔧 git-svn (enables Git to interact with SVN repositories)
- 🔧 SVN client (for accessing and analyzing your source repository)
- 🔧 Text editor (for creating author mapping files)
- 🔧 Sufficient disk space (at least 3x your SVN repository size)
Beyond basic tools, consider migration utilities like svn2git or SubGit for more complex scenarios. These specialized tools handle edge cases better than manual approaches and often include features for dealing with large repositories, complex branching structures, or repositories with inconsistent histories.
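The tool checklist above can be verified with a short script before you start. This is a minimal sketch: the helper name have_tool is made up for illustration, and the package name in the last message is only an example of how some distributions ship git-svn.

```shell
#!/bin/sh
# Return 0 if the named program is on PATH, 1 otherwise.
# (have_tool is a hypothetical helper, not part of Git or SVN.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each standalone tool the migration relies on.
for tool in git svn; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# git-svn is a Git subcommand rather than a standalone program,
# so probe it by asking for its version.
if have_tool git && git svn --version >/dev/null 2>&1; then
    echo "found:   git-svn"
else
    echo "missing: git-svn (some distributions package it separately)"
fi
```

Running this on the migration machine before the first clone avoids discovering a missing git-svn hours into the process.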
Analyzing Your SVN Repository Structure
Understanding your existing repository architecture determines which migration approach works best. SVN repositories typically follow standard layouts with trunk, branches, and tags directories, but many organizations have customized structures that require special handling during migration.
"The biggest migration mistakes happen when teams assume their repository follows standard conventions without actually verifying the structure first."
Begin by examining your repository's top-level structure. Use SVN commands to list directories and understand how your team has organized code over time. Pay particular attention to non-standard layouts, nested projects within a single repository, or directories that don't fit the typical trunk/branches/tags pattern.
Document any irregularities you discover, including abandoned branches, experimental directories, or vendor code stored directly in the repository. These elements require decisions about whether to migrate them, archive them separately, or exclude them entirely from the Git repository.
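A quick way to confirm whether a repository follows the standard layout is to inspect the output of svn ls on its root. The sketch below works on canned listings so it runs without a server; classify_layout is a hypothetical helper, and in practice you would pipe in the real listing from your own repository URL.

```shell
#!/bin/sh
# Classify a top-level listing (as printed by `svn ls <repo-url>`)
# read from standard input. Prints "standard" when trunk/, branches/,
# and tags/ are all present, otherwise "custom".
# (classify_layout is a hypothetical helper for illustration.)
classify_layout() {
    listing=$(cat)
    for dir in trunk branches tags; do
        if ! printf '%s\n' "$listing" | grep -q -x "$dir/"; then
            echo "custom"
            return 0
        fi
    done
    echo "standard"
}

# Canned examples; a real check would look like:
#   svn ls https://svn.example.com/repository/path | classify_layout
printf 'branches/\ntags/\ntrunk/\n' | classify_layout
printf 'main/\nreleases/\nvendor/\n' | classify_layout
```

A "custom" result is the cue to record the actual directory names, since they are needed later when telling git-svn where trunk, branches, and tags live.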
Creating an Author Mapping File
SVN and Git handle author information differently. SVN typically stores simple usernames, while Git requires full name and email address combinations. Creating an accurate author mapping file ensures proper attribution in your Git history and maintains accountability for code changes.
Extract all unique authors from your SVN repository using this command:
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <email@example.com>"}' | sort -u > authors-transform.txt
Run from an SVN working copy, this generates a file with one placeholder line per SVN username. Edit the file to map each username to a proper Git identity. For example, change jsmith = jsmith <email@example.com> to jsmith = John Smith <john.smith@company.com>. Accuracy matters here: incorrect mappings become permanent parts of your Git history and can only be fixed later by rewriting it.
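Because mistakes in the mapping are hard to undo, it can help to check the edited file mechanically before cloning. The following is a sketch only: check_authors_file is a hypothetical helper, the email pattern is deliberately loose, and the sample entries are invented.

```shell
#!/bin/sh
# Print every line of an authors file that is not of the form
#   svnuser = Full Name <user@host>
# or that still carries the placeholder address. Returns non-zero
# when any such line is found.
# (check_authors_file is a hypothetical helper for illustration.)
check_authors_file() {
    awk '
        BEGIN { bad = 0 }
        !/^[^=]+ = [^<>]+ <[^<>@ ]+@[^<>@ ]+>$/ { print "malformed:   " $0; bad = 1; next }
        /<email@example\.com>$/                 { print "placeholder: " $0; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Invented sample: one finished entry, one untouched placeholder,
# and one line that lost its mapping entirely.
authors=$(mktemp)
cat > "$authors" <<'EOF'
jsmith = John Smith <john.smith@company.com>
adoe = adoe <email@example.com>
bbrown
EOF

check_authors_file "$authors" || echo "authors file needs more editing"
```

With the sample above, the second and third lines are reported and the final message is printed; a fully edited file produces no output.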
Executing the Basic Migration Process
With preparation complete, the actual migration involves cloning your SVN repository into Git format while preserving complete history. The git-svn tool provides the primary mechanism for this conversion, offering options to handle different repository layouts and migration scenarios.
Standard Layout Migration
For repositories following SVN's conventional trunk/branches/tags structure, the migration process is relatively straightforward. The git-svn clone command with standard layout flags handles the conversion automatically:
git svn clone --stdlayout --authors-file=authors-transform.txt https://svn.example.com/repository/path local-git-repo
This command initiates a complete repository clone, converting each SVN revision into a Git commit. The --stdlayout flag tells git-svn to expect trunk, branches, and tags directories directly under the URL you pass. The process can take considerable time for repositories with extensive histories: hours, or even days for codebases with hundreds of thousands of commits.
During execution, git-svn processes revisions sequentially, maintaining chronological order and preserving the complete commit history. You'll see progress indicators showing revision numbers as they're processed. Avoid interrupting the process if you can. If it does stop partway, you can usually resume by running git svn fetch inside the partially created repository instead of starting over.
Custom Layout Migration
Repositories with non-standard structures require explicit path specifications. Instead of --stdlayout, use individual flags to identify where trunk, branches, and tags exist:
git svn clone --trunk=main --branches=development --tags=releases --authors-file=authors-transform.txt https://svn.example.com/repository/path local-git-repo
This approach accommodates organizations that have customized their repository structure or maintain multiple projects within a single SVN repository. Adjust the path specifications to match your actual directory names and locations.
"Custom repository layouts often hide complexity that only becomes apparent during migration—allocate extra time for troubleshooting."
Handling Large Repositories
Repositories with extensive histories or large binary files present special challenges. The migration process becomes memory-intensive and time-consuming, sometimes exceeding practical limits for single-pass conversion.
For particularly large repositories, consider these strategies:
- 📦 Partial history migration using --revision flags to limit how far back conversion goes
- 📦 Repository splitting to separate independent projects into distinct Git repositories
- 📦 Incremental migration converting in stages rather than all at once
- 📦 Binary file extraction moving large assets to Git LFS or separate storage
- 📦 Server-side conversion using more powerful machines for the migration process
Each approach involves tradeoffs between completeness and practicality. Partial history migration sacrifices older commits for faster conversion. Repository splitting creates cleaner boundaries but requires coordinating multiple repositories. Incremental migration adds complexity but makes progress more manageable.
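As an illustration of the partial-history option, git-svn accepts a revision range such as 5000:HEAD through its -r (--revision) flag. The sketch below only prints the command so it can be reviewed first; the URL, the starting revision 5000, and the helper name partial_clone_cmd are all placeholders.

```shell
#!/bin/sh
# Placeholder values; substitute your own repository details.
SVN_URL="https://svn.example.com/repository/path"
START_REV=5000
TARGET_DIR="local-git-repo"

# Build a git-svn clone command that begins at a given SVN revision
# instead of r1. (partial_clone_cmd is a hypothetical helper.)
partial_clone_cmd() {
    printf 'git svn clone --stdlayout --authors-file=authors-transform.txt -r %s:HEAD %s %s\n' \
        "$1" "$2" "$3"
}

# Print the command for review rather than running it immediately.
partial_clone_cmd "$START_REV" "$SVN_URL" "$TARGET_DIR"
```

Everything before the starting revision is left out of the Git repository, so keep the SVN server (or a dump of it) available for anyone who needs the older history.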
Converting SVN Branches and Tags
While git-svn converts SVN branches and tags during cloning, they don't automatically become proper Git branches and tags. SVN branches become Git remote branches, and SVN tags become remote branches as well—neither matches Git's native structure for these elements.
Transforming Remote Branches to Local Branches
After migration completes, your repository contains references to SVN branches as remote tracking branches. Converting these to proper local Git branches requires explicit commands:
git for-each-ref --format='%(refname:short)' refs/remotes/origin | grep -v -e '^origin/trunk$' -e '^origin/tags/' | sed 's|^origin/||' | while read -r branch; do git branch "$branch" "refs/remotes/origin/$branch"; done
This loop iterates through the remote branches, skipping trunk (which becomes master or main) and the SVN tags (which need different handling), and creates a corresponding local branch for each. The resulting structure matches typical Git repository organization, making branches immediately usable for continued development.
Creating Proper Git Tags
SVN tags become remote branches during migration, but Git treats tags as immutable references to specific commits. Converting SVN tags to proper Git tags preserves their semantic meaning:
git for-each-ref --format='%(refname:short)' refs/remotes/origin/tags | sed 's|^origin/tags/||' | while read -r tag; do git tag "$tag" "refs/remotes/origin/tags/$tag"; git branch -r -d "origin/tags/$tag"; done
The --format option makes for-each-ref print only the ref name, so the loop receives clean tag names. This process creates lightweight Git tags pointing to the same commits as the SVN tag branches, then removes the unnecessary remote branch references. The result is a clean tag structure that follows Git conventions.
"Properly converting tags prevents confusion later when team members expect tags to behave as immutable release markers rather than branches."
Cleaning Up Remote References
After converting branches and tags, remote references to the SVN repository remain in your Git configuration. These references serve no purpose in a pure Git workflow and should be removed:
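git config --remove-section svn-remote.svn
rm -rf .git/svn
These commands drop the git-svn configuration and its cached metadata. git svn clone does not register a regular Git remote, so there is usually no origin to remove; any leftover remote-tracking refs under refs/remotes/origin (such as trunk) can be deleted with git branch -r -d. This completes the transformation from a git-svn hybrid repository to a standard Git repository. You can then add your new Git remote (GitHub, GitLab, Bitbucket, or self-hosted) as the origin.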
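Publishing the converted repository to its new home can be sketched as follows. To keep the example runnable anywhere, a local bare repository stands in for the hosting service and a throwaway repository stands in for the migrated one; both are assumptions, and in practice the remote URL would come from GitHub, GitLab, Bitbucket, or your own server.

```shell
#!/bin/sh
# Stand-ins so the sketch runs anywhere: a throwaway "migrated"
# repository and a local bare repository playing the new Git server.
work=$(mktemp -d)
server="$work/server.git"
git init -q --bare "$server"
git init -q "$work/local-git-repo"
cd "$work/local-git-repo" || exit 1
git -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git tag v1.0

# The actual hand-off: register the new remote, then publish every
# local branch and every tag.
git remote add origin "$server"   # in practice, your hosting URL
git push -q origin --all
git push -q origin --tags
```

Branches and tags are pushed in two steps because a plain git push only sends the current branch.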
Optimizing Repository Structure Post-Migration
Successful migration creates a functional Git repository, but optimization ensures it performs well and follows Git best practices. This phase involves restructuring content, handling large files appropriately, and establishing workflows that leverage Git's strengths.
Restructuring Directory Layouts
SVN's centralized model sometimes encourages directory structures that don't translate well to Git's distributed nature. Common issues include monolithic repositories containing multiple independent projects, deeply nested directory hierarchies, or organizational structures based on team divisions rather than logical code boundaries.
| SVN Pattern | Git Best Practice | Reason for Change |
|---|---|---|
| Single repository for all projects | Separate repositories per project | Enables independent versioning and cleaner dependency management |
| Team-based directory structure | Feature-based organization | Improves code discovery and reduces merge conflicts |
| Deep hierarchical nesting | Flatter directory structure | Simplifies navigation and reduces path length issues |
| Mixed source and binary files | Separate binary asset storage | Keeps repository size manageable and improves clone speed |
Restructuring requires careful planning because it changes how developers navigate code. Document the new structure clearly and communicate changes to all team members before pushing the reorganized repository to shared remotes.
Managing Large Binary Files
Git handles text files exceptionally well but struggles with large binary files. SVN repositories often accumulate binary assets like images, compiled libraries, or design files that bloat Git repositories and slow operations.
Git Large File Storage (LFS) provides the solution by storing large files externally while maintaining references in the repository. Implementing LFS after migration involves identifying large files, installing LFS support, and converting existing files:
git lfs install
git lfs track "*.psd"
git lfs track "*.zip"
git add .gitattributes
For files already committed in history, use git lfs migrate to rewrite history and convert large files to LFS pointers:
git lfs migrate import --include="*.psd,*.zip" --everything
"Addressing large files immediately after migration prevents performance problems that only worsen as the repository grows."
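Identifying which files are large enough to move to LFS can be done with Git's own plumbing commands. This is a sketch rather than a definitive tool: list_large_blobs is a hypothetical helper, and the right size threshold depends on your repository.

```shell
#!/bin/sh
# List blobs anywhere in history that are larger than a byte
# threshold, as "size<TAB>path" lines sorted largest first.
# Run from inside a Git repository.
# (list_large_blobs is a hypothetical helper for illustration.)
list_large_blobs() {
    threshold=$1
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$threshold" '$1 == "blob" && $2 > min {
            size = $2
            sub(/^[^ ]+ [^ ]+ /, "")   # keep only the path, spaces included
            printf "%d\t%s\n", size, $0
        }' |
        sort -rn
}
```

Calling list_large_blobs 5242880 from the repository root lists every file version over 5 MB, which gives a concrete starting point for the --include patterns passed to git lfs migrate.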
Establishing Git Workflows
SVN's centralized workflow doesn't translate directly to Git's distributed model. Teams need new collaboration patterns that leverage Git's branching capabilities while maintaining code quality and release stability.
Popular Git workflows include:
- 🌿 Git Flow with dedicated branches for features, releases, and hotfixes
- 🌿 GitHub Flow emphasizing continuous deployment from short-lived feature branches
- 🌿 GitLab Flow combining feature branches with environment branches
- 🌿 Trunk-Based Development focusing on small, frequent commits to main branch
- 🌿 Forking Workflow where contributors work in personal forks before submitting changes
Selecting an appropriate workflow depends on team size, release cadence, and deployment practices. Smaller teams often succeed with simpler approaches like GitHub Flow, while larger organizations benefit from the structure provided by Git Flow or GitLab Flow.
Addressing Complex Migration Scenarios
Not all migrations follow straightforward paths. Complex repositories with unusual structures, extensive histories, or special requirements demand advanced techniques and careful problem-solving.
Splitting Monolithic Repositories
Organizations frequently store multiple projects in a single SVN repository for administrative convenience. Migrating to Git presents an opportunity to separate these projects into independent repositories with distinct histories.
The git filter-branch command enables extracting subdirectories while preserving relevant history:
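git filter-branch --subdirectory-filter path/to/project -- --all
This command rewrites repository history to include only commits affecting the specified subdirectory, effectively creating a new repository containing just that project's history. Repeat this process for each project you want to extract, working from fresh clones of the migrated repository each time. Note that Git's own documentation now recommends the separate git filter-repo tool over filter-branch for history rewrites; filter-branch still works but can be very slow on large histories.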
Preserving SVN Revision Numbers
Some organizations rely on SVN revision numbers for tracking, compliance, or integration with external systems. While Git uses commit hashes instead of sequential numbers, you can preserve SVN revision information in commit messages.
The git-svn clone process automatically includes git-svn-id metadata in commit messages, maintaining the connection to original SVN revisions. This metadata looks like:
git-svn-id: https://svn.example.com/repository/trunk@12345 uuid-here
For cleaner commit messages, you can strip this metadata after migration, but doing so permanently removes the connection to SVN revision numbers. Weigh the value of clean history against the utility of preserved revision references before making this decision.
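If external systems still refer to SVN revision numbers, the number can be recovered from the git-svn-id line of any migrated commit. A minimal sketch, using the sample trailer from above; svn_rev_from_message is a hypothetical helper.

```shell
#!/bin/sh
# Read commit message text on standard input and print the SVN
# revision number recorded in its git-svn-id line, if there is one.
# (svn_rev_from_message is a hypothetical helper for illustration.)
svn_rev_from_message() {
    sed -n 's/^ *git-svn-id: [^ ]*@\([0-9][0-9]*\) .*$/\1/p'
}

# Example using the trailer format shown above.
printf 'Fix login bug\n\ngit-svn-id: https://svn.example.com/repository/trunk@12345 uuid-here\n' |
    svn_rev_from_message   # prints 12345
```

In a migrated repository, the same function can be fed from git log -1 --format=%B followed by a commit hash.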
"Maintaining SVN revision references in commit messages provides a safety net during the transition period when teams are still learning Git."
Handling SVN Externals
SVN externals allow repositories to reference code from other repositories, similar to Git submodules but with different behavior. Migrating repositories with externals requires deciding how to handle these dependencies in Git.
Options for converting SVN externals include:
| Approach | Best For | Considerations |
|---|---|---|
| Git Submodules | External code that updates independently | Requires explicit update commands; can confuse new Git users |
| Git Subtree | External code you want to modify locally | Merges external code into repository; increases repository size |
| Package Managers | Third-party libraries and frameworks | Most modern approach; requires build system integration |
| Direct Inclusion | Small, stable external code | Simplest but loses connection to source; complicates updates |
Modern development increasingly favors package managers over repository-based dependency inclusion. Consider whether migration presents an opportunity to modernize dependency management practices rather than simply translating SVN externals to Git equivalents.
Validating Migration Success
Completing the technical migration process doesn't guarantee success. Thorough validation ensures the new Git repository accurately represents the SVN history and functions correctly for continued development.
Comparing Repository Contents
Start validation by comparing the final state of the Git repository with the SVN trunk. Check out both repositories at their latest revisions and use diff tools to verify identical content:
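diff -r -x .svn -x .git svn-checkout git-checkout
The -x options skip each system's own metadata directory, which would otherwise show up as noise. Any remaining differences require investigation. Legitimate differences include Git-specific files like .gitignore or .gitattributes and empty directories, which Git does not track, but unexpected discrepancies indicate migration problems that need resolution before proceeding.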
Verifying Commit History
Beyond content comparison, validate that the commit history migrated correctly. Check that commit counts match expectations, author information appears correctly, and commit messages preserved their original content.
Use Git's log command to examine history details:
git log --all --oneline --graph --decorate
This visualization helps identify issues like missing branches, incorrect tag placement, or broken history connections. Pay special attention to merge commits, ensuring they maintained proper parent relationships during conversion.
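The counts and author checks described in this section can be gathered in one pass. The sketch below assumes it is run inside the migrated repository; history_summary is a hypothetical helper, and the commit count may legitimately differ from the highest SVN revision number.

```shell
#!/bin/sh
# Summarize a repository so the numbers can be compared against what
# you expect from SVN. Run from inside the repository.
# (history_summary is a hypothetical helper for illustration.)
history_summary() {
    echo "commits: $(git rev-list --all --count)"
    echo "branches: $(git for-each-ref refs/heads | wc -l | tr -d ' ')"
    echo "tags: $(git for-each-ref refs/tags | wc -l | tr -d ' ')"
    echo "authors:"
    git log --all --format='  %an <%ae>' | sort -u
}
```

Any author line that still shows a bare SVN username or a placeholder address points back to a gap in the author mapping file.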
"History validation catches subtle problems that might not affect immediate functionality but cause confusion or data loss later."
Testing Repository Operations
Perform common Git operations to ensure the repository functions correctly. Test cloning, branching, merging, and pushing to remote repositories. Verify that LFS files download correctly if you implemented large file handling.
Have team members clone the repository and attempt typical workflows. Real-world usage often reveals issues that don't appear during technical validation, particularly around workflow assumptions or tool integration.
Training Teams on Git Workflows
Technical migration success means nothing if team members can't work effectively with Git. Training and support during the transition period determine whether migration improves or disrupts productivity.
Addressing the Mental Model Shift
SVN and Git require fundamentally different mental models. SVN users think in terms of a central repository, sequential revision numbers, and optional file locking. Git requires understanding distributed repositories, commit hashes, and optimistic merging.
Effective training addresses these conceptual differences before diving into commands. Help team members understand that Git commits are local until pushed, branches are lightweight and disposable, and merge conflicts are normal rather than exceptional.
Essential Git Commands for SVN Users
Create reference materials mapping familiar SVN commands to Git equivalents. While the commands differ, the underlying operations often have clear parallels:
- svn checkout → git clone (creates local repository copy)
- svn update → git pull (fetches and merges remote changes)
- svn commit → git commit + git push (local commit, then remote sync)
- svn status → git status (shows working directory state)
- svn diff → git diff (displays uncommitted changes)
Emphasize that Git separates operations SVN combines. Committing in Git doesn't immediately share changes with teammates—pushing does. This separation enables offline work and experimental commits without affecting others.
Establishing Support Systems
Even with training, team members will encounter situations requiring help. Establish clear support channels and resources:
- 💬 Designated Git experts available for questions
- 💬 Internal documentation covering common scenarios
- 💬 Recorded training sessions for reference
- 💬 Regular office hours for Git questions
- 💬 Shared troubleshooting guides for common issues
Normalize asking questions and making mistakes. Git's power comes with complexity, and everyone needs time to build proficiency.
"The teams that succeed with Git migration are those that invest as much in people as they do in technical execution."
Maintaining Parallel Systems During Transition
Immediate cutover from SVN to Git carries risks, particularly for large teams or critical projects. Running parallel systems during a transition period provides safety while teams build Git proficiency.
Read-Only SVN Access
The simplest parallel approach maintains SVN as read-only for reference while development moves to Git. This allows team members to access historical information in familiar tools while preventing new SVN commits that would create divergence.
Configure SVN repository permissions to allow read operations but deny commits. Communicate clearly that SVN is now archival and all new work happens in Git. Provide documentation showing how to find equivalent information in Git for common SVN queries.
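One way to enforce the read-only state, as an alternative to editing access permissions, is a pre-commit hook that rejects every commit; Subversion passes a failing hook's error output back to the committing client. This is a sketch under assumptions: HOOK_DIR defaults to a temporary directory here, whereas on a real server it would be the hooks directory inside the repository.

```shell
#!/bin/sh
# HOOK_DIR is a placeholder. On a real server, point it at the
# hooks/ directory inside the SVN repository.
HOOK_DIR=${HOOK_DIR:-$(mktemp -d)}

# Write a pre-commit hook that refuses every commit. The message on
# stderr is what committers will see in their SVN client.
cat > "$HOOK_DIR/pre-commit" <<'EOF'
#!/bin/sh
echo "This repository is read-only: development has moved to Git." >&2
exit 1
EOF
chmod +x "$HOOK_DIR/pre-commit"
```

A hook like this leaves checkouts, updates, and log queries untouched, and the message gives confused committers an immediate pointer to the new location.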
Bidirectional Synchronization
More complex scenarios might require temporary bidirectional synchronization, allowing commits in either system while keeping them synchronized. The git-svn tool supports this through dcommit operations that push Git commits back to SVN.
This approach carries significant risks. Merge conflicts between systems, confused team members committing to the wrong repository, and synchronization failures can create serious problems. Only use bidirectional sync if absolutely necessary, and plan aggressive timelines for ending it.
Phased Migration by Team or Project
Large organizations might migrate incrementally, moving teams or projects to Git while others continue with SVN. This reduces risk and allows learning from early migrations before moving critical systems.
Phased migration requires careful planning around dependencies between projects. Ensure teams using Git can still integrate with SVN-based projects, either through automated synchronization or clearly defined integration points.
Leveraging Git-Specific Features Post-Migration
Once migration completes and teams adapt to basic Git workflows, organizations can leverage features impossible or impractical with SVN. These capabilities often provide the real return on migration investment.
Pull Requests and Code Review
Git's branching model enables pull request workflows where proposed changes undergo review before merging. This practice improves code quality, spreads knowledge across teams, and catches bugs before they reach main branches.
Platforms like GitHub, GitLab, and Bitbucket provide interfaces for creating, reviewing, and discussing pull requests. Establish guidelines for what requires review, how quickly reviews should happen, and what constitutes approval.
Continuous Integration and Deployment
Git's distributed nature integrates naturally with CI/CD pipelines. Every push can trigger automated builds, tests, and deployments, providing rapid feedback and enabling frequent releases.
Popular CI/CD systems like Jenkins, GitHub Actions, GitLab CI, and CircleCI all support Git-based workflows. Configure pipelines to run on pull requests, preventing broken code from merging, and automate deployments from specific branches or tags.
Advanced History Analysis
Git's history tools run against the local repository, which makes analysis fast enough to use routinely. The git blame command shows who last modified each line of code (SVN offers svn blame, but it has to query the server). The git bisect command uses binary search to identify the commit that introduced a bug. The git log command supports complex queries across repository history.
These tools transform version control from simple change tracking to active development intelligence, helping teams understand code evolution and make informed decisions.
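To make the binary search performed by git bisect concrete, the sketch below builds a five-commit toy history in a temporary directory, with the "bug" deliberately introduced in the fourth commit, and lets git bisect run find it. The file names and commit messages are invented for the demonstration.

```shell
#!/bin/sh
# Build a five-commit toy history in which commit 4 introduces a "bug".
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
for i in 1 2 3 4 5; do
    if [ "$i" -ge 4 ]; then echo "bug" > status.txt; else echo "ok" > status.txt; fi
    echo "$i" > counter.txt
    git add status.txt counter.txt
    git -c user.name=Demo -c user.email=demo@example.com commit -q -m "commit $i"
done

# Mark the newest commit bad and the oldest good, then let Git drive
# the search. The test command exits 0 (good) while status.txt does
# not contain "bug", and non-zero (bad) once it does.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" >/dev/null
git bisect run sh -c '! grep -q bug status.txt' >/dev/null

# refs/bisect/bad now names the first bad commit.
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='%h %s' "$culprit"
```

The final line prints the commit whose subject is "commit 4". In a real project the test command would be your build or test suite.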
"The real value of Git migration appears not in the first month but over the following year as teams discover capabilities that transform how they work."
Troubleshooting Common Migration Issues
Even well-planned migrations encounter problems. Understanding common issues and their solutions helps teams respond quickly rather than getting stuck or starting over.
Memory and Performance Problems
Large repository migrations sometimes exhaust available memory or run impossibly slowly. Git-svn processes each revision sequentially, and repositories with hundreds of thousands of commits can take days to convert.
Solutions include:
- Increasing available memory on the migration machine
- Using the --revision flag to limit how much history converts
- Splitting large repositories into smaller logical units
- Running migration on more powerful server hardware
- Using specialized migration services for extremely large repositories
For repositories where full history isn't critical, consider whether starting fresh with Git and archiving SVN history separately might be more practical than full migration.
Character Encoding Issues
SVN and Git handle character encodings differently, sometimes causing problems with commit messages or file content containing non-ASCII characters. These issues appear as garbled text or migration failures.
Specify encoding explicitly during migration:
git config --global i18n.commitEncoding utf-8
git config --global i18n.logOutputEncoding utf-8
For existing encoding problems in SVN history, you might need to create custom scripts that clean commit messages during migration or accept that some historical messages will display incorrectly.
Incomplete or Corrupted Migrations
Migration processes sometimes fail partway through, leaving incomplete repositories. Never use partially migrated repositories for production work—they contain incomplete history that can cause subtle problems.
If migration fails, investigate the error messages carefully. Common causes include network interruptions, insufficient permissions, or corrupted SVN revisions. Address the underlying issue, then try resuming with git svn fetch inside the partial repository. If its state is in doubt, delete the incomplete Git repository and restart migration from the beginning.
Long-Term Git Repository Maintenance
Successful migration is just the beginning. Long-term repository health requires ongoing maintenance and attention to prevent performance degradation or organizational problems.
Regular Repository Cleanup
Git repositories accumulate cruft over time—obsolete branches, unused tags, and unreferenced objects. Regular cleanup maintains performance and clarity:
git fetch --prune
git gc --aggressive
git remote prune origin
Schedule cleanup tasks quarterly or whenever repository performance noticeably degrades. Automated scripts can handle routine maintenance, but human review ensures important branches aren't accidentally removed.
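For the human review step, a list of branches that are already fully merged is a useful starting point. This is a sketch: merged_branches is a hypothetical helper, the base branch name is whatever your team uses, and the --merged option of for-each-ref requires a reasonably recent Git.

```shell
#!/bin/sh
# Print local branches whose tips are already contained in the given
# base branch, excluding the base itself. These are candidates for
# deletion, pending human review. Run from inside the repository.
# (merged_branches is a hypothetical helper for illustration.)
merged_branches() {
    base=$1
    git for-each-ref --format='%(refname:short)' --merged "$base" refs/heads |
        grep -v -x -F "$base"
}
```

Calling merged_branches main prints one branch name per line; nothing is deleted, so the list can be circulated before anyone runs git branch -d.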
Managing Repository Growth
Monitor repository size over time. Rapid growth often indicates problems like large binary files being committed or generated files not properly ignored. Address growth issues promptly before they become serious problems.
Tools like git-sizer analyze repositories and identify potential issues:
git-sizer --verbose
This tool highlights large files, deep directory structures, and other characteristics that might cause performance problems.
Evolving Workflows and Practices
Git workflows should evolve as teams grow and projects mature. Regularly review whether current practices still serve team needs or if adjustments would improve productivity.
Gather feedback from team members about workflow pain points. Common issues include overly complex branching strategies, unclear merge policies, or inadequate code review processes. Treat workflow as iterative—make small improvements regularly rather than waiting for major problems.
"The best Git workflows are those that teams actually follow, not the ones that look best in documentation."
Frequently Asked Questions
How long does SVN to Git migration typically take?
Migration duration varies dramatically based on repository size and history depth. Small repositories with a few thousand commits might convert in minutes, while large repositories with hundreds of thousands of commits can take days. The actual conversion time depends on commit count more than repository size, since git-svn processes each revision sequentially. Plan for approximately 1-2 hours per 10,000 commits as a rough estimate, though this varies significantly based on hardware and repository complexity.
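The rule of thumb above is easy to turn into a planning number. The sketch below simply applies 1 to 2 hours per 10,000 commits with integer arithmetic, rounding up; estimate_hours is a hypothetical helper, and the result is only as reliable as the rule itself.

```shell
#!/bin/sh
# Rough duration estimate from the rule of thumb:
# 1 to 2 hours per 10,000 commits, rounded up to whole hours.
# (estimate_hours is a hypothetical helper for illustration.)
estimate_hours() {
    commits=$1
    low=$(( (commits + 9999) / 10000 ))
    high=$(( (2 * commits + 9999) / 10000 ))
    echo "${low}-${high} hours"
}

estimate_hours 250000   # prints "25-50 hours"
```

The latest revision number of the SVN repository is a reasonable stand-in for the commit count when planning.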
Will we lose any history during migration?
Properly executed migrations preserve complete commit history, including all branches, tags, and revision metadata. However, some SVN-specific features don't translate directly to Git. SVN properties, locks, and certain metadata might not convert perfectly. The git-svn tool preserves revision numbers in commit messages, maintaining traceability to the original SVN repository. Test your migration thoroughly and compare the resulting Git repository with SVN to verify all important history transferred correctly.
Can we continue using SVN while testing Git?
Yes, maintaining parallel systems during transition is common and recommended. Keep SVN as the primary repository while teams test Git workflows and build proficiency. The git-svn tool supports bidirectional synchronization, allowing you to pull SVN changes into Git and push Git commits back to SVN. However, long-term parallel operation creates complexity and confusion. Plan for a defined transition period with a clear cutover date rather than indefinite parallel systems.
What happens to our SVN tags and branches?
Git-svn converts SVN branches and tags to Git remote branches during migration. These require additional conversion steps to become proper Git branches and tags. SVN branches become Git branches through simple commands that create local branches from remote references. SVN tags need conversion to proper Git tags since Git treats tags as immutable references rather than branches. Post-migration cleanup scripts handle these conversions, resulting in a repository structure that follows Git conventions.
Do we need to migrate our entire SVN history?
No, you can choose to migrate only recent history if complete history isn't valuable or causes practical problems. Use the --revision flag with git-svn to specify how far back migration should go. Some organizations migrate only the last year or two of history, archiving older SVN history for reference but not converting it to Git. This approach dramatically reduces migration time and resulting repository size, though it means older history remains accessible only through SVN tools.
How do we handle large binary files in our SVN repository?
Large binary files require special handling since Git performs poorly with large binaries in repository history. Implement Git LFS (Large File Storage) to store large files externally while maintaining references in the repository. For files already in the migrated history, use git lfs migrate to rewrite history and convert large files to LFS pointers. Alternatively, consider whether large binaries belong in version control at all: artifact repositories or cloud storage might be more appropriate for compiled binaries, media assets, or other large files.