In the world of software development and system administration, managing file versions is a critical but often overlooked task. Whether you’re dealing with configuration files, documents, or any other type of file that changes over time, having a reliable versioning system can save you from countless headaches. Today, I’m excited to introduce py-file-versioning, a powerful and flexible file versioning system written in Python.

What is py-file-versioning?

py-file-versioning is a Python library that provides a straightforward yet powerful approach to file versioning. It’s designed to be both simple enough for basic use cases and flexible enough for more complex scenarios. The library is available on PyPI and can be easily installed using pip.

Key Features

1. Automatic Version Management

The library automatically handles version numbering and organization. Each version is stored with a timestamp and a sequence number, so every file version gets a unique name. You don’t need to worry about naming conventions or version conflicts; the system handles both for you.

2. Flexible Compression Options

One standout feature is the built-in support for multiple compression formats:

  • gzip
  • bzip2
  • xz
  • uncompressed (if you prefer raw files)

This flexibility allows you to balance storage space against access speed based on your specific needs.
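
To get a feel for the space side of that trade-off, here is a minimal sketch that compresses one sample payload with each format using Python’s standard-library `gzip`, `bz2`, and `lzma` modules. The payload is made up for illustration, and this is not how py-file-versioning itself is implemented; actual ratios depend heavily on your data.

```python
import bz2
import gzip
import lzma

# Hypothetical sample payload: a repetitive INI-style config of roughly 6 KB.
payload = b"[section]\nkey = value\nother_key = other_value\n" * 128

# Map each format name to a standard-library one-shot compressor.
compressors = {
    "uncompressed": lambda data: data,
    "gzip": gzip.compress,
    "bzip2": bz2.compress,
    "xz": lzma.compress,
}

# Compress the same bytes with every format and record the resulting size.
sizes = {name: len(compress(payload)) for name, compress in compressors.items()}

for name, size in sizes.items():
    print(f"{name:>12}: {size} bytes")
```

Highly repetitive text like this shrinks dramatically in every format, while small or already-compressed files may not benefit at all, which is where the uncompressed option earns its place.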

3. Configurable Timestamps

The system supports both UTC and local time timestamps, and you can choose whether to use the file’s modification time or the current time when creating versions. This level of control is particularly useful when maintaining audit trails or when working across different time zones.
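
The two choices (which clock, which time zone) can be sketched with the standard library alone. The function name `version_timestamp` and its `use_mtime` / `use_utc` parameters are hypothetical and are not the library’s configuration options; the `YYYYMMDD.HHMMSS` layout is inferred from the example filenames in this post.

```python
import os
import tempfile
import time
from datetime import datetime, timezone


def version_timestamp(path, use_mtime=True, use_utc=True):
    """Format the file's modification time (or the current time) as YYYYMMDD.HHMMSS."""
    seconds = os.path.getmtime(path) if use_mtime else time.time()
    # tz=None makes fromtimestamp() convert to the machine's local time zone.
    zone = timezone.utc if use_utc else None
    return datetime.fromtimestamp(seconds, tz=zone).strftime("%Y%m%d.%H%M%S")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.ini")
    with open(path, "w") as handle:
        handle.write("[server]\nport = 8080\n")
    # Pin the modification time to 2024-01-20 14:31:56 UTC so the result is reproducible.
    pinned = datetime(2024, 1, 20, 14, 31, 56, tzinfo=timezone.utc).timestamp()
    os.utime(path, (pinned, pinned))
    stamp = version_timestamp(path, use_mtime=True, use_utc=True)

print(stamp)  # 20240120.143156
```

Using the modification time keeps the version name tied to when the content actually changed, whereas the current time records when the snapshot was taken.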

4. Version Retention and Cleanup

You can set a maximum number of versions to keep per file, and the system will automatically handle cleanup of older versions. This prevents unchecked growth of your version directory while maintaining your most recent history.
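
Conceptually, this kind of cleanup is a "keep the newest N" rule. The sketch below shows one way to express it with `pathlib`; `prune_versions` is a hypothetical helper, not part of py-file-versioning, and it assumes that version names sort chronologically because they embed a timestamp and sequence number.

```python
import tempfile
from pathlib import Path


def prune_versions(version_dir, stem, max_count):
    """Delete the oldest versions of `stem` so that at most `max_count` remain."""
    # Names embed timestamp and sequence, so sorting by name puts the oldest first.
    versions = sorted(Path(version_dir).glob(f"{stem}--*"), key=lambda p: p.name)
    excess = versions[: max(0, len(versions) - max_count)]
    for old_version in excess:
        old_version.unlink()
    return [old_version.name for old_version in excess]


with tempfile.TemporaryDirectory() as tmp:
    # Create four empty-ish demo versions, oldest to newest.
    for stamp in (
        "20240118.090000_001",
        "20240119.090000_001",
        "20240120.143156_001",
        "20240120.143156_002",
    ):
        (Path(tmp) / f"config--{stamp}.ini").write_text("demo")

    removed = prune_versions(tmp, "config", max_count=2)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print("removed:  ", removed)
print("remaining:", remaining)
```

A real implementation would need to be more careful about matching (for example, a different file whose name also starts with `config--`), so treat this as an illustration of the idea only.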

Real-World Usage Examples

Let’s look at some practical examples of how you might use py-file-versioning.

Command Line Usage

# Create a version of a configuration file
$ pyfileversioning create config.ini -d backups -c xz
Created version: backups/config--20240120.143156_001.ini.xz

# List all versions
$ pyfileversioning list config.ini -d backups
Versions for config.ini:
------------------------------------------------------------
config--20240120.143156_001.ini.xz   285 bytes  2024-01-20 14:31:56

# Restore a specific version
$ pyfileversioning restore backups/config--20240120.143156_001.ini.xz --target config.restored.ini
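
If you are curious what restoring a compressed version boils down to, the sketch below decompresses an `.xz` file into a target path using only the standard library. `restore_xz` is a hypothetical helper written for this post, not the code behind the `restore` command, and the file contents are invented for the demo.

```python
import lzma
import shutil
import tempfile
from pathlib import Path


def restore_xz(version_path, target_path):
    """Decompress an .xz version file into target_path, streaming in chunks."""
    with lzma.open(version_path, "rb") as source, open(target_path, "wb") as target:
        shutil.copyfileobj(source, target)


with tempfile.TemporaryDirectory() as tmp:
    original = b"[server]\nport = 8080\n"

    # Simulate a stored version by writing xz-compressed bytes to disk.
    version = Path(tmp) / "config--20240120.143156_001.ini.xz"
    version.write_bytes(lzma.compress(original))

    restored = Path(tmp) / "config.restored.ini"
    restore_xz(version, restored)
    restored_bytes = restored.read_bytes()

print(restored_bytes == original)  # True
```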

Python API Usage

from py_file_versioning import FileVersioning, FileVersioningConfig, CompressionType

# Create a versioning instance with custom configuration
config = FileVersioningConfig(
    versioned_path="backups",
    compression=CompressionType.XZ,
    max_count=5  # Keep only last 5 versions
)
versioning = FileVersioning(config)

# Create a new version
version_path = versioning.create_version("myfile.txt")

# List all versions
versions = versioning.list_versions("myfile.txt")
for version in versions:
    print(f"{version.filename}: {version.size} bytes, created at {version.timestamp}")

Use Cases

py-file-versioning is particularly useful in several scenarios:

  1. Configuration Management

    • Keep track of changes to configuration files
    • Easily roll back to previous configurations if needed
    • Maintain an audit trail of configuration changes
  2. Development and Testing

    • Version test data files
    • Maintain different versions of input/output files
    • Track changes to documentation or specification files
  3. System Administration

    • Back up and version critical system files
    • Maintain a history of log file snapshots
    • Version cron job or script files
  4. Document Management

    • Keep versions of important documents
    • Track changes to text-based files
    • Maintain backup copies with compression

Design Philosophy

The library is built with several key principles in mind:

  1. Simplicity: The API is straightforward and intuitive, making it easy to integrate into existing workflows.
  2. Flexibility: Extensive configuration options allow you to tailor the behavior to your needs.
  3. Reliability: Robust error handling and clear feedback ensure dependable operation.
  4. Performance: Optional compression and automatic cleanup help manage storage efficiently.

File Naming Convention

One of the thoughtful aspects of py-file-versioning is its clear and logical file naming convention:

{original_name}--{timestamp}_{sequence}{extension}[.compression_ext]

For example: config--20240120.143156_001.ini.gz

This naming scheme ensures:

  • Original filename is preserved
  • Timestamps are human-readable
  • Sequence numbers prevent conflicts
  • Compression type is clearly indicated
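
To make the pattern concrete, here is a small sketch that builds a name in this format and parses one back into its parts. Both `build_version_name` and `parse_version_name` are hypothetical helpers, and the three-digit sequence and the `.gz` / `.bz2` / `.xz` suffixes are assumptions based on the examples in this post, not a description of the library’s internals.

```python
import re
from pathlib import Path

# Assumed mapping from compression name to file suffix.
COMPRESSION_EXT = {"none": "", "gzip": ".gz", "bzip2": ".bz2", "xz": ".xz"}

VERSION_RE = re.compile(
    r"^(?P<name>.+)--(?P<timestamp>\d{8}\.\d{6})_(?P<sequence>\d{3})"
    r"(?P<extension>\.[^.]+)?(?P<compression>\.(?:gz|bz2|xz))?$"
)


def build_version_name(filename, timestamp, sequence, compression="none"):
    """Compose {original_name}--{timestamp}_{sequence}{extension}[.compression_ext]."""
    path = Path(filename)
    suffix = COMPRESSION_EXT[compression]
    return f"{path.stem}--{timestamp}_{sequence:03d}{path.suffix}{suffix}"


def parse_version_name(version_name):
    """Split a version filename into its components, or raise ValueError."""
    match = VERSION_RE.match(version_name)
    if match is None:
        raise ValueError(f"not a version filename: {version_name!r}")
    parts = match.groupdict()
    parts["sequence"] = int(parts["sequence"])
    return parts


name = build_version_name("config.ini", "20240120.143156", 1, compression="gzip")
parts = parse_version_name(name)

print(name)  # config--20240120.143156_001.ini.gz
print(parts)
```

One caveat of this sketch: a file with no extension of its own, stored with compression, ends in a single suffix such as `.gz`, and the parser reports that suffix as the extension rather than the compression type.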

Getting Started

To start using py-file-versioning, simply install it using pip:

pip install py-file-versioning

The library works on all major platforms and requires nothing beyond the Python standard library.

py-file-versioning vs Git: When to Use Each

Both py-file-versioning and Git keep a history of changes to your files, but they serve different purposes and excel in different scenarios. Here’s when to consider each:

When to Use py-file-versioning

  1. Single File Focus

    • When you need to version individual files independently
    • For files that change frequently but don’t need commit messages or branching
    • When you want automated versioning without manual intervention
  2. System Configuration

    • Managing server configuration files
    • Maintaining backup copies of critical system files
    • When you need automatic compression of versioned files
  3. Non-Developer Usage

    • When users aren’t familiar with Git commands
    • For simple version management without branching complexity
    • When you need a straightforward API for integration into other tools
  4. Automated Systems

    • For automated backup systems
    • When integrating file versioning into Python applications
    • When you need programmatic version management

When to Use Git

  1. Source Code Management

    • For managing entire codebases
    • When you need branching and merging capabilities
    • For collaborative development with multiple contributors
  2. Project History

    • When you need detailed commit messages
    • For tracking changes across multiple files simultaneously
    • When you need to understand why changes were made
  3. Collaboration

    • For team-based development
    • When you need pull requests and code review
    • For managing concurrent changes from multiple developers
  4. Feature Development

    • When you need feature branches
    • For managing different versions of your entire project
    • When you need to maintain multiple release versions

Complementary Use

In many cases, you might use both tools together:

  • Use Git for your source code
  • Use py-file-versioning for managing configuration files, logs, or data files
  • In short, let Git handle project-wide version control and py-file-versioning handle specific file management needs

The key difference is that py-file-versioning is designed for simpler, file-centric version management with features like automatic compression and cleanup, while Git is a full-featured version control system designed for managing entire projects and facilitating collaboration.

Conclusion

py-file-versioning fills an important niche in the Python ecosystem by providing a simple yet powerful approach to file versioning. Whether you’re managing configuration files, maintaining document versions, or looking for a reliable way to track file changes over time, this library offers a robust solution that’s easy to integrate into your workflow.

The combination of a clean API, flexible configuration options, and built-in compression support makes it a valuable tool for developers, system administrators, and anyone else who needs to maintain file versions in a systematic way.

Check out the project on GitHub to learn more, contribute, or provide feedback!