Creating Archives
How to build blob archives from directories for storage in OCI registries.
Using CreateBlob (Recommended)
For file-based archives, CreateBlob is the simplest approach. It creates the archive files and returns an open BlobFile ready for use:
import (
	"context"

	"github.com/meigma/blob"
)

func createArchive(srcDir, destDir string) (*blob.BlobFile, error) {
	return blob.CreateBlob(context.Background(), srcDir, destDir,
		blob.CreateBlobWithCompression(blob.CompressionZstd),
	)
}
This creates index.blob and data.blob in destDir and returns an open archive. Remember to close it when done:
blobFile, err := blob.CreateBlob(ctx, srcDir, destDir)
if err != nil {
	return err
}
defer blobFile.Close()

// Use the archive immediately
content, err := blobFile.ReadFile("config.json")
Custom Filenames
Override the default filenames with options:
blobFile, err := blob.CreateBlob(ctx, srcDir, destDir,
	blob.CreateBlobWithIndexName("my-archive.idx"),
	blob.CreateBlobWithDataName("my-archive.dat"),
)
Saving an Existing Blob
To save an in-memory or remote Blob to local files:
// archive is a *blob.Blob from any source
err := archive.Save("/path/to/index.blob", "/path/to/data.blob")
Using Create (Advanced)
The lower-level Create function provides more control when you need to:
- Write to non-file destinations (network streams, cloud storage)
- Handle index and data separately
- Integrate with custom I/O pipelines
Basic Usage
To create an archive, provide a source directory and writers for the index and data:
import (
	"context"
	"os"

	"github.com/meigma/blob"
)

func createArchive(srcDir string) error {
	indexFile, err := os.Create("archive.index")
	if err != nil {
		return err
	}
	defer indexFile.Close()

	dataFile, err := os.Create("archive.data")
	if err != nil {
		return err
	}
	defer dataFile.Close()

	return blob.Create(context.Background(), srcDir, indexFile, dataFile)
}
The function walks the source directory recursively, writing file contents to the data writer and metadata to the index writer. Files are written in path-sorted order to enable efficient directory fetches.
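One way to see why path-sorted order helps: in a sorted list, every entry under a directory sits in one contiguous run that a binary search can locate. The following is a minimal sketch using only the standard library. It assumes that a directory fetch amounts to finding that run; dirRange is a hypothetical helper and not part of the blob API.

```go
package main

import (
	"sort"
	"strings"
)

// dirRange returns the half-open range [start, end) of entries in sorted
// that live under dir. It assumes sorted is in ascending byte order and
// uses forward slashes. Hypothetical helper, not part of the blob API.
func dirRange(sorted []string, dir string) (start, end int) {
	prefix := strings.TrimSuffix(dir, "/") + "/"
	// Binary search for the first path that is >= the directory prefix.
	start = sort.SearchStrings(sorted, prefix)
	end = start
	// Every path sharing the prefix follows immediately in sorted order.
	for end < len(sorted) && strings.HasPrefix(sorted[end], prefix) {
		end++
	}
	return start, end
}
```

For the sorted list a.txt, cfg-extra.txt, cfg/a.json, cfg/b.json, docs/x.md, calling dirRange(paths, "cfg") yields the range [2, 4), covering only the two files inside cfg.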
Compression
To enable zstd compression, use CreateWithCompression:
err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithCompression(blob.CompressionZstd),
)
Compression reduces data size but requires decompression when reading. For typical source code and configuration files, expect 2-4x compression ratios.
Available compression options:
- blob.CompressionNone - Store files uncompressed (default)
- blob.CompressionZstd - Use zstd compression
Skipping Compression
Some files compress poorly because they are already compressed (images, videos, archives) or are too small to benefit. Use CreateWithSkipCompression to store these files uncompressed:
err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithCompression(blob.CompressionZstd),
	blob.CreateWithSkipCompression(blob.DefaultSkipCompression(1024)),
)
DefaultSkipCompression(minSize) returns a predicate that skips compression for:
- Files smaller than minSize bytes
- Files with known compressed extensions (.jpg, .png, .zip, .gz, etc.)
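The two rules above can be sketched with the standard library alone. This is an illustration of the described behavior, not the library's implementation: the extension set is limited to the four extensions named above, case-insensitive matching is an assumption, and shouldSkip and sketchSkip are hypothetical names.

```go
package main

import (
	"io/fs"
	"path/filepath"
	"strings"
)

// compressedExts is an illustrative subset; the library's real list may differ.
var compressedExts = map[string]bool{
	".jpg": true, ".png": true, ".zip": true, ".gz": true,
}

// shouldSkip reports whether a file of the given size should be stored
// uncompressed: either it is smaller than minSize bytes or its extension
// indicates already-compressed content. Matching the extension without
// regard to case is an assumption of this sketch.
func shouldSkip(path string, size, minSize int64) bool {
	if size < minSize {
		return true
	}
	return compressedExts[strings.ToLower(filepath.Ext(path))]
}

// sketchSkip wraps shouldSkip in the predicate shape used on this page.
func sketchSkip(minSize int64) func(path string, info fs.FileInfo) bool {
	return func(path string, info fs.FileInfo) bool {
		return shouldSkip(path, info.Size(), minSize)
	}
}
```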
Custom Skip Predicates
To define custom skip logic, pass additional predicates:
// Skip lock files and generated code.
// This snippet needs the "io/fs" and "strings" imports.
skipGenerated := func(path string, info fs.FileInfo) bool {
	return strings.HasSuffix(path, ".lock") ||
		strings.Contains(path, "/generated/")
}

err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithCompression(blob.CompressionZstd),
	blob.CreateWithSkipCompression(
		blob.DefaultSkipCompression(1024),
		skipGenerated,
	),
)
If any predicate returns true, the file is stored uncompressed.
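The "any predicate returns true" rule is a logical OR over the predicates. A minimal sketch of that combination follows, written against the predicate signature shown above. The names skipFunc, anyOf, and bySuffix are hypothetical and exist only for this illustration.

```go
package main

import (
	"io/fs"
	"strings"
)

// skipFunc mirrors the predicate signature used on this page.
type skipFunc func(path string, info fs.FileInfo) bool

// anyOf returns a predicate that is true as soon as one of preds is true.
// With no predicates it always returns false, so nothing is skipped.
func anyOf(preds ...skipFunc) skipFunc {
	return func(path string, info fs.FileInfo) bool {
		for _, p := range preds {
			if p(path, info) {
				return true
			}
		}
		return false
	}
}

// bySuffix is a hypothetical helper that matches paths ending in suffix.
func bySuffix(suffix string) skipFunc {
	return func(path string, _ fs.FileInfo) bool {
		return strings.HasSuffix(path, suffix)
	}
}
```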
Change Detection
For build pipelines, enable strict change detection to catch files that change during archive creation:
err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithChangeDetection(blob.ChangeDetectionStrict),
)
With strict change detection, Create verifies that each file's size and modification time are unchanged after the file has been read. If a file changes while it is being archived, Create returns an error rather than producing an archive with inconsistent content.
Change detection modes:
- blob.ChangeDetectionNone - No verification (default, fewer syscalls)
- blob.ChangeDetectionStrict - Verify files did not change during creation
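The strict check can be approximated with two stat calls around the read, which also shows where the extra syscalls come from. This is a minimal sketch of the idea and not the library's code: readStable, stamp, and errChanged are hypothetical names, and the real error value may differ.

```go
package main

import (
	"errors"
	"io/fs"
	"os"
)

// errChanged is a hypothetical sentinel; the library's real error may differ.
var errChanged = errors.New("file changed during read")

// stamp captures the two attributes that strict mode is described as comparing.
type stamp struct {
	size    int64
	modNano int64
}

// stampOf extracts size and modification time from a FileInfo.
func stampOf(info fs.FileInfo) stamp {
	return stamp{size: info.Size(), modNano: info.ModTime().UnixNano()}
}

// readStable reads path and reports errChanged if size or modification time
// differ between the stat taken before the read and the one taken after it.
func readStable(path string) ([]byte, error) {
	before, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	after, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	if stampOf(before) != stampOf(after) {
		return nil, errChanged
	}
	return data, nil
}
```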
File Limits
To protect against runaway archive creation, limit the number of files:
// Allow up to 50,000 files
err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithMaxFiles(50000),
)
If the source directory contains more files than the limit, Create returns blob.ErrTooManyFiles.
Special values:
- 0 - Use the default limit (200,000 files)
- Negative values - No limit
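The special values can be read as a small mapping from the option value to an effective limit. The sketch below restates those rules in standalone Go. It is an interpretation of the text above, with errTooManyFiles standing in for blob.ErrTooManyFiles and effectiveLimit and checkCount being hypothetical helpers.

```go
package main

import "errors"

// defaultMaxFiles matches the default limit stated above.
const defaultMaxFiles = 200000

// errTooManyFiles stands in for blob.ErrTooManyFiles in this sketch.
var errTooManyFiles = errors.New("too many files")

// effectiveLimit maps the option value to a limit. ok is false when the
// value disables the limit entirely.
func effectiveLimit(maxFiles int) (limit int, ok bool) {
	switch {
	case maxFiles < 0:
		return 0, false
	case maxFiles == 0:
		return defaultMaxFiles, true
	default:
		return maxFiles, true
	}
}

// checkCount returns errTooManyFiles once count exceeds the effective limit.
func checkCount(count, maxFiles int) error {
	if limit, ok := effectiveLimit(maxFiles); ok && count > limit {
		return errTooManyFiles
	}
	return nil
}
```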
Memory Considerations
Create builds the entire index in memory before writing. Memory usage scales with the number of files and average path length.
Rough guide:
- 10,000 files: ~3-5 MB
- 100,000 files: ~30-50 MB
- 200,000 files: ~60-100 MB
For archives approaching the default 200,000 file limit, ensure the build environment has sufficient memory (256 MB+ recommended).
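The rough guide works out to about 300-500 bytes of index memory per file (3-5 MB divided by 10,000 files). The sketch below turns that into a quick estimate for other file counts. The per-file constants are derived from the guide above rather than measured, and estimateIndexBytes is a hypothetical helper.

```go
package main

// Per-file index cost implied by the rough guide (3-5 MB per 10,000 files,
// using decimal megabytes). Back-of-envelope figures, not measurements.
const (
	lowBytesPerFile  = 300
	highBytesPerFile = 500
)

// estimateIndexBytes returns a low and a high estimate, in bytes, of the
// memory needed to hold the index for the given number of files.
func estimateIndexBytes(files int) (low, high int) {
	return files * lowBytesPerFile, files * highBytesPerFile
}
```

Actual usage depends on average path length, so treat the result as an order-of-magnitude check rather than a budget.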
Cancellation
Pass a context to support cancellation of long-running archive creation:
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()

err := blob.Create(ctx, srcDir, indexW, dataW,
	blob.CreateWithCompression(blob.CompressionZstd),
)
if errors.Is(err, context.DeadlineExceeded) {
	// Archive creation timed out
}
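For intuition about what cancellation means for a directory walk, the sketch below checks the context before handling each entry and stops as soon as the deadline has passed. It illustrates the general pattern with the standard library only. How Create checks its context internally is not specified here, and countFiles and timedOut are hypothetical helpers.

```go
package main

import (
	"context"
	"errors"
	"io/fs"
	"path/filepath"
	"time"
)

// countFiles walks root and counts regular files, stopping early once the
// timeout elapses. Hypothetical helper for illustration only.
func countFiles(root string, timeout time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	count := 0
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Check for cancellation before handling each entry.
		if ctxErr := ctx.Err(); ctxErr != nil {
			return ctxErr
		}
		if d.Type().IsRegular() {
			count++
		}
		return nil
	})
	return count, err
}

// timedOut reports whether err was caused by the deadline expiring.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}
```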
Complete Examples
Using CreateBlob
A production archive creation function with CreateBlob:
func createProductionArchive(srcDir, destDir string) (*blob.BlobFile, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	return blob.CreateBlob(ctx, srcDir, destDir,
		blob.CreateBlobWithCompression(blob.CompressionZstd),
		blob.CreateBlobWithSkipCompression(blob.DefaultSkipCompression(1024)),
		blob.CreateBlobWithChangeDetection(blob.ChangeDetectionStrict),
		blob.CreateBlobWithMaxFiles(100000),
	)
}
Using Create
A production archive creation function with the lower-level Create API:
func createProductionArchive(srcDir, indexPath, dataPath string) (err error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	indexFile, err := os.Create(indexPath)
	if err != nil {
		return fmt.Errorf("create index file: %w", err)
	}
	// A failed close can signal that written data did not reach disk,
	// so report it unless an earlier error is already being returned.
	defer func() {
		if cerr := indexFile.Close(); cerr != nil && err == nil {
			err = fmt.Errorf("close index file: %w", cerr)
		}
	}()

	dataFile, err := os.Create(dataPath)
	if err != nil {
		return fmt.Errorf("create data file: %w", err)
	}
	defer func() {
		if cerr := dataFile.Close(); cerr != nil && err == nil {
			err = fmt.Errorf("close data file: %w", cerr)
		}
	}()

	err = blob.Create(ctx, srcDir, indexFile, dataFile,
		blob.CreateWithCompression(blob.CompressionZstd),
		blob.CreateWithSkipCompression(blob.DefaultSkipCompression(1024)),
		blob.CreateWithChangeDetection(blob.ChangeDetectionStrict),
		blob.CreateWithMaxFiles(100000),
	)
	if err != nil {
		return fmt.Errorf("create archive: %w", err)
	}
	return nil
}
See Also
- Architecture - How the archive format works
- Integrity - How content verification works