Tuesday, April 4, 2023

Comparison of par2cmdline, par2cmdline-turbo, klauspost/reedsolomon, and catid/wirehair

Programs tested: parchive/par2cmdline, animetosho/par2cmdline-turbo, klauspost/reedsolomon, catid/wirehair

=== 1 GB file with 1100 shards ===

1000 file shards, 100 recovery shards

par2cmdline: 30 seconds, 0.1 GB memory usage

par2cmdline-turbo: 5 seconds, 0.12 GB memory usage

wirehair: 4 seconds, 2.1 GB memory usage

reedsolomon: 4 seconds, 1.5 GB memory usage

The results of the test above show that par2cmdline-turbo is the clear overall winner: it is nearly as fast as the fastest encoders while using only a small fraction of their memory.

The memory usage of wirehair is due to the storage of the file in memory as well as the creation of the encoder. The 1GB file itself takes up 1GB of memory, and then the encoder also takes up another 1GB, for a total of 2GB. 

The memory usage of Klaus Post's reed solomon implementation is due to the file itself as well as the encoder. The reed solomon encoding matrix takes around 0.1GB, then the file itself takes around 1GB, and then encoding the shards takes around 0.4GB, for a total of 1.5GB.

All in all, wirehair and reedsolomon take similar amounts of time, though reedsolomon uses somewhat less memory. Both are much faster than par2cmdline but also use much more memory.

Note: klauspost/reedsolomon does support progressive encoding, but only for the Regular code (which supports up to 256 shards) not the Leopard-RS code (which supports up to 65536 shards). This means that if you have more than 256 shards then you cannot do progressive encoding. See source code here (https://github.com/klauspost/reedsolomon/blob/4e9455c045bba7f15065ef2f216d488866decf2b/leopard.go):

func (r *leopardFF16) EncodeIdx(dataShard []byte, idx int, parity [][]byte) error {
	return ErrNotSupported
}
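
The reason progressive encoding is possible at all (where supported) is that each parity shard is a linear combination of the data shards, so parity can be accumulated one data shard at a time instead of holding the whole file in memory. A minimal stdlib-only sketch using a single XOR parity shard (the simplest possible erasure code, not reedsolomon's actual field arithmetic) illustrates the idea:

```go
package main

import (
	"bytes"
	"fmt"
)

// xorInto accumulates one data shard into the parity buffer.
// Because XOR parity is a linear function of the data shards,
// it can be built progressively, one shard at a time.
func xorInto(parity, shard []byte) {
	for i := range shard {
		parity[i] ^= shard[i]
	}
}

func main() {
	shards := [][]byte{
		{1, 2, 3, 4},
		{5, 6, 7, 8},
		{9, 10, 11, 12},
	}

	// Progressive path: parity starts zeroed, shards folded in one by one.
	parity := make([]byte, 4)
	for _, s := range shards {
		xorInto(parity, s)
	}

	// All-at-once path for comparison.
	want := make([]byte, 4)
	for _, s := range shards {
		for i := range s {
			want[i] ^= s[i]
		}
	}
	fmt.Println(bytes.Equal(parity, want)) // true
}
```

This is the same property EncodeIdx relies on: each call folds one data shard's contribution into the parity buffers, which must start zeroed.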

As for wirehair, the author said that there is no way to do progressive encoding:

https://github.com/catid/wirehair/issues/30

Q: Is there any way to not have to load entire files into ram?

catid: I think you'd have to use swap for that or split it into separate pieces. The row/column mixing travels all over the dataset. Might be improved a bit but at the end of the day requires everything to produce the intermediate symbols and then produce output!

=== 1 GB file with 220 shards ===

200 file shards, 20 recovery shards

par2cmdline: 14 seconds, 0.12 GB

klauspost/reedsolomon: 10 seconds, 0.12 GB

par2cmdline-turbo: 5 seconds, 0.15 GB

For klauspost/reedsolomon I used progressive encoding here, which is why its memory usage is so much lower than in the first test.

In this scenario par2cmdline is only a bit slower than reedsolomon while using the same amount of RAM. In other words, reedsolomon's progressive mode matches par2cmdline's memory usage, but it is much slower than reedsolomon's in-memory mode, to the point where it is not much faster than par2cmdline.

Also wow, par2cmdline-turbo is so much faster than par2cmdline it's crazy: almost 3 times faster here! It does seem to use slightly more RAM, but not by much.

I also ran par2cmdline-turbo on the 1 GB file with a range of shard counts and found that the running time grows faster than linearly:

100 shards: 3 seconds

200 shards: 3.2 seconds

1,000 shards: 4 seconds

1,500 shards: 5 seconds

2,000 shards: 7 seconds

5,000 shards: 24 seconds

10,000 shards: 85 seconds

20,000 shards: 179 seconds
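
A quick way to see the super-linear growth is to normalize the measurements above to milliseconds per shard. The sketch below just replays the numbers from the list; `msPerShard` is a throwaway helper, not part of any of the tested tools:

```go
package main

import "fmt"

// msPerShard converts a total running time in seconds into
// milliseconds per shard.
func msPerShard(secs float64, shards int) float64 {
	return secs * 1000 / float64(shards)
}

func main() {
	// Measured par2cmdline-turbo running times from the list above.
	shards := []int{100, 200, 1000, 1500, 2000, 5000, 10000, 20000}
	secs := []float64{3, 3.2, 4, 5, 7, 24, 85, 179}

	// If the running time were linear in the shard count, ms/shard
	// would level off. Instead it falls at first (fixed startup cost
	// spread over more shards) and then climbs again, which is the
	// faster-than-linear growth visible at large shard counts.
	for i := range shards {
		fmt.Printf("%6d shards: %5.2f ms/shard\n",
			shards[i], msPerShard(secs[i], shards[i]))
	}
}
```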

Here's the code I used for testing Klaus Post's reedsolomon progressive encoding. It is basically a modified version of simple-encoder.go from the examples directory; you can run it by replacing the contents of simple-encoder.go with the code below:

//go:build ignore
// +build ignore

// Copyright 2015, Klaus Post, see LICENSE for details.
//
// Simple encoder example
//
// The encoder encodes a single file into a number of shards
// To reverse the process see "simpledecoder.go"
//
// To build an executable use:
//
// go build simple-encoder.go
//
// Simple Encoder/Decoder Shortcomings:
// * If the file size of the input isn't divisible by the number of data shards
//   the output will contain extra zeroes
//
// * If the shard numbers aren't the same for the decoder as in the
//   encoder, invalid output will be generated.
//
// * If values have changed in a shard, it cannot be reconstructed.
//
// * If two shards have been swapped, reconstruction will always fail.
//   You need to supply the shards in the same order as they were given to you.
//
// The solution for this is to save a metadata file containing:
//
// * File size.
// * The number of data/parity shards.
// * HASH of each shard.
// * Order of the shards.
//
// If you save these properties, you should be able to detect file corruption
// in a shard and be able to reconstruct your data if you have the needed number of shards left.

package main

import (
	"bufio"
	"flag"
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
	"time"

	"github.com/klauspost/reedsolomon"
)

var dataShards = flag.Int("data", 200, "Number of shards to split the data into, must be below 257.")
var parShards = flag.Int("par", 20, "Number of parity shards")
var outDir = flag.String("out", "", "Alternative output directory")

func init() {
	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "Usage of %s:\n", os.Args[0])
		fmt.Fprintf(os.Stderr, "  simple-encoder [-flags] filename.ext\n\n")
		fmt.Fprintf(os.Stderr, "Valid flags:\n")
		flag.PrintDefaults()
	}
}

func main() {
	start := time.Now()
	// Parse command line parameters.
	flag.Parse()
	args := flag.Args()
	if len(args) != 1 {
		fmt.Fprintf(os.Stderr, "Error: No input filename given\n")
		flag.Usage()
		os.Exit(1)
	}
	fname := args[0]

	// Create encoding matrix.
	enc, err := reedsolomon.New(*dataShards, *parShards)
	checkErr(err)
	fmt.Println("Finished creating reedsolomon encoding matrix.")

	// Open the file for reading
	f, err := os.Open(fname)
	checkErr(err)
	defer f.Close()

	dir, file := filepath.Split(fname)
	if *outDir != "" {
		dir = *outDir
	}
	// calculate the shard size
	fi, err := f.Stat()
	checkErr(err)
	fmt.Printf("The file is %d bytes long\n", fi.Size())

	filesize := fi.Size()
	if filesize%int64(*dataShards) != 0 {
		panic("file size must be a multiple of the number of data shards")
	}
	shardsize := filesize / int64(*dataShards)
	fmt.Println("shard size:", shardsize)

	// create the parity shard buffer (must start zeroed for EncodeIdx)
	parity := make([][]byte, *parShards)
	for i := range parity {
		parity[i] = make([]byte, shardsize)
	}

	// create the parity shards
	buf := make([]byte, shardsize)
	index := 0
	r := bufio.NewReader(f)
	var totalEncoding time.Duration
	elapsed := time.Since(start)
	fmt.Println("time elapsed until starting to encode file:", elapsed)
	for {
		start := time.Now()

		// read in a chunk (bufio.Reader.Read can in principle return a
		// short read; for a regular file we rely on full reads and
		// treat any error as end-of-file)
		n, readErr := r.Read(buf)
		if readErr != nil || n == 0 {
			fmt.Println("all done!")
			break
		}
		// write it out into a file
		outfn := fmt.Sprintf("%s.%d", file, index)
		err = os.WriteFile(filepath.Join(dir, outfn), buf, 0644)
		checkErr(err)

		elapsed := time.Since(start)
		fmt.Println("reading + writing time:", elapsed)
		start = time.Now()

		// encode the chunk, folding it into the parity buffers
		err = enc.EncodeIdx(buf, index, parity)
		checkErr(err)
		index++

		elapsed = time.Since(start)
		totalEncoding += elapsed
		fmt.Println("encoding time for shard", index, ":", elapsed)
	}
	start = time.Now()

	// Write out parity shards into files.
	for i, shard := range parity {
		outfn := fmt.Sprintf("%s.%d", file, *dataShards+i)

		fmt.Println("Writing to", outfn)
		err = ioutil.WriteFile(filepath.Join(dir, outfn), shard, 0644)
		checkErr(err)
	}

	elapsed = time.Since(start)
	fmt.Println("time taken to write out parity files:", elapsed)
	fmt.Println("total time spent on encoding:", totalEncoding)

}

func checkErr(err error) {
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error: %s", err.Error())
		os.Exit(2)
	}
}

Output:

reedsolomon/examples$ time go run simple-encoder.go ~/Downloads/1GB.zip 
Finished creating reedsolomon encoding matrix.
The file is 1073741800 bytes long
shard size: 5368709
time elapsed until starting to encode file: 65.674005ms
reading + writing time: 8.242166ms
encoding time for shard 1 : 133.489673ms
reading + writing time: 8.663781ms
encoding time for shard 2 : 32.475082ms
reading + writing time: 11.068881ms
encoding time for shard 3 : 39.272467ms
reading + writing time: 15.424516ms
encoding time for shard 4 : 33.573678ms
reading + writing time: 19.12296ms
encoding time for shard 5 : 39.149728ms
reading + writing time: 18.893182ms
encoding time for shard 6 : 38.416607ms
reading + writing time: 10.466169ms
... lines omitted for brevity ...
encoding time for shard 195 : 39.498974ms
reading + writing time: 11.668383ms
encoding time for shard 196 : 35.514783ms
reading + writing time: 17.014156ms
encoding time for shard 197 : 42.856191ms
reading + writing time: 12.128048ms
encoding time for shard 198 : 47.070398ms
reading + writing time: 7.328568ms
encoding time for shard 199 : 39.252697ms
reading + writing time: 8.452414ms
encoding time for shard 200 : 42.781552ms
all done!
time taken to write out parity files: 133.70886ms
total time spent on encoding: 8.088607059s

real    0m11.404s
user    0m7.337s
sys     0m2.276s
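
One limitation of the code above is that it panics unless the file size is an exact multiple of the data shard count. The usual fix is to round the shard size up and zero-pad the final shard (this is conceptually what the library's own Split helper does). The `shardSize` function below is a hypothetical helper for illustration, not part of the reedsolomon API:

```go
package main

import "fmt"

// shardSize returns the per-shard size needed to split a file of
// filesize bytes into dataShards pieces, rounding up so the final
// shard can be zero-padded instead of requiring exact divisibility.
func shardSize(filesize, dataShards int64) int64 {
	return (filesize + dataShards - 1) / dataShards
}

func main() {
	fmt.Println(shardSize(1073741800, 200)) // 5368709 (divides exactly)
	fmt.Println(shardSize(1073741801, 200)) // 5368710 (last shard padded)
}
```

If you go this route, you also need to record the original file size somewhere (as the simple-encoder comments suggest) so the decoder can strip the padding.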
