by driverdude » Tue Jul 03, 2012 6:55 am
Though I wrote tools to do Par2 and RAR for my own needs, and distrust the output of ANY Usenet program that seems to rely too heavily on yEnc and covers up issues at times, I do have a knee-jerk opinion on Par3 after looking at its spec very briefly:
PAR3 IS A PILE OF CRAP!
It's faster in theory (huge files) because it's DEFECTIVE and dangerous. It was not thought out well.
A couple of reasons the proposed Par3 is worse than the GENUINE and only legitimate Par2 (my thoughts):
1> You can safely scan a hard drive of millions of files and sort them into Par2 sets first by the MD5 checksum of each file's first 16K (billions multiplied by billions of times more unique than the crappy 32-bit CRC-32 from the 1980s), because the initial MD5 fingerprint in Par2 is 128 bits, not 32 bits like in crappy Par3. This allows you to distrust the filename in the OS (filenames are typically mangled if Japanese or Russian, and sometimes French gets messed up too, like an accented "Sènégal.mp3"), and if a file is slightly damaged, the Par2 set will still match up because you can ignore the EXPECTED file length too. Only crappy standards such as SFV use CRC-32. Another problem with the CRC-32 standard is that it allows two files containing a data stream of pure nulls to yield the SAME CRC-32 even if one file has a thousand or a million more null characters in the stream than the smaller file... because, for the sake of speed, the CRC-32 standard does not salt or force programmers to add some 111111's or invert the initial seed buffer. Admittedly, and humorously, many CRC-32 libraries invert the initial buffer so that a stream of 0's will be unique, but that has been a pet peeve of mine for decades, even though I always initialized and seeded with 1's. This Par3 spec specifies CRC-32-IEEE802.3, which by law means not to EOR (exclusive OR) or invert, but rather to properly seed with 32 1's, and "unreflected". At least THAT part is not dumb, and XORing with 0xffffffff constantly in EVERY call, in case programmers were newbies, is avoided for ultimate CRC-32 speed.
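To make the all-nulls complaint concrete, here is a minimal sketch of a bit-at-a-time CRC-32 in which the seed and the final XOR are parameters. It uses the common reflected (zlib-style) form of the polynomial, which is an assumption for illustration and not necessarily the exact variant the Par3 proposal specifies; the name crc32_bitwise is made up for this example.

```python
import zlib


def crc32_bitwise(data: bytes, init: int, xor_out: int = 0) -> int:
    """Reflected CRC-32 (polynomial 0xEDB88320) with a configurable seed.

    Hypothetical helper for illustration; not taken from any Par2/Par3 code.
    """
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit is set.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ xor_out


# Sanity check: with an all-ones seed and final XOR this matches zlib.
assert crc32_bitwise(b"123456789", 0xFFFFFFFF, 0xFFFFFFFF) == zlib.crc32(b"123456789")

short_nulls = bytes(1000)  # 1000 null bytes
long_nulls = bytes(2000)   # 2000 null bytes

# Zero seed: the register never leaves 0, so the length is invisible.
zero_seed = (crc32_bitwise(short_nulls, 0), crc32_bitwise(long_nulls, 0))

# All-ones seed: the two lengths give different checksums.
ones_seed = (
    crc32_bitwise(short_nulls, 0xFFFFFFFF),
    crc32_bitwise(long_nulls, 0xFFFFFFFF),
)

print("zero seed:", zero_seed)
print("ones seed:", ones_seed)
```

With a zero seed both runs of nulls come out as 0, which is exactly the weakness described above; seeding with 1's is what makes the length of a null run matter.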
2> Par2 HAS to have a universal ASCII approximation of the file name, MANDATORY and unique per set, plus OPTIONAL full-blown UTF-16 Unicode support. The crappy Par3 proposal has only UTF-8, though both UTF-8 and UTF-16 can of course encode everything UTF-32 can. UTF-8 is risky because most UTF-8 libraries do not do filename matches on decomposed tokens (accents separate from the letter e, for example, and various stroke marks separate from the root Japanese character). Mac OS X, out of spite, decomposes Unicode a lot just to keep programmers on their toes. I wrote my own decomposition library tools years ago for string compares, but the BEST approach is to use an ASCII FILENAME APPROXIMATION and force programmers to use it, and that is what Par2 does as one of TWO simultaneous filenames. Par3 dangerously insists solely on buggy UTF-8 because it is a spec written by a Japanese guy angry at having to deal with ROMAN usage of words in the programming world. Worse... because he does not understand the dangers of sorting decomposed Unicode, he actually insists that file numbers be assigned based on file name SORT order and allows UNICODE to participate. This means running a reference test benchmark on two OSes (Mac OS X vs. Windows 7) could produce two different sets of Par3 results, which would have been less likely if files were not sorted based on the RAW UNICODE BIT STREAM!!! (Unicode allows OPTIONAL BOM markers at the start and in the middle of streams; UTF-8 has no need of BOM markers, but it has many other pesky things.) U+00E9 is equal to U+0065 followed by U+0301, but sorting in THIS crappy spec would produce a missort without knowing which way an OS decided to decompose or not. It's not a Mac problem; in this case it's a LINUX issue. Mac has been always 100% pure decomposed Unicode for 10 years, but Linux is arbitrary and almost random.
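The missort is easy to reproduce. The sketch below is only an illustration using Python's standard unicodedata module: it sorts the same two file names by raw UTF-8 bytes, once in precomposed (NFC) form and once in decomposed (NFD) form. The file names and the helper names raw_sort and normalized_sort are made up for the example, and nothing here is taken from the Par3 proposal itself.

```python
import unicodedata


def raw_sort(names):
    """Sort by the raw UTF-8 byte stream, with no normalization."""
    return sorted(names, key=lambda name: name.encode("utf-8"))


def normalized_sort(names):
    """Sort by UTF-8 bytes after forcing every name into NFC first."""
    return sorted(
        names,
        key=lambda name: unicodedata.normalize("NFC", name).encode("utf-8"),
    )


# The same two files as one OS might report them (precomposed accents)...
nfc_names = ["S\u00e8n\u00e9gal.mp3", "Sz.mp3"]
# ...and as another OS might report them (base letter + combining accent).
nfd_names = [unicodedata.normalize("NFD", name) for name in nfc_names]

# Raw byte order puts "Sz.mp3" at a different index in the two forms.
raw_index_nfc = raw_sort(nfc_names).index("Sz.mp3")
raw_index_nfd = raw_sort(nfd_names).index("Sz.mp3")

# Normalizing before sorting gives the same index either way.
norm_index_nfc = normalized_sort(nfc_names).index("Sz.mp3")
norm_index_nfd = normalized_sort(nfd_names).index("Sz.mp3")

print("raw sort index of Sz.mp3:", raw_index_nfc, "vs", raw_index_nfd)
print("normalized sort index of Sz.mp3:", norm_index_nfc, "vs", norm_index_nfd)
```

If file numbers follow the raw sort order, the two machines number the same files differently; picking one normalization form before sorting (or sorting on an ASCII approximation) removes the dependence on how the OS stores names.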
3> The maximum file size storable in the Par2 spec is TWICE that of Par3, hardly a step forward. Yup. Par2 uses UNSIGNED 64-bit lengths for all vital fields such as file lengths.
4> The FFT in Par3 takes grave shortcuts that hurt people archiving Blu-ray rips of public-domain Blu-ray copies of their own wedding videos. Long explanation: FFTs of power-of-2 sizes can be done with GPUs and GPU libraries (free in Mac OS via Apple for NVIDIA and AMD (ATI), with an OpenCL fallback instead of the GPU if needed: http://en.wikipedia.org/wiki/OpenCL), and free on ARM in the iPad and iPhone (OpenCL + vDSP API, among others), and Intel Corp. of course has CPU-assisted vector code for FFT on Windows, plus there are 3rd-party FFTs on NVIDIA and AMD. Also, in Windows you can make a joke out of traditional GPU or vector CPU FFT assists and get 106 GIGAFLOPS with NVIDIA CUFFT: http://www.bealto.com/gpu-fft_intro.html . FFT is used in Par3 and Par2 because of POSSIBLE hardware acceleration, even if no programmers do it yet in most Par2 implementations, but the sad part is that Par3 takes more shortcuts, sacrificing future options for assisting the FFT. One laziness example: Par2 allows 65534 repair slices of data, so that if you want to, you could take a 45 gigabyte file, break it up conceptually, and if 25% pars are desired, then 65536 108K (108 kiloBYTE sized) repair parity records can be provided in Par2 to go with the precious problematic 45 gigabyte source data (also broken up conceptually, usually in 32K pieces). In Par3 you are only allowed HALF the number of repair slices, another step backwards!
5> In Par2 a recovery set ID is critical for file pairings and is a 128-bit MD5. In Par3 it is some sort of perverted joke that it is a paltry 32-bit CRC-32! Not kidding! Par3 files are NEVER supposed to be mixed with other Par3 files in the same directory on a computer! Ha!
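For a rough sense of scale, the usual birthday approximation can be sketched as below. It assumes set IDs behave like uniformly random values, which is an idealization, and it takes the 32-bit and 128-bit ID sizes from the comparison above at face value; the function name collision_probability is made up for the example.

```python
import math


def collision_probability(num_sets: int, id_bits: int) -> float:
    """Approximate chance that at least two of num_sets random IDs collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** id_bits
    exponent = -num_sets * (num_sets - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


for bits in (32, 128):
    for num_sets in (1_000, 77_163, 1_000_000):
        p = collision_probability(num_sets, bits)
        print(f"{bits:3d}-bit IDs, {num_sets:>9,} sets: p = {p:.3e}")
```

Under these assumptions a 32-bit ID reaches roughly even odds of a clash at around 77,000 sets, while a 128-bit ID stays negligible even for a million sets.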
6> In Par2 EVERY packet, every record, nearly every field is protected for safety by many 128-bit MD5 checksums; in Par3 it is billions times billions of times less secure! (Hilarious CRC-64.) Yes, I always check every one in my Par2 code.
7> In Par2 a misnamed Par2 header file with no file extension, such as mywedding_par2 instead of mywedding.par2, can be identified rapidly by OS tools thanks to a special 16-byte header at the top of the file. Par3 arrogantly has NO VERSION number of ANY KIND in the file, even though Par2 has a 2.0 in it that could have been changed, and Par3 has... I'm not joking... only 6 bytes at the front of the file as an identifier. 6 bytes that a person could accidentally type in a text file header in a message to a buddy, because historically, though NULL is rare in a text file, it is not illegal and is a "NOP" on line printers and teletypewriters. Par2 planned on safe 16-byte identifiers, not insane 6-byte file headers, for the critical main par file. (True, both allow multiple copies in multiple files, but what a pain.)
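A minimal sketch of that kind of extension-independent check follows. It looks only at the 8-byte magic sequence PAR2\0PKT that, as far as I recall from the Par2 2.0 spec, begins every Par2 packet; it does not parse or verify the rest of the header, and the function names are made up for the example.

```python
# Magic sequence at the start of a Par2 packet (assumed from the Par2 2.0 spec).
PAR2_MAGIC = b"PAR2\x00PKT"


def has_par2_magic(head: bytes) -> bool:
    """Return True if the given leading bytes start with the Par2 magic."""
    return head.startswith(PAR2_MAGIC)


def looks_like_par2(path: str) -> bool:
    """Check a file by content, ignoring its name and extension."""
    with open(path, "rb") as handle:
        return has_par2_magic(handle.read(len(PAR2_MAGIC)))


if __name__ == "__main__":
    import sys

    # Usage: python check_par2.py mywedding_par2 some_other_file ...
    for name in sys.argv[1:]:
        verdict = "looks like Par2" if looks_like_par2(name) else "not Par2"
        print(f"{name}: {verdict}")
```

The longer the fixed identifier, the less likely ordinary text matches it by accident, which is the whole argument against a 6-byte identifier.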
Half here, half there, half half half! Everything about Par3 is cutting Par2 quality in half for THEORETICAL speed gains at the expense of critical accuracy and quality. True... BOTH allow a full-file MD5 of files with lengths up to 63 bits (Par2 allows double)... and that's good enough for ULTIMATE final safety. But too many intermediate steps are skimped on by this half-baked 2001 standard of Par3.
I admit I see a FEW advantages in Par3 that could have been added to Par2, but the LIES and laughable clutching at straws in the spec are stomach-churning. For example, in cryptography you have not relied on MD5 alone for digital signatures in the last 5 years, because tools exist to create a SPECIAL data string with special padding at the end of TWO file streams (not ONE), specifically to make two files have the same MD5, just as a funny proof of concept, so long as you can control the padding at the end of the data stream... This author uses that to beat his chest and claim his spec is slightly BETTER than Par2 because he offers a ridiculous 1980s CRC-32 checksum in addition to the full-stream MD5 that both Par2 and Par3 offer. He types this claim as "it may be more difficult to create a fake file with same MD5 and checksums". Hah. The effort to pad the data stream with output from a CRC-32 spoofer and an MD5 spoofer is less trivial and therefore "harder", but hardly cryptographically secure! It's almost the same endeavor overall.
I apologize for all the typos... And I am NOT a Par2 authority, only a one-time implementor of sorts. But I am not wrong. It's just a fact that this 2001-originated Par3 spec is NOT AN IMPROVEMENT over Par2 and should be IGNORED and killed off.
I beg the developers of Newsbin to resist supporting Par3 just to handle the error DETECTION aspect of it (it is easy to support it for error detection; in fact it is slightly backward compatible for that), and instead to 100% ban this abomination. Newsbin should also warn people to stay away from adopting crappy so-called Par3.
Par3 is a crappy abomination.
Par3 is a travesty and a pack of dangerous lies. I am not a luddite, I do not resist change or technology; it's just that Par3 is a crappy spec compared to Par2.
=driverDude