The official XCOMP/2 home page

What is XCOMP/2

Welcome to the home of XCOMP/2, a recursive file compare utility. XCOMP/2 is for comparing files what XCOPY is for copying files, something I was definitely missing in the OS/2 product. A port for WIN32 is also available (for those who are sometimes forced to use Windows, but you need OS/2 to get access to that port!).

You may also want to take a look at the ISOCOMP/2 tool, which lets you compare a CD-ROM's contents with the ISO image (or RSJ track image) file it has been burnt from, or create an ISO image out of a CD-ROM.

When invoking XCOMP.EXE without commandline arguments, the help is displayed:

XCOMP/2 - The recursive file compare utility for OS/2, V3.10
          (C) Roman Stangl 05, 2002 (Roman_Stangl@at.ibm.com)
          http://geocities.datacellar.net/SiliconValley/Pines/7885/

Use the XCOMP command to selectively compare groups of files located in two directories, including all subdirectories.

Syntax:
  XCOMP [drive:\|\\server\][path\]filename(s) [[drive:\|\\server\]path] [/!MP]
        [/LOG[:[drive:\|\\server\][path\]file]] [/!C] [/!F] [/!S] [/!B] [/P]
        [/TINY] [/LINE] [/SIMULATE] [/CHK:[drive:\|\\server\][path\]file]

Where:
  [drive:\|\\server\][path\]filename(s)
      Specifies the location and name of the Source file(s) to compare from. You may specify a fully qualified path or UNC path.
  [[drive:\|\\server\]path]
      Specifies the location of the Target path to compare with. You may specify a fully qualified path or UNC path.
  [/MP]
      Specifies that 1 thread reads the source and 1 thread reads the target file. This improves througput when comparing from 2 different physical drives (e.g. CD-ROM and Harddisk).
  [/LOG[:[drive:\|\\server\][path\]file]]
      Specifies that XCOMP/2 logs all problems into a file specified either by this parameter, or by the XCOMP environment variable or into XCOMP.LOG (put into the directory XCOMP/2 was installed into) otherwise.
  [/!C]
      By default, XCOMP/2 pauses at all mismatches. Specifying this option allows XCOMP/2 just display the mismatch and continue the comparison without a pause (e.g. useful when using the /LOG option or output redirection).
  [/!F]
      By default, XCOMP/2 pauses for files in the source location that can't be found at the target location. Specifying this this option allows XCOMP/2 just display the miss and continue the comparison without a pause (e.g. useful when using the /LOG option or output redirection).
  [/!S]
      By default, XCOMP/2 recurses into all subdirectories it finds, specifying this option prevents XCOMP/2 doing that.
  [/!B]
      By default, XCOMP/2 will beep, when a severe error occurs accessing a file at the source or target location. Specifying this option will silence XCOMP/2 (e.g. useful when using the /LOG option or output redirection).
  [/P]
      Request XCOMP/2 to pause when it has finished.
  [/TINY]
      2 64kB buffers are used instead of a percentage of total RAM.
  [/LINE]
      Display line number information for messages (useful for e.g. debugging)
  [/SIMULATE]
      Does not compare the files (useful for e.g. just to list what files would be compared by checking their existance)
  [/CHK:[drive:\|\\server\][path\]file]
      Specifies that XCOMP/2 uses a checksum file to ensure data integrity. If the checksum file does not exist, it will be created, otherwise compared with the checksum calculated from the data read from the Source. When using the extension ".MD5" is used, the checksum file will be compatible to the MD5SUM utility. You may need the option /!S additionally, as MD5SUM ignores subdirectories.

Returns:
  0     Successful completion
  1     Files could not be opened to compare (possibly 0-length, locked or not existing)
  2     Directories could not be opened to search for files (possibly access right or file system problems)
  3     Directories could not be opened to search for directories possibly access right or file system problems)
  4     A mismatch between at least 1 file was detected
  5     A mismatch between the calculated Checksum and the recorded one in the Checksum file of at least 1 file was detected
  100+  Fatal, unrecoverable exceptions

XCOMP101: Too few commandline arguments specified.

Well, the above explanation should be self-explanatory. There are basically 2 variations for the commandline:

A sample run may output:

[0: H:\programming\xcomp]xcomp p:\notes\archivepsk* h:\notes\data\archives
Comparing files qualified by archivepsk* at Source path P:\notes\ with Target path H:\notes\data\archives\ using 4194304 bytes buffer size
ArchivePSK1998.nsf
ArchivePSK1999.nsf
ArchivePSK2000.nsf
XCOMP001: Throughput Source 556kB/s, Destination 46247kB/s, Total 1064kB/s
XCOMP007: 3 file(s) compared successfully, 3 file(s)compared totally.

One note about the line containing the Throughput measurements. In that example I was comparing some Lotus Notes databases on a backup CD-RW with the original on a harddisk. You can see that the Source drive, a CD-ROM drive, delivers a rather low throughput (it seems that this 8-16 speed CD-ROM drive isn't too comfortable reading CD-RWs), while the Target drive, an Ultra-Wide SCSI drive, delivers excellent performance (ok, actually it's the HPFS386 cache ;-), giving an acceptable Total performance.

The performance given for Source and Destination is just the raw performance of the OS/2 DosRead() API without accounting for overhead like searching for the files and the comparison itself. The Total performance does include all overhead, that is, the timer is started when the first file is searched for and stopped when all files have been compared. Thus, don't take the performance figures too seriously!

If the Source and Destination drives are different physical drives, you may reduce the run-time by specifying the option /MP. Using that option causes 2 threads to be used for reading the Source and Destination files simultaneously instead of doing that sequentially with a single thread.
This option is most useful when reading from 2 drives of similar speed, e.g. a CD-ROM and a CD-RW drive (because the 2 slow accesses will be done in parallel instead of sequentially). It's less useful for drives with a great difference in speed (because reading the fast drive takes almost no time compared to reading the slow drive, thus the slow drive alone determines the overall performance).
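The idea behind /MP can be illustrated with a small Python sketch. This is not XCOMP/2's actual code (which is written in C++); the function names and the 64kB default buffer size are assumptions made for the illustration. Each pair of chunks is read by 2 threads at the same time, then compared:

```python
import threading

def read_chunk(f, size, out, slot):
    """Read up to `size` bytes from file object `f` and store them in out[slot]."""
    out[slot] = f.read(size)

def files_equal_parallel(source_path, target_path, buffer_size=64 * 1024):
    """Compare two files chunk by chunk, reading Source and Target in parallel."""
    with open(source_path, "rb") as src, open(target_path, "rb") as dst:
        while True:
            chunks = [None, None]
            t_src = threading.Thread(target=read_chunk, args=(src, buffer_size, chunks, 0))
            t_dst = threading.Thread(target=read_chunk, args=(dst, buffer_size, chunks, 1))
            t_src.start()
            t_dst.start()
            t_src.join()
            t_dst.join()
            if chunks[0] != chunks[1]:
                return False          # content or length differs
            if not chunks[0]:
                return True           # both files reached end-of-file together
```

Starting 2 threads per chunk keeps the sketch short; a real implementation would more likely keep 2 long-lived reader threads and hand buffers back and forth.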

Use the option /!C if you do not want the comparison to stop and wait for the user to press a key at every miscompare. This option is probably only useful when you use the /LOG option or redirect the output into a file, as otherwise the lines stating the miscompare scroll away too fast.

Use the option /!F if you do not want the comparison to stop whenever a file in the source location can not be found in the target location. This option is probably only useful when you use the /LOG option or redirect the output into a file, as otherwise the lines stating the missing files scroll away too fast.
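For unattended runs, the options /!C, /!F and /LOG combine naturally with the return codes listed in the help text. The following Python sketch shows one way a wrapper script might do that. The helper names are hypothetical, and it assumes XCOMP.EXE can be found on the PATH; only the options and return codes themselves come from the help output above.

```python
import subprocess

# Return codes as documented in the XCOMP/2 help text.
RETURN_CODES = {
    0: "Successful completion",
    1: "Files could not be opened to compare",
    2: "Directories could not be opened to search for files",
    3: "Directories could not be opened to search for directories",
    4: "A mismatch between at least 1 file was detected",
    5: "A Checksum mismatch of at least 1 file was detected",
}

def describe_return_code(code):
    """Translate an XCOMP/2 return code into a readable message."""
    if code >= 100:
        return "Fatal, unrecoverable exception"
    return RETURN_CODES.get(code, "Unknown return code")

def build_command(source, target, log_file=None):
    """Build an argument list for a run that never pauses and logs problems."""
    args = ["XCOMP", source, target, "/!C", "/!F"]
    args.append("/LOG:" + log_file if log_file else "/LOG")
    return args

def run_unattended(source, target, log_file=None):
    """Run XCOMP/2 (assumed to be on the PATH) and return (code, message)."""
    result = subprocess.run(build_command(source, target, log_file))
    return result.returncode, describe_return_code(result.returncode)
```

A call like run_unattended("S:\\Orig\\*", "T:\\Copy") would then return the code together with its meaning, which a batch job can log or act upon.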

You may also notice that you get a different buffer size. This is normal, because XCOMP/2 takes the amount of physical memory into account to be efficient on the one hand and to limit swapping on the other hand. With the option /TINY, XCOMP/2 will use 2 64kB buffers instead of a percentage of the RAM installed. This might be useful if you are running a memory-constrained system.

Finally, to ensure that you can notice the progress, a percentage indicator will be displayed while comparing files that are larger than the given buffer size.

If you just want to see if all files exist without actually comparing them (thus saving much time at the price of possibly overlooking corrupt data), then invoke XCOMP/2 with the /SIMULATE option.
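What such an existence-only check amounts to can be sketched in a few lines of Python. This is an illustration only, not how XCOMP/2 is implemented; the function name and the `recurse` parameter (mimicking the effect of /!S when set to False) are assumptions.

```python
import os

def missing_at_target(source_dir, target_dir, recurse=True):
    """List Source files (relative paths) that do not exist below target_dir."""
    missing = []
    for root, dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        for name in files:
            candidate = os.path.normpath(os.path.join(target_dir, rel, name))
            if not os.path.isfile(candidate):
                missing.append(os.path.normpath(os.path.join(rel, name)))
        if not recurse:
            dirs.clear()  # emptying dirs stops os.walk from descending further
    return sorted(missing)
```

No file contents are read, which is why such a check is fast but can not notice corrupt data.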

With the /CHK option you can specify the fully qualified path of the Checksum file containing the CRC32 and MD5 checksums for all Source files found. If the Checksum file does not exist already, it will be created, otherwise the Checksums recorded earlier within that file will be compared to what has been calculated for the Source files just now. The option /!C also affects problems detected for the /CHK option.
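To give an impression of what calculating both checksums in one pass over a file involves, here is a minimal Python sketch. It uses Python's zlib.crc32 and hashlib.md5; whether XCOMP/2's CRC32 uses exactly the same CRC variant as zlib is an assumption, and the function name and buffer size are made up for the example.

```python
import hashlib
import zlib

def file_checksums(path, buffer_size=64 * 1024):
    """Return (CRC32, MD5) of a file as uppercase hex strings, reading it once."""
    crc = 0
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read the file in buffer-sized chunks until read() returns b"".
        for chunk in iter(lambda: f.read(buffer_size), b""):
            crc = zlib.crc32(chunk, crc)   # continue the running CRC
            md5.update(chunk)
    return format(crc & 0xFFFFFFFF, "08X"), md5.hexdigest().upper()
```

Reading in chunks keeps the memory usage independent of the file size, which matters for files as large as a CD-ROM's contents.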

Warning! It is expected that the files in the Checksum file do correspond to the qualified source files, that is, XCOMP/2 will not try to synchronize if one or more files have been added or deleted! I thus recommend using the Checksum file only against files that are static. For example, you use XCOMP/2 to calculate the Checksum file and then burn the Source files and the Checksum file onto a CD-ROM; later on you can verify that the Source files on the CD-ROM are still valid by running XCOMP/2 against them and the Checksum file. You would no longer need a second copy of the files to know if they are still valid. However, if XCOMP/2's Checksum calculation tells you that the Checksums no longer match, you just know the files got corrupted, and you may still need to obtain an uncorrupted copy of the files!

When comparing using CD, CD-R or CD-RW media, you may get alarmed by miscompares which disappear when running the same comparison (or one limited to a subdirectory only) again. I'm not sure what causes that, but I suspect that misreads of the drive (e.g. weak bits) occurred. Reading the same files again may just cause different read results due to the drive's head repositioning in a different way.

Examples

As of V2.00 the interpretation of the commandline arguments regarding the source and target paths became more intelligent. Here are some examples of the usage (we are located at S:\Orig and have XCOPY'd all files from there into T:\Copy):

Command: XCOMP S:\Orig\* T:\Copy
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
The most trivial example using 2 fully qualified paths. The qualified filename "*" could of course be replaced with what one actually wants to compare, e.g. "*.ba*" or "????.ba?"

Command: XCOMP S:\Orig\ T:\Copy
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
If you do not qualify the filenames to compare for the source, by default "*" is assumed.

Command: XCOMP * T:\Copy
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
The source "*" is expanded to a fully qualified path by querying the current directory.

Command: XCOMP . T:
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
If you specify a relative path like ".", it is expanded to a fully qualified path.
As no qualified filename is specified either, "*" is assumed. Using ".*" instead would have avoided that assumption.
As no directory is specified for the target path, the current directory on that drive is assumed. Having specified "T:." would have led to the same result.

Command: XCOMP S: T:
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
As you didn't specify a path, the current directories "S:\Orig" and "T:\Copy" will be taken automatically.
As no qualified filename is specified either, "*" is assumed.

Command: XCOMP ..\Orig\ T:..\Copy
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path S:\Orig\ with Target path T:\Copy\ using 2097152 bytes buffer size
The relative path "..\Orig\" is expanded to "S:\Orig\" and as no qualified filenames are specified "*" is assumed.
The same applies to "T:..\Copy" which gets expanded to "T:\Copy\*".

Command: XCOMP \\pka1002s\Orig\ \\pka1002d\Copy
Resulting comparison from XCOMP/2: Comparing files qualified by * at Source path \\pka1002s\Orig\ with Target path \\pka1002d\Copy\ using 2097152 bytes buffer size
For UNC-Names, things are treated the same way, e.g. as no qualified filename is available here, "*" is assumed.

Command: XCOMP S:\* /Chk:S:\ChkSum.Log /Log
Resulting comparison from XCOMP/2: Calculating CRC32 and MD5 checksums qualified by * at Source path S:\ writing Checksum file into S:\CHKSUM.LOG using 8388608 bytes buffer size and logging into H:\PROGRAMMING\XCOMP\XCOMP.Log
As the Checksum file did not exist previously, it will be created (containing the CRC32 and MD5 checksums and the fully qualified filename of each qualified file).

Ideally, you would run XCOMP/2 that way before saving all files on S: and the newly created Checksum file onto a backup medium (e.g. burning them onto a CD-ROM).

If, after e.g. having burnt the files onto a CD-ROM in drive Z: at Z:\Backup, you would run XCOMP Z:\Backup\* /Chk:Z:\ChkSum.Log /Log once again, XCOMP/2 would now compare the Checksum file with the Checksums calculated from the files on the CD-ROM:

Calculating CRC32 and MD5 checksums qualified by * at Source path Z:\Backup\ reading Checksum file into Z:\CHKSUM.LOG using 8388608 bytes buffer size and logging into H:\PROGRAMMING\XCOMP\XCOMP.Log

In case you have burnt the data at S:\ into the root of the CD-ROM at Z:\ and had run XCOMP Z:\* /Chk:Z:\ChkSum.Log /Log, XCOMP/2 notices that the Checksum file Z:\ChkSum.log is within the path Z:\ of your backup.
As it doesn't make sense to include the Checksum file itself in the Checksum calculation, XCOMP/2 will skip that file, writing: XCOMP143: Skipping specified checksum file

If you find any inconsistency, or have a suggestion, please tell me!

Verifying burned CD-Rs and CD-RWs

As said, my primary motivation for writing XCOMP/2 was to compare what has been burned onto a CD-R or CD-RW with the original data, as I do not trust write operations that have no explicit verification. For example, even if it takes some time, I do verify the data when copying something to a floppy disk.

And from my personal experience I'm right to be sceptical here. I have already encountered a few CD-Rs and CD-RWs that contained corrupted data. I have seen effects that can best be described as weak bits, which used to be a copy-protection mechanism for games on floppy disks. Weak bits are bits that are sometimes read incorrectly and sometimes correctly from a medium, and that rate may even differ between CD-ROM drives.

The technical reasons seem to be (and looking around in the Internet confirms that):

Considering all that, I'm quite sure that one should spend the short time required to verify what has been put onto a CD-R or CD-RW. For important data that is not easily reproducible I even recommend burning at least 2 copies.

Handling ISO image files (and RSJ track image files)

XCOMP/2 allows you to do a comparison at the file level, that is, you need one or more source files that will be compared to one or more target files. Sometimes, however, you don't have the source files that have been burnt onto a CD-ROM, but only the corresponding ISO image (or RSJ track image).

An ISO image is just the datastream, in a single file, that makes up the contents of a CD-ROM session from the beginning to the end. Again, you want to be sure that the contents of the ISO image file you have created a CD-ROM from exactly equal what has been written onto the CD-ROM.

There's a solution for that problem too, just use the ISOCOMP/2 tool, which lets you compare a CD-ROM's contents with the ISO image (or RSJ track image) file it has been burnt with.

What's new

The following items have changed between V3.10 and its predecessor 3.00:

The following items have changed between V3.00 and its predecessor 2.50:

The following items have changed between V2.50 and its predecessor 2.40:

The following items have changed between V2.40 and its predecessor 2.30:

The following items have changed between V2.30 and its predecessor 2.20:

The following items have changed between V2.20 and its predecessor 2.10:

The following items have changed between V2.10 and its predecessor 2.01:

The following items have changed between V2.01 and its predecessor 2.00:

The following items have changed between V2.00 and its predecessor 1.50:

CRC32 and MD5 checksum logic

The calculation of CRC32 and MD5 checksums is based on what seems to be the standard algorithm used on the Internet. As I'm certainly not an expert in cryptology, and thus can't tell how good the quality of the CRC32 and MD5 checksums is (AFAIK it's rather good), I used the sample code I found on the Internet after a little searching as a starting point.

The CRC32 implementation is derived from some freely available sample code; the MD5 implementation is derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm sample implementation, which is also freely available if the above identification is included on materials based on it.

The Checksum file in XCOMP/2 format looks similar to:

CRC32: 36C8980A MD5: B924D691FA3C927737915B67386A357C Path: \programming\xcomp\FILE_ID.DIZ
CRC32: ED340765 MD5: EBA0C6FE8AA2540E7AC65013C93F498A Path: \programming\xcomp\IsoComp.Cpp
CRC32: C18BAF14 MD5: DB98BC97CB5A7B3195F0839BDDB00DC7 Path: \programming\xcomp\IsoComp.def
CRC32: 78349322 MD5: 1B925D121F36090F4A1A44608512AE6C Path: \programming\xcomp\IsoComp.exe
CRC32: 5AB17915 MD5: BA4AFCBE90B519BFB42E7D73583BE482 Path: \programming\xcomp\IsoComp.Hpp
CRC32: 26186B92 MD5: 966F23969EA14AC19EE924D45DD0CB88 Path: \programming\xcomp\IsoComp.html
...
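If you want to process such a Checksum file with your own scripts, an entry can be picked apart with a regular expression. The following Python sketch assumes that each entry sits on its own line with the 3 fields in the order shown above; the exact layout rules of the file are not documented here, so treat the pattern as a guess derived from the sample.

```python
import re

# Pattern derived from the sample entries above (an assumption, not a spec).
ENTRY = re.compile(
    r"CRC32:\s*([0-9A-Fa-f]{8})\s+MD5:\s*([0-9A-Fa-f]{32})\s+Path:\s*(.+)"
)

def parse_checksum_line(line):
    """Return a dict with crc32, md5 and path, or None if the line doesn't match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    crc32, md5, path = match.groups()
    return {"crc32": crc32.upper(), "md5": md5.upper(), "path": path.strip()}
```

Lines that do not look like an entry (e.g. the "..." above) simply yield None and can be skipped by the caller.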

The Checksum file in MD5SUM format looks similar to:

b924d691fa3c927737915b67386a357c *FILE_ID.DIZ
eba0c6fe8aa2540e7ac65013c93f498a *IsoComp.Cpp
db98bc97cb5a7b3195f0839bddb00dc7 *IsoComp.def
1b925d121f36090f4a1a44608512ae6c *IsoComp.exe
ba4afcbe90b519bfb42e7d73583be482 *IsoComp.Hpp
966f23969ea14ac19ee924d45dd0cb88 *IsoComp.html
...

For MD5SUM compatible Checksum files you may need to specify the option /!S additionally, as MD5SUM does not seem to be able to recurse into subdirectories (at least I couldn't figure it out).
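The MD5SUM layout (lowercase MD5 digest, a blank, an asterisk marking binary mode, then the bare filename) is simple enough to reproduce. The Python sketch below generates such lines for the files of a single directory without recursing, which corresponds to what /!S would give you. The function name is hypothetical, and for brevity each file is read into memory as a whole.

```python
import hashlib
import os

def md5sum_lines(directory):
    """Return MD5SUM-style lines for the files directly inside `directory`."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):          # subdirectories are ignored
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            lines.append(f"{digest} *{name}")
    return lines
```

Writing the returned lines into a file with the extension ".MD5" should give something the MD5SUM utility can check, although I have only described the format here, not tested every MD5SUM port.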

If you write the Checksum file into the path of the files for which you requested the comparison or Checksum calculation, you would modify the Checksum file in that path while XCOMP/2 has already started its Checksum calculations (and that modification would of course result in different Checksum file contents). In other words, you would pull the floor from under XCOMP/2's feet!
Thus XCOMP/2 tries to detect the Checksum file in the filesystem and ignores it for the Checksum processing.
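One simple way to implement such a detection is to compare normalized absolute paths while enumerating the files. The Python sketch below shows the idea; it is only an illustration with made-up names, and XCOMP/2 itself may well detect the Checksum file differently.

```python
import os

def files_to_checksum(source_dir, checksum_file):
    """Enumerate all files below source_dir, leaving out the Checksum file itself."""
    skip = os.path.normcase(os.path.abspath(checksum_file))
    selected = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.normcase(os.path.abspath(path)) == skip:
                continue                  # never checksum the Checksum file
            selected.append(path)
    return sorted(selected)
```

Comparing path strings does not catch a Checksum file that is reachable under a second name (e.g. through another drive letter or network share), so this check is only a first line of defence.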

XCOMP/2 has to handle these situations, and thus you have to observe:

XCOMP/2 download

You are welcome to download XCOMP/2 V3.10, which includes the OS/2 version and WIN32 port of XCOMP/2 and ISOCOMP/2 (including its source written with VisualAge C++), from this site. You should also be able to find it on sites connected with a higher bandwidth, e.g. Hobbes.

XCOMP/2 source code

XCOMP/2 includes the complete source code to recompile it. Just run Protect xcomp2v310 Source.zie to obtain the source code archive Source.zip out of the encrypted file Source.zie.

Note: You need access to an OS/2 PC to decrypt the encrypted source code archive, because I provide the decryption executable only for OS/2. These tools are for OS/2 users who are sometimes forced to use Windows, but not for users who uncritically help to extend the power of the evil empire's monopoly!

Then just unzip Source.zip, preferably at the path you want to compile XCOMP/2 from, to obtain the source code for both OS/2 and WIN32 and the WIN32 executables.

To compile XCOMP/2 for OS/2 you need:

To compile XCOMP/2 for WIN32 you need:


(C) Roman Stangl (Roman_Stangl@at.ibm.com), 18.07.2000
Last update: 22.05.2002