I recently wrote a small Turbo Pascal utility that I would like to make available to the Epi Info community.
Importing really large text files into Epi Info REC format can be a very frustrating experience. Let's say you are trying to write an accurate QES file to prepare the import of a 15 MB text file. You've received a record description of the text file and have carefully copied the variable types, lengths and positions into your QES file. You use ENTER.EXE to create an empty REC file and then start up IMPORT.EXE to read the text file into REC format.
And it blows up. The record length is wrong someplace in those 15 megabytes. Where? Why? The file might be downloaded from a mainframe or exported from another statistics package and something went awry during the process. Is the file corrupted? How are you going to document this in such a large file? A single deviant record is enough to throw the entire REC file into chaos when you try to analyse with TABLES or FREQ.
Even if you manage to load the entire file into Norton Editor or something similar, your troubles are far from over. Maybe your record description is inaccurate. How are you going to convince anybody that this is the case?
If this means anything to you, you've probably been in this situation. I sure have. So I recently wrote IMPTEST.EXE.
IMPTEST.EXE is a Turbo Pascal EXE file that takes a single argument when executed from the DOS command line: the name of a text file you want to test. For example:
IMPTEST MYPAIN.TXT
IMPTEST.EXE's output is an Epi Info REC file called IMPTEST.REC. IMPTEST.REC contains a single Integer 4 variable called ROWWIDTH and as many records as there are rows in your input text file. ROWWIDTH simply holds the number of bytes in each row of the input text file. You can load IMPTEST.REC with READ in ANALYSIS and run FREQ ROWWIDTH for a revealing look at the structure of your text file. If there is a single deviant record (or loads of them), you can easily locate them by checking the internal variable RECNUMBER.
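For readers who want to see the idea without Epi Info at hand, here is a rough Python sketch of the same check. It is not the IMPTEST source and it does not write a REC file; it only counts the bytes in each row, tabulates the widths much as FREQ ROWWIDTH would, and lists the rows that differ from the most common width. The function names row_widths and deviant_rows are made up for this illustration, and the choice to leave the CR/LF line terminator out of the count is an assumption (IMPTEST may count differently).

```python
import sys
from collections import Counter


def row_widths(path):
    """Return the width in bytes of every row in the file.

    Assumption: the line terminator (CR and/or LF) is not counted.
    The file is read in binary mode so that bytes, not characters, are counted.
    """
    widths = []
    with open(path, "rb") as f:
        for line in f:
            widths.append(len(line.rstrip(b"\r\n")))
    return widths


def deviant_rows(widths):
    """Return (most_common_width, [(record_number, width), ...]).

    Record numbers start at 1, in the spirit of RECNUMBER.
    Rows with the most common width are treated as "normal".
    """
    if not widths:
        return None, []
    expected = Counter(widths).most_common(1)[0][0]
    bad = [(n, w) for n, w in enumerate(widths, start=1) if w != expected]
    return expected, bad


if __name__ == "__main__" and len(sys.argv) == 2:
    widths = row_widths(sys.argv[1])

    # Frequency table of row widths, narrowest first.
    print("ROWWIDTH     Freq")
    for width, count in sorted(Counter(widths).items()):
        print(f"{width:8d} {count:8d}")

    # Rows whose width differs from the most common one.
    expected, bad = deviant_rows(widths)
    for recno, width in bad:
        print(f"record {recno}: {width} bytes (most rows have {expected})")
```

Saved under a hypothetical name such as rowwidth.py, it would be run as "python rowwidth.py MYPAIN.TXT". A clean fixed-width file should give a frequency table with a single line; anything more means at least one row has the wrong length.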
IMPTEST.EXE can handle record lengths up to 2500 bytes.
You are welcome to download and use IMPTEST.EXE with the usual understanding that you do so entirely at your own risk. If anyone is interested in the source code, please let me know by email.