Multiple output files with ADB_BldOutLineFnc and FileMux.Exe
(If you're not interested in ADB, you can skip this section.)
(This is the most efficient way to create multiple output
files when there are a lot of records in each file.)
FileMux.Exe is a utility that I wrote for taking a single input
file and dividing the records into mutually exclusive and
mutually exhaustive subsets.
(In simple language, it splits the records in the file up into
groups :-)
FileMux.Exe can split records up based on the data in the input
records.
Let's see a quick example:
A This is record 1
B This is record 2
A This is record 3
Now suppose you run FileMux like so (constants are in capital
letters):
FileMux -INmyfile.inp -PREFout -EXT.dat -COL1 -LEN1
That says to split MyFile.Inp up into different parts, based on
the data in column 1. Two output files will be created:
"OutA.Dat" and "OutB.Dat". "OutA.Dat" will have records 1 and 3.
"OutB.Dat" will have record 2.
FileMux's default is to keep the field that you use to split
the records up (in this case, the field beginning at column 1,
which goes for 1 character).
If you want to remove that field, code the "-NOKEEP" argument.
(FileMux, like all the utilities supplied with ADB, prints its
command line arguments when you run it from the DOS prompt
with no arguments. Note also that FileMux's command line
parameter names are case-INsensitive, like all ADB's utilities.)
If you have a background in computer science (or in hardware),
you'll recognize the term "Mux" as short for "multiplexor". A
multiplexor is a hardware device that has one input and multiple
outputs. Based on a specific set of bits in an incoming word
(AKA a set of bytes), a multiplexor will "route" the rest of the
word to a specific output.
(If the above paragraph made no sense to you, just ignore it.)
How do you use ADB in tandem with FileMux?
The best way is to set up a function that will "prefix" output
records with "codes" that determine which output file the record
is going to.
Let's return to our example of customer invoices.
Here are the ADB definitions for that record format:
FLD Cust_No 4 # Customer # (treat as string)
FLD Invoice_Amt__ 8 2 # Invoice amount (fixed-decimal)
FLD Cust_State 2 # 2-char state code
FLD Err_Flag 1 # Error flag (0 for no error)
FSET Cust_Srt; Cust_No END # Sort on customer#
FSET Invoice_File # Invoice file
Cust_No; Invoice_Amt__;
Cust_State; Err_Flag END
As you recall, the records looked like this, with the exception
of the last one:
1111 10.00CA0
1111 15.00CA0
2222 25.00NY0
2222 10.00NY0
3333 20.00IN0
(This new last record is for a customer in IN=Indiana, and it
will turn out that Indiana isn't in our sales tax rate table.)
We're going to write the code which will split customers up, based
on their state. Customers from CA will go into one file, those
from NY into another file, and those with unrecognized sales
taxes into a third file (these are errors, since all customers
are currently located only in NY and CA.)
For convenience, I'm going to write a little utility function
called Mux_Out.
Mux_Out will be called with two arguments: an output file prefix,
and the line which is to be outputted.
function Mux_Out (OutFilePref, Data) { # Route to proper file
printf("%s%s\n", OutFilePref, Data);}
Here's our old familiar friend, the function that defines sales
tax rates:
function Define_Sales_Tax_Rate_Table () {
Sales_Tax_By_State["CA"] = 0.085;
Sales_Tax_By_State["NY"] = 0.0925;
}
Since both of these functions are useful for other jobs in our
application, we'll toss them into the file GlobFncs.Awk (this is
where our global application functions are going).
Let's not forget that the first two lines of our DOS batch file will
define our project file suffix and our "ADBAWKLIB" file:
Set ADBPROJ=xx
Set ADBAWKLIB=GlobFncs.Awk
To keep things simple, I'm just going to be concerned with splitting
the customers into categories by state.
Let's assume that the records are in the physical file called
"invoice.xx" (remember, "invoice" is the name of the file, and
"xx" is our project file suffix.)
We have three types of work to do now.
First, let's finish off the ADB defintions. We need
to define our input file and our output file:
TRANS Invoice_File Cust_Srt SORT
T-M Invoice_File AWK END
Note that the TRANS statement defines the input invoice
records. The "T-M" statement is for unmatched
transactions (i.e. all of them, since there is no master file for
this run).
Let's assume that the output file for this run is "invsplit.xx"
So the command line for ADB is going to look like this:
ADB -TRANSinvoice -T-Minvsplit
Once we define the script for the output Awk file, we'll be done.
Since this is the output Awk file script for the file "invsplit.xx",
its name will be "invsplit.xxA" (same name, but with an "A" on the
end of the extension).
First, we'd better not forget to initialize the state sales tax
rate table:
function ADB_BOJ() {Define_Sales_Tax_Rate_Table();}
Now we have to define ADB_Main, which is the function that will
be run for each record:
function ADB_Main( File_Code) {
if (Cust_State in Sales_Tax_By_State) File_Code = Cust_State;
else File_Code = "ZZ"; # A bad state code
Mux_Out(File_Code, ADB_BldOutLineFnc());
}
File_Code is a local variable. This is going to be the
2-character state code, for all recognized states.
If a state isn't recognized, we're going to use a File_Code of
"zz".
The third line in this function actually calls Mux_Out.
What does ADB_BldOutLineFnc do? It actually builds the output
line in the very same format as the output file.
Since the output file format is the same as the input file
format, ADB_BldOutLineFnc returns a string which is identical to
the input record.
This frees you from having to think about the sequence of fields,
their data types, their lengths, their offsets, etc. Just as any
good DBMS product should.
So what does "invsplit.xx" look like? This is the output file
that we're creating:
CA1111 10.00CA0
CA1111 15.00CA0
NY2222 25.00NY0
NY2222 10.00NY0
ZZ3333 20.00IN0
Hmmm. Looks like we haven't done much
work. The state code is now replicated in the first
two bytes of the record, right?
Not quite.
The last record has "ZZ" in the first two
bytes. That's because "IN" isn't a recognized state
code (i.e. it's not in our sales tax rate table.)
Finally, we run FileMux.Exe:
FileMux -INinvsplit.xx -COL1 -LEN2 -NOKEEP -PREFInv -EXTxx
This is going to create 3 files:
InvNY.xx:
2222 25.00NY0
2222 10.00NY0
InvCA.xx:
1111 10.00CA0
1111 15.00CA0
InvZZ.xx:
3333 20.00IN0
FileMux, in conjunction with a function similar to
Mux_Out, and
ADB's built-in function
ADB_BldOutLineFnc . . . can all work together
to allow you to split an output file into pieces, &/or to
generate several output files with different layouts &/or reports
from the same ADB run.
Note that ADB's built-in function that we saw earlier,
(ADB_WriteFnc) just looks like this:
function ADB_WriteFnc() {printf("%s\n", ADB_BldOutLineFnc());}
It's ADB_BldOutLineFnc which does the hard work of assembling all
the fields in the output record, and ensuring that they're
outputted with the proper lengths and offsets.
Note that ADB won't stop you from you modifying any of the input
fields to be longer than they should be. Future releases may
address this potential problem area.
In a "real" application, we might want to "bomb the job," i.e. to
print some special message if the file InvZZ.xx exists. We can
do that with a simple DOS batch file statement:
If Exist InvZZ.xx ErrMsg BAD STATE CODES WERE FOUND!
If ErrorLevel 1 Goto ABORT_JOB_LABEL
These two statements in the DOS batch file would immediately follow
the invocation of FileMux. Presumably, ABORT_JOB_LABEL would
contain a helpful message that tells the user what to do when things
haven't run properly.
ErrMsg.Exe is another utiilty that comes with
ADB. It prints all of its command line arguments and returns 1.
You can run it with no arguments from the DOS prompt for
execution examples.
ErrMsg saves you from writing:
If ErrorLevel 1 Echo BAD STATE CODES WERE FOUND!
If ErrorLevel 1 Pause
If ErrorLevel 1 Goto ABORT_JOB_LABEL
Saving one line in a batch file might not seem like much, but if
you have a dozen steps in an ADB process, it can help make the
file a bit more elegant.
Return
to local table of contents
Return
to global table of contents
Frames mode, or
No frames