4. Fields, Fieldsets, Overall syntax

This section covers field and "field set" definitions, plus the overall syntax rules.

Feel free to mail your comments to Rog at: rog@NOSPAM_rs-freeware.org.



1:  Syntax in General

2:  "Ordinary" string fields

3:  Integer fields

4:  Fixed-Decimal fields

5:  "Special" fields

6:  LIKE fields

7:  Field Sets


1:  Syntax in General

There are up to two definitions files allowed.

ADB's definition files are used to hold all field, fieldset, and input/output file definitions.

The file ADB.xx (where "xx" is your project file suffix) must exist (although it may be empty).

This is your global definitions file, where you'll probably define all of the fields and fieldsets used in more than one jobstream in your application.

Or, for small (one-operation) jobs, the global definitions file may contain everything.

Any file referenced in the "-STEP" command-line parameter will be taken to be your local (AKA "step") definitions. These are read in after your global definitions.


Each of the 5 types of statements allowed in an ADB definitions file consists of a series of "items".  For example, the FLD statement has up to 4 items:


FLD fieldname length {precision}

(I've put "precision" in curly braces because it's not used all the time.)


Items can be separated by one or more spaces and/or tabs. You may also use commas or semicolons.

It's not legal to use a comma followed immediately by a semicolon, or two consecutive commas, or two consecutive semicolons.

However, you may use as many spaces and/or tabs as you like, in conjunction with up to one comma.  Or, you may use as many spaces and/or tabs as you like in conjunction with up to one semicolon.

These rules have been designed so that you can "do what comes naturally."
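
The separator rules above can be sketched in a few lines of Python. This is only an illustration, not ADB's own parser: the name split_items is made up, and rejecting a comma at the very start or end of a line is my assumption (the rules above don't cover that case). Quoted strings are not handled here.

```python
import re

# One separator is either optional whitespace around exactly one comma or
# semicolon, or a run of spaces/tabs on its own.
_SEPARATOR = re.compile(r"[ \t]*[,;][ \t]*|[ \t]+")

def split_items(line):
    """Split one definition line into items (hypothetical helper)."""
    line = line.strip(" \t")
    if not line:
        return []
    items = _SEPARATOR.split(line)
    # Two commas/semicolons in a row leave an empty item between them.
    # A leading or trailing comma does too; rejecting it is an assumption.
    if "" in items:
        raise ValueError("empty item (doubled or dangling comma/semicolon): %r" % line)
    return items
```

For example, split_items("FLD Inv_Amnt__, 12; 2") gives the four items FLD, Inv_Amnt__, 12 and 2, while "FLD a,,10" is rejected.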


You can also vary the use of commas and/or semicolons to suit any stylistic conventions that you might wish to adopt (to make your definitions as easy to read as possible).


Empty lines and lines consisting entirely of whitespace (spaces and/or tabs) are ignored.  Spaces and/or tabs occurring on the left or the right of a line are also ignored.


There's no "multiline" comment convention (as in the "/* ... */" sequences allowed in C and/or JavaScript).  You may use the pound sign ("#"), the exclamation point ("!"), or the slash ("/") for comments.  (These aren't recognized inside quoted strings.)

Whenever a comment character is encountered, the rest of the line is ignored.


Where quoted strings are specified, you may use double quotes, single quotes, or the backquote (below the "tilde" on most WinTel keyboards), as long as your quotation marks are "balanced" (i.e. the same choice is used on the left as on the right).
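
Here's a rough Python sketch of how the comment and quoting rules fit together. The function name strip_comment is hypothetical, and I'm assuming there's no escape mechanism inside quoted strings (none is described here).

```python
COMMENT_CHARS = "#!/"
QUOTE_CHARS = "\"'`"

def strip_comment(line):
    """Drop a trailing comment, then trim spaces/tabs (hypothetical helper)."""
    open_quote = None  # the quote character we are currently inside, if any
    for pos, ch in enumerate(line):
        if open_quote is not None:
            # Only the same quote character closes the string ("balanced").
            if ch == open_quote:
                open_quote = None
        elif ch in QUOTE_CHARS:
            open_quote = ch
        elif ch in COMMENT_CHARS:
            # Comment character outside any quoted string: ignore the rest.
            return line[:pos].strip(" \t")
    return line.strip(" \t")
```

A line that holds only a comment comes back as an empty string, which the "ignore empty lines" rule then takes care of.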


With the exception of the FSET statement, all statements must begin and end on one physical line.  (There's really no reason to continue any of the others).


Identifiers are used only for fields and fieldsets.  They must begin with a letter (underscores count as letters), and can contain underscores, letters, or digits in any combination.

Identifiers can't exceed 25 characters in length (this is because Awk has a maximum of 32, and I want to leave some space for expansion.)


Note that when one or more underscores occur at the end of the name of a field, they play a special role in determining the type of a field.


You can use the same name for a field and a field set (although it's questionable coding practice IMO).


Either a field or a fieldset may be defined multiple times, so long as each definition is identical.
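
To make the "identical redefinition" rule concrete, here's a minimal Python sketch. The names define and registry are invented for illustration, and comparing names without regard to case is an assumption on my part. Fields and fieldsets would each get their own registry, since they may share names.

```python
def define(registry, name, definition):
    """Record a definition; allow a repeat only if it is identical."""
    key = name.lower()  # assumption: names are compared case-insensitively
    if key in registry and registry[key] != definition:
        raise ValueError("conflicting redefinition of " + name)
    registry[key] = definition

fields = {}
define(fields, "Inv_No", (10, None))   # (length, precision)
define(fields, "Inv_No", (10, None))   # identical repeat: accepted
```

A later define(fields, "Inv_No", (12, None)) would raise, because the length differs from the first definition.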


No identifier can be identical to any of the following 14 reserved words:


Awk
CQ
End
Fld
FSet
Mast
M-T
M&T
NoDups
None
Prep
Sort
Trans
T-M


Remember that everything about ADB is case-INsensitive, with the exception of field names (but only when used in the Awk interface, since Awk itself is case-sensitive) and DOS environment variables (case-sensitive for DOS 6.0 and up).
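
The identifier rules (first character, allowed characters, 25-character limit, reserved words) boil down to a short check. This Python sketch is illustrative only; is_valid_identifier is a made-up name, and I'm assuming "letter" means the ASCII letters A-Z and a-z.

```python
import re

# The 14 reserved words, lowercased because the comparison ignores case.
RESERVED = {"awk", "cq", "end", "fld", "fset", "mast", "m-t", "m&t",
            "nodups", "none", "prep", "sort", "trans", "t-m"}
MAX_LENGTH = 25
# A letter or underscore first, then letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_valid_identifier(name):
    """True if name is usable as a field or fieldset name (sketch)."""
    return (len(name) <= MAX_LENGTH
            and _IDENTIFIER.match(name) is not None
            and name.lower() not in RESERVED)
```

So Inv_Amnt__ passes, while End, 9lives, and fieldset-name do not.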


2:  "Ordinary" string fields

Here's an example of each of the four types of fields recognized by ADB:


# Invoice fields: invoice #, date, amount and cust#
FLD Inv_No            10    ! A regular string field
FLD Inv_Date___       10    ! Date field from the DBMS
FLD Inv_Amnt__        12 2  ! Fixed decimal: -00000000.00
FLD Inv_Cust_No_       9    ! Integer (8/9 digits)

The first field is an "ordinary" string field (as are all fields whose names don't end in an underscore).

If it's imported in commas-n'-quotes format, it must be surrounded by double quotes.  Ordinary string fields are left-justified  (it's impossible to have an ordinary string field that begins with a space or a tab.)

Whenever you process an ordinary string field with the Awk interface, all spaces will have been stripped from the right.

Whenever an ordinary string field is "exported" in commas-n'-quotes format, it will be surrounded by double quotes.

BTW, it's not a very good idea to have fields in an ADB system that have tab characters on the right if you're planning to use the Awk interface.  (I've yet to encounter an application where such beasts are required; however the tab character seems to have more than its share of admirers among technical people.)


3:  Integer fields

If a field name ends in exactly one underscore (and no more), it's considered to be an integer field.  Integer fields are always left-justified and padded with spaces on the right.

(For those of you who aren't programmers: the term "integer" refers to a whole number, i.e. one without a decimal point.  It can be positive, zero, or negative.)

If you're going to be using the Awk interface, make sure you define the size of an integer field to be one larger than the maximum number of digits you expect--unless you're absolutely certain that all values will be nonnegative.  (If you fail to do so, Awk will still write out the entirety of the field, but this will throw the record length off and could thereby cause an error.)


Note that Awk doesn't have integers--it only has strings and floating point values.  If you manipulate integer fields and then write them back out again, Awk will automatically round them for you.  You may prefer to ensure that their values remain integer, or to use Awk's built-in "int" function to strip any unnecessary fractional values  (or write your own rounding function).


It's not required that fields ending in exactly one underscore have integer values--it only matters in the Awk interface  (these fields will have a value of zero if you try to manipulate them.)


Integer fields must have a length of at least 2.  There has to be at least one byte for the sign.
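
As an illustration of the sizing rules for integer fields, here's a small Python sketch (Python isn't part of ADB; format_integer_field is a hypothetical name). It left-justifies the value, pads with spaces, and complains if the sign plus digits won't fit, which is the situation the "one larger than the maximum number of digits" advice is meant to avoid.

```python
def format_integer_field(value, length):
    """Render value as a left-justified, space-padded integer field."""
    if length < 2:
        raise ValueError("integer fields need a length of at least 2")
    # int() drops any fractional part (truncating toward zero).
    text = str(int(value))
    if len(text) > length:
        raise ValueError("%s does not fit in %d characters" % (text, length))
    return text.ljust(length)
```

With a length of 9, the value 123456789 fits but -123456789 does not, because the minus sign needs a byte of its own.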

Integer fields may or may not be surrounded by double quotes, when imported via the commas-n'-quotes interface.  However, when exported, they'll never be surrounded with double quotes.

Note that excess zeros on the left (or a plus sign included for positive values) will all be handled just fine by the Awk interface when incoming; however they won't be written back to the output in that manner.  (Since I don't know of any DBMSs that require this sort of "babying," I haven't wished to "clutter" ADB's design with the necessary constructs.)


If you need to use the Awk interface to compute values for integer fields, a plus or minus sign can go on either the left or the right.  However, you can't put spaces between the sign and digits.  Very few DBMSs or other programs that output "ASC" numeric data will code such excess spaces, but if you run into one that does, you'll have to define the field as "special" (see below) and write Awk functions that convert the format.  We'll face this very problem in the example.
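
The accepted input forms described above (extra zeros on the left, an optional plus sign, a sign on either side, no spaces between sign and digits) can be sketched like this in Python. The name parse_integer_field is invented, and raising an error for anything else is my own choice for the sketch; it doesn't mirror what Awk does with non-numeric text.

```python
import re

# Optional sign, digits, optional sign; no spaces allowed in between.
_SIGNED_INT = re.compile(r"([+-]?)([0-9]+)([+-]?)\Z")

def parse_integer_field(text):
    """Convert the text of an integer field to an int (sketch)."""
    text = text.strip(" ")  # drop the right-hand padding
    match = _SIGNED_INT.match(text)
    if match is None or (match.group(1) and match.group(3)):
        raise ValueError("not a recognizable integer: %r" % text)
    sign = match.group(1) or match.group(3)
    value = int(match.group(2))  # int() ignores leading zeros
    return -value if sign == "-" else value
```

So "00042", "+7", and "15-" are all accepted, while "- 15" is rejected because of the space after the sign.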


4:  Fixed-Decimal fields

If a field name ends in exactly 2 underscores, it's considered to be fixed decimal.  It's always left justified, and spaces are padded on the right.

(For those of you who aren't computer programmers: a fixed-decimal field is a number that can be positive, negative, or zero, which has a specific number of decimal places.)

Fixed-decimal fields are the only fields for which a fourth item (the precision) can and must be specified.

This fourth item is the number of digits after the decimal point.  The length of the field (third item) reflects the total length.  The total length must be at least 3 larger than the precision  (to allow space for at least one digit to the left of the decimal point, a sign, and the decimal point itself).

Because the precision can't be smaller than 1, the minimum total length for a fixed decimal field is 4.

The maximum precision for a fixed decimal field is 16.  The minimum is 1.
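
The length and precision limits can be checked mechanically. The following Python sketch is only an illustration: both function names are hypothetical, and the rounding mode (half up) is an assumption, since nothing here says how ADB or Awk rounds on output.

```python
from decimal import Decimal, ROUND_HALF_UP

def check_fixed_decimal(length, precision):
    """Raise ValueError if a fixed-decimal definition breaks the limits."""
    if not 1 <= precision <= 16:
        raise ValueError("precision must be between 1 and 16")
    if length < precision + 3:
        # room for the sign, one digit, and the decimal point
        raise ValueError("length must be at least precision + 3")

def format_fixed_decimal(value, length, precision):
    """Render value left-justified with exactly `precision` decimals."""
    check_fixed_decimal(length, precision)
    quantum = Decimal(1).scaleb(-precision)  # e.g. 0.01 for precision 2
    number = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    text = format(number, "f")  # plain notation, never exponential
    if len(text) > length:
        raise ValueError("%s does not fit in %d characters" % (text, length))
    return text.ljust(length)
```

For the Inv_Amnt__ field defined earlier (length 12, precision 2), a value of 12.5 would be written as 12.50 followed by seven spaces.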


As with integer fields, it doesn't matter if fixed decimal fields actually contain fixed decimal values, unless you're going to be using the Awk interface.  If you are, and these fields contain something besides a minus sign on the left, digits, or exactly one properly-formatted decimal point, their values may be unpredictable  (you'll need to play around with Awk to see what it does; more on this when I discuss the Awk interface.)

As with integer fields, it's acceptable if fixed decimal fields are surrounded by double quotes when imported via the commas-n'-quotes interface, but it's not required.  Whenever such fields are exported via commas-n'-quotes, they are unquoted.

As with integer fields, excess zeros on the left (or a plus sign included for positive values) will all be handled just fine by the Awk interface when incoming; however they won't be written back to the output in that manner.

As with integer fields, if you need to use the Awk interface to compute values for fixed decimal fields, a plus or minus sign can go on either the left or the right.  However, you can't put spaces between the sign and digits.  Very few DBMSs or other programs that output "ASC" numeric data will code such excess spaces, but if you run into one that does, you'll have to define the field as "special" (see below) and write Awk functions that convert the format.  We'll face this very problem in the example.

Awk does support exponential floating-point syntax for scientific applications.  See the Awk documentation in GAwk.Doc for more details.


5:  "Special" fields

Fields whose names end in 3 or more underscores are considered "special".

Special fields are treated just like string fields in every way, *except* that they need not have double quotes around them when imported in commas-n'-quotes format, nor are they provided with double quotes when exported in commas-n'-quotes format.
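
Since the field type depends only on how many underscores end the name, the whole scheme fits in one small function. This is a Python sketch with made-up names (field_type, quoted_on_export), not ADB code.

```python
def field_type(name):
    """Classify a field by the number of trailing underscores."""
    trailing = len(name) - len(name.rstrip("_"))
    if trailing == 0:
        return "string"
    if trailing == 1:
        return "integer"
    if trailing == 2:
        return "fixed-decimal"
    return "special"  # three or more

def quoted_on_export(name):
    """Only ordinary string fields get double quotes on export."""
    return field_type(name) == "string"
```

Applied to the invoice example from earlier: Inv_No is a string, Inv_Cust_No_ an integer, Inv_Amnt__ fixed decimal, and Inv_Date___ special.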

Unfortunately, a number of DBMSs have a tendency to leave off the double quotes when exporting fields that contain time, date, and/or currency values; they also expect these fields to be unquoted when importing them via commas-n'-quotes format.

In the case of currency, this can present some annoying (but easily overcome) obstacles in the Awk interface.  Some DBMSs will add a currency symbol on the left; others on the right.


In the example, I'll presume that the dollar sign is kept on the left, and I'll write "convert-to-decimal" and "convert-to-currency" functions which will help you to process such fields.  If your DBMS puts currency symbols on the right, you'll have to modify those functions.  (In any event, not a lot of code is required.)
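
To give a feel for how little code is involved, here is the same idea sketched in Python rather than Awk. These are not the functions from the example: the names are invented, and the handling of negative amounts (minus sign in front of the dollar sign) and the fixed two decimal places are assumptions.

```python
from decimal import Decimal

def currency_to_decimal(text, symbol="$"):
    """Turn text such as '$1234.50' or '-$5' into a Decimal (sketch)."""
    text = text.strip()
    negative = text.startswith("-")   # assumed form: sign before symbol
    if negative:
        text = text[1:]
    if text.startswith(symbol):
        text = text[len(symbol):]
    value = Decimal(text)
    return -value if negative else value

def decimal_to_currency(value, symbol="$"):
    """Turn a number into text with the symbol on the left, 2 decimals."""
    # quantize() uses the decimal module's default rounding here
    amount = Decimal(value).quantize(Decimal("0.01"))
    sign = "-" if amount < 0 else ""
    return sign + symbol + format(abs(amount), "f")
```

A DBMS that puts the symbol on the right would need the startswith test swapped for endswith, and the symbol appended instead of prefixed.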


6:  LIKE fields

For various reasons, it's often convenient to be able to say that a particular field has the same "data type" as another field.

This is a common technique in C/C++ ... think of the idea of a typedef.  For example:


typedef int CUST_NO_TYPE;
CUST_NO_TYPE Cust_No;

ADB has a similar construct, known as LIKE:


FLD Cust_No_Type 10  # Customer numbers are 10 characters
...
FLD Cust_No LIKE Cust_No_Type ! This is the customer account ID
FLD Bill_No Like Cust_No_Type ! This is the billing account ID

In this case, we defined the type of a customer number.  Then we defined the actual customer number field, followed by a "bill #" field.

Our "bill #" field might be helpful in the case in which a customer has many subdivisions which can place orders (such as separate warehouses), but wants the orders billed to a central location.
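
One way to picture LIKE is as a lookup that copies an existing definition. The Python sketch below is only a guess at the mechanics: define_field is a hypothetical name, and I'm assuming that LIKE copies the length (and precision, if any) and that the referenced field must already be defined.

```python
def define_field(fields, items):
    """Record one FLD statement, given as a list of items (sketch)."""
    if len(items) < 3 or items[0].upper() != "FLD":
        raise ValueError("not a FLD statement: %r" % (items,))
    name = items[1].lower()
    if items[2].upper() == "LIKE":
        if len(items) != 4:
            raise ValueError("LIKE needs exactly one field name")
        model = items[3].lower()
        if model not in fields:
            raise ValueError("LIKE refers to an undefined field: " + items[3])
        fields[name] = fields[model]  # assumed: copy length and precision
    else:
        length = int(items[2])
        precision = int(items[3]) if len(items) > 3 else None
        fields[name] = (length, precision)

fields = {}
define_field(fields, ["FLD", "Cust_No_Type", "10"])
define_field(fields, ["FLD", "Cust_No", "LIKE", "Cust_No_Type"])
define_field(fields, ["FLD", "Bill_No", "Like", "Cust_No_Type"])
```

After these three calls, Cust_No and Bill_No both carry the 10-character definition of Cust_No_Type.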


7:  Field Sets

The FSET statement is the only statement that can span more than one line.

The format is:


FSET fieldset_name field1 . . . fieldn END

ADB knows that an FSET statement has terminated when it sees the reserved word "END".

"END" may be on a line by itself.

So the following are all acceptable:


FSET fieldset_name
field1
field2
END

or


FSET fieldset_name
field1 field2 END

or


FSET fieldset_name field1 field2
field3 field4 END

This is *not* acceptable:


FSET
fieldset_name
field1 field2 END

(The fieldset name must be on the same line with the FSET; any other rule would make it harder to detect certain kinds of syntax errors and provide error messages that are easy to understand.)

This is also *not* acceptable:


FSET fieldset_name END

(At least one field has to be in a field set.)


Note that you can't start another ADB statement on the same line as the END.
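
The layout rules for FSET (name on the FSET line, fields spread over as many lines as you like, END last on its line, at least one field) can be sketched as a small reader. This Python code is illustrative only: read_fsets is a made-up name, it expects lines with comments already removed, it splits on whitespace only, and it simply skips any statement that isn't an FSET.

```python
def read_fsets(lines):
    """Collect (name, members) pairs from FSET statements (sketch)."""
    fsets = []
    name, members = None, None  # None means "not inside an FSET"
    for line in lines:
        items = line.split()
        if not items:
            continue
        if name is None:
            if items[0].upper() != "FSET":
                continue  # some other statement; ignored in this sketch
            if len(items) < 2 or items[1].upper() == "END":
                raise ValueError("the fieldset name must follow FSET on the same line")
            name, members = items[1], []
            items = items[2:]
        for pos, item in enumerate(items):
            if item.upper() != "END":
                members.append(item)
                continue
            if pos != len(items) - 1:
                raise ValueError("nothing may follow END on its line")
            if not members:
                raise ValueError("fieldset %s has no fields" % name)
            fsets.append((name, members))
            name, members = None, None
    if name is not None:
        raise ValueError("FSET %s is missing its END" % name)
    return fsets
```

All three acceptable layouts shown above produce the same kind of result, and both unacceptable ones raise an error.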


You can use the name of a field set (preceded immediately by an at-sign) to specify that the fields in that set will be "merged" into the current fieldset.

For example, suppose you define


FSET InvoiceKey InvoiceNo CustNo END

Then this:


FSET Invoice InvoiceNo CustNo InvoiceAmt__ END

is identical to:


FSET Invoice @InvoiceKey InvoiceAmt__ END

The "merged in" fields are treated just as if you'd coded them originally.


Finally, you can prefix a field with a minus sign, if you don't want it to be included.  Any field prefixed with a minus sign must have already been included.

This is primarily useful for removing fields that are merged in as part of an "@" declaration.  Here's a quick example:


FSET InvoiceKey InvoiceNo CustNo InvDate END
FSET InvoiceShorty InvoiceNo CustNo InvoiceAmt__ END

is the same thing as:


FSET InvoiceKey    InvoiceNo CustNo InvDate END
FSET InvoiceShorty @InvoiceKey InvoiceAmt__ -InvDate END

In this case, we decided not to include the InvDate field in the InvoiceShorty fieldset.  Since it was part of the InvoiceKey fieldset, we used the minus sign syntax to remove it.

Note that it's illegal to have a fieldset with zero fields.
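
Putting the "@" and minus-sign rules together, expanding a fieldset is a single left-to-right pass. The Python sketch below uses invented names (expand_fieldset, fieldsets) and assumes that a field name isn't listed twice in the same set.

```python
def expand_fieldset(members, fieldsets):
    """Expand '@set' and '-field' items into a plain list of fields."""
    result = []
    for item in members:
        if item.startswith("@"):
            # merge in the fields of a previously defined fieldset
            result.extend(fieldsets[item[1:]])
        elif item.startswith("-"):
            if item[1:] not in result:
                raise ValueError(item[1:] + " has not been included yet")
            result.remove(item[1:])
        else:
            result.append(item)
    if not result:
        raise ValueError("a fieldset must contain at least one field")
    return result

fieldsets = {"InvoiceKey": ["InvoiceNo", "CustNo", "InvDate"]}
shorty = expand_fieldset(["@InvoiceKey", "InvoiceAmt__", "-InvDate"], fieldsets)
```

Here shorty ends up holding InvoiceNo, CustNo, and InvoiceAmt__, matching the InvoiceShorty example above.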
