NAG Library Routine Document
E04MXF
1 Purpose
E04MXF reads data for sparse linear programming, mixed integer linear programming, quadratic programming or mixed integer quadratic programming problems from an external file which is in standard or compatible MPS input format.
2 Specification
SUBROUTINE E04MXF (INFILE, MAXN, MAXM, MAXNNZ, MAXNCOLH, MAXNNZH, MAXLINTVAR, MPSLST, N, M, NNZ, NCOLH, NNZH, LINTVAR, IOBJ, A, IROWA, ICCOLA, BL, BU, PNAMES, NNAME, CRNAME, H, IROWH, ICCOLH, MINMAX, INTVAR, IFAIL)
INTEGER  INFILE, MAXN, MAXM, MAXNNZ, MAXNCOLH, MAXNNZH, MAXLINTVAR, MPSLST, N, M, NNZ, NCOLH, NNZH, LINTVAR, IOBJ, IROWA(MAXNNZ), ICCOLA(MAXN+1), NNAME, IROWH(MAXNNZH), ICCOLH(MAXNCOLH+1), MINMAX, INTVAR(MAXLINTVAR), IFAIL
REAL (KIND=nag_wp)  A(MAXNNZ), BL(MAXN+MAXM), BU(MAXN+MAXM), H(MAXNNZH)
CHARACTER(8)  PNAMES(5), CRNAME(MAXN+MAXM)
3 Description
E04MXF reads linear programming (LP) or quadratic programming (QP) problem data, or their mixed integer variants, from an external file which is prepared in standard or compatible MPS (see IBM (1971)) input format. It then initializes $n$ (the number of variables), $m$ (the number of general linear constraints), the $m$ by $n$ matrix $A$, the vectors $l$, $u$, $c$ (stored in row IOBJ of $A$) and the $n$ by $n$ Hessian matrix $H$ for use with E04NKF/E04NKA and E04NQF.
These routines are designed to solve problems of the form
$$\underset{x}{\text{minimize}} \; c^T x + \tfrac{1}{2} x^T H x \quad \text{subject to} \quad l \le \begin{pmatrix} x \\ Ax \end{pmatrix} \le u.$$
3.1 MPS Input Format
The input file of data may only contain two types of lines:
- Indicator lines (specifying the type of data which is to follow).
- Data lines (specifying the actual data).
A section is a combination of an indicator line and its corresponding data lines. Any characters beyond column 80 are ignored. Indicator lines must not contain leading blank characters (in other words they must begin in column 1). The following displays the order in which the indicator lines must appear in the file:
NAME          user-supplied name   (optional)
OBJSENSE                           (optional)
    data line(s)
OBJNAME                            (optional)
    data line(s)
ROWS
    data line(s)
COLUMNS
    data line(s)
RHS
    data line(s)
RANGES                             (optional)
    data line(s)
BOUNDS                             (optional)
    data line(s)
QUADOBJ                            (optional)
    data line(s)
ENDATA
A data line follows the same fixed format made up of fields defined below. The contents of the fields may have different significance depending upon the section of data in which they appear.
 | Field 1 | Field 2 | Field 3 | Field 4 | Field 5 | Field 6
Columns | 2–3 | 5–12 | 15–22 | 25–36 | 40–47 | 50–61
Contents | Code | Name | Name | Value | Name | Value
The names and codes consist of ‘printable’ characters only. Values are read using a field width of 12. This allows values to be entered in several equivalent forms. For example, 1.2345678, 1.2345678E+0, 123.45678E-2 and 12345678E-07 all represent the same number. It is safest to include an explicit decimal point.
Lines with an asterisk (*) in column one are treated as comment lines and are ignored by the routine.
Columns outside the six fields must be blank, except for columns 72–80, whose contents are ignored by the routine. These columns may be used to enter a sequence number. A non-blank character outside the predefined six fields and columns 72–80 is considered to be a major error (see Section 6), unless it is part of a comment.
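The fixed-format rules above can be illustrated with a short Python sketch. It is not part of the NAG Library and does not reproduce how E04MXF is implemented; the function name `split_data_line`, the `TEMPLATE` helper and the column ranges (taken from the usual fixed MPS layout) are assumptions made for illustration only.

```python
# Column ranges (1-based, inclusive) assumed for the six data-line fields.
FIELD_COLUMNS = [(2, 3), (5, 12), (15, 22), (25, 36), (40, 47), (50, 61)]


def split_data_line(line):
    """Split one fixed-format MPS data line into its six fields.

    Returns None for a comment line (asterisk in column one).
    """
    if line.startswith("*"):
        return None
    # Characters beyond column 80 are ignored; pad short lines with blanks.
    text = line.rstrip("\n")[:80].ljust(80)
    allowed = set(range(72, 81))  # columns 72-80 may hold a sequence number
    for start, end in FIELD_COLUMNS:
        allowed.update(range(start, end + 1))
    for col in range(1, 81):
        if col not in allowed and text[col - 1] != " ":
            raise ValueError(f"non-blank character in column {col}")
    return [text[start - 1:end].strip() for start, end in FIELD_COLUMNS]


# Build a COLUMNS-style line with every field in its assumed position.
TEMPLATE = " {:<2} {:<8}  {:<8}  {:>12}   {:<8}  {:>12}"
line = TEMPLATE.format("", "X1", "COST", "1.0", "LIM1", "2.5E+0")
fields = split_data_line(line)
values = [float(fields[3]), float(fields[5])]
```

Here `fields` holds the code, names and value strings in field order, and the two value fields convert to ordinary floating-point numbers regardless of which equivalent form was used in the file.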
3.1.1 NAME Section (optional)
The NAME section is the only section where the data is on the same line as the indicator. The ‘user-supplied name’ must be in field 3 on the same line as the NAME indicator. The ‘user-supplied name’ may be blank.
Field | Required | Description
3 | No | Name of the problem
3.1.2 OBJSENSE Section (optional)
The data line in this section can be used to specify the sense of the objective function. If this section is present it can contain only one data line. If the OBJSENSE section is missing or empty, minimization is assumed.
Field | Required | Description
2 | No | Sense of objective function
Field 2 may contain either MIN, MAX, MINIMIZE or MAXIMIZE.
3.1.3 OBJNAME Section (optional)
The data line in this section can be used to specify the name of a free row (see Section 3.1.4) that should be used as the objective function. If this section is present it can contain only one data line. If the OBJNAME section is missing or empty, the first free row is chosen instead. Alternatively, OBJNAME can be overridden by setting a nonempty PNAMES(2).
Field | Required | Description
2 | No | Row name to be used as the objective function
Field 2 must contain a valid row name.
3.1.4 ROWS Section
The data lines in this section specify unique row (constraint) names and their inequality types (i.e., unconstrained, $=$, $\ge$ or $\le$).
Field | Required | Description
1 | Yes | Inequality Key
2 | Yes | Row name
The inequality key specifies each row's type. It must be one of E, G, L or N and can be in either column 2 or 3.
Inequality Key | Description | $l$ | $u$
N | Free row | $-\infty$ | $\infty$
G | Greater than or equal to | finite | $\infty$
L | Less than or equal to | $-\infty$ | finite
E | Equal to | finite | $l$
Row type N stands for ‘Not binding’, also known as ‘Free’. It can be used to define the objective row. The objective row is a free row that specifies the vector $c$ in the linear objective term $c^T x$. If there is more than one free row, the first free row is chosen, unless another free row name is specified by OBJNAME (see Section 3.1.3) or PNAMES(2) (see Section 5). Note that $c$ is assumed to be zero if either the chosen row does not appear in the COLUMNS section (i.e., has no nonzero elements) or there are no free rows defined in the ROWS section.
3.1.5 COLUMNS Section
Data lines in this section specify the names to be assigned to the variables (columns) in the general linear constraint matrix $A$, and define, in terms of column vectors, the actual values of the corresponding matrix elements.
Field | Required | Description
2 | Yes | Column name
3 | Yes | Row name
4 | Yes | Value
5 | No | Row name
6 | No | Value
Each data line in the COLUMNS section defines the nonzero elements of $A$ or $c$. Any elements of $A$ or $c$ that are undefined are assumed to be zero. Nonzero elements of $A$ must be grouped by column, that is to say that all of the nonzero elements in the $j$th column of $A$ must be specified before those in the $(j+1)$th column, for $j = 1, 2, \ldots, n-1$. Rows may appear in any order within the column.
3.1.6 Integer Markers
As is described later in this document, you are able to define any integer variables in the BOUNDS section of the MPS file. For backward compatibility E04MXF allows you to define the integer variables within the COLUMNS section using integer markers, although it is not recommended as markers can be treated differently by different MPS readers. Each marker line must have the following format:
Field | Required | Description
2 | No | Marker ID
3 | Yes | Marker Tag
5 | Yes | Marker Type
The marker tag must contain ‘MARKER’. Marker Type must be ‘INTORG’ to start reading integer variables and ‘INTEND’ to finish reading integer variables. This implies that a row cannot be named ‘MARKER’, ‘INTORG’ or ‘INTEND’. You may wish to have several integer marker sections within the COLUMNS section, in which case each section must begin with an ‘INTORG’ marker and end with an ‘INTEND’ marker and there should not be another marker between them.
Field 2 is ignored by E04MXF. When an integer variable is declared it will keep its default bounds unless they are changed in the BOUNDS section. This may vary between different MPS readers.
3.1.7 RHS Section
This section specifies the right-hand side values (if any) of the general linear constraints defined by the matrix $A$.
Field | Required | Description
2 | Yes | RHS name
3 | Yes | Row name
4 | Yes | Value
5 | No | Row name
6 | No | Value
The MPS file can contain several RHS sets distinguished by RHS name. If an RHS name is defined in PNAMES(3) (see Section 5) then E04MXF will read in only that RHS vector, otherwise the first RHS set will be used.
Only the nonzero elements need to be specified. If an RHS value is given for the objective function it is ignored by E04MXF. Different MPS readers treat such values differently, so it is safer not to define an RHS for the objective function in your MPS file. This section may be empty, in which case the RHS vector is assumed to be zero.
3.1.8 RANGES Section (optional)
Ranges are used to modify the interpretation of constraints defined in the ROWS section (see Section 3.1.4) to the form $l \le Ax \le u$, where both $l$ and $u$ are finite. The range of the constraint is $r = u - l$.
Field | Required | Description
2 | Yes | Range name
3 | Yes | Row name
4 | Yes | Value
5 | No | Row name
6 | No | Value
The range of each constraint implies the upper and lower bounds depending on the Inequality Key of each constraint, as shown below.
Inequality Key | Sign of $r$ | $l$ | $u$
E | $+$ | rhs | rhs $+\, r$
E | $-$ | rhs $+\, r$ | rhs
G | $+/-$ | rhs | rhs $+\, |r|$
L | $+/-$ | rhs $-\, |r|$ | rhs
N | $+/-$ | $-\infty$ | $+\infty$
where rhs is the RHS of the constraint defined in the RHS section and $r$ is the range.
If a Range name is defined in PNAMES(4) (see Section 5) then the routine will read in only the range set of that name, otherwise the first set will be used.
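As a minimal Python sketch of how a row's inequality key, RHS value and optional range combine into bounds, assuming the usual MPS conventions for ranges (the function `row_bounds` is hypothetical and is not a NAG routine):

```python
import math


def row_bounds(key, rhs=0.0, rng=None):
    """Return (l, u) for a row with inequality key `key`, RHS value `rhs`
    and an optional RANGES value `rng` (None means no range was given)."""
    inf = math.inf
    if key not in ("N", "G", "L", "E"):
        raise ValueError(f"unknown inequality key {key!r}")
    if key == "N":
        return (-inf, inf)  # free row: a range has no effect
    if rng is None:
        if key == "G":
            return (rhs, inf)
        if key == "L":
            return (-inf, rhs)
        return (rhs, rhs)  # key == "E"
    if key == "G":
        return (rhs, rhs + abs(rng))
    if key == "L":
        return (rhs - abs(rng), rhs)
    # key == "E": the sign of the range decides which side is extended
    return (rhs, rhs + rng) if rng >= 0 else (rhs + rng, rhs)
```

For instance, an L row with RHS 10 and range 3 becomes the two-sided constraint with lower bound 7 and upper bound 10.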
3.1.9 BOUNDS Section (optional)
These lines specify limits on the values of the variables ($l$ and $u$ in $l \le x \le u$). If a variable is not specified in the bound set then it is automatically assumed to lie between $0$ and $+\infty$.
Field | Required | Description
1 | Yes | Bound type identifier
2 | Yes | Bound name
3 | Yes | Column name
4 | Yes/No | Value
Note: field 4 is required only if the Bound type identifier is one of UP, LO, FX, UI or LI, in which case it gives the value $k$ below. If the Bound type identifier is FR, MI, PL or BV, field 4 is ignored and it is recommended to leave it blank.
The table below describes the acceptable Bound type identifiers and how they specify the variables' bounds.
Bound Type Identifier | $l$ | $u$ | Integer Variable?
UP | unchanged | $k$ | No
LO | $k$ | unchanged | No
FX | $k$ | $k$ | No
FR | $-\infty$ | $\infty$ | No
MI | $-\infty$ | unchanged | No
PL | unchanged | $\infty$ | No
BV | $0$ | $1$ | Yes
UI | unchanged | $k$ | Yes
LI | $k$ | unchanged | Yes
If a Bound name is defined in PNAMES(5) (see Section 5) then the routine will read in only the bound set of that name, otherwise the first set will be used.
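The effect of each Bound type identifier can be sketched in Python as follows. This is an illustration under the usual MPS conventions (default bounds of zero and plus infinity), not the code used by E04MXF; the function name `apply_bound` and its argument names are hypothetical.

```python
import math


def apply_bound(bound_type, value=None, lower=0.0, upper=math.inf,
                is_integer=False):
    """Apply one BOUNDS data line to a variable's current state.

    Returns the updated (lower, upper, is_integer) triple.
    """
    if bound_type in ("UP", "LO", "FX", "UI", "LI") and value is None:
        raise ValueError(f"bound type {bound_type} requires a value in field 4")
    if bound_type in ("UP", "UI"):
        upper = value
    elif bound_type in ("LO", "LI"):
        lower = value
    elif bound_type == "FX":
        lower = upper = value
    elif bound_type == "FR":
        lower, upper = -math.inf, math.inf
    elif bound_type == "MI":
        lower = -math.inf
    elif bound_type == "PL":
        upper = math.inf
    elif bound_type == "BV":
        lower, upper = 0.0, 1.0
    else:
        raise ValueError(f"unknown bound type {bound_type!r}")
    # BV, UI and LI also declare the variable to be an integer variable.
    if bound_type in ("BV", "UI", "LI"):
        is_integer = True
    return lower, upper, is_integer
```

Several lines for the same column can be applied in sequence by passing the returned triple back in, for example an LI line followed by a UI line.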
3.1.10 QUADOBJ Section (optional)
The QUADOBJ section defines nonzero elements of one triangle of the Hessian matrix $H$.
Field | Required | Description
2 | Yes | Column name (HColumn Index)
3 | Yes | Column name (HRow Index)
4 | Yes | Value
5 | No | Column name (HRow Index)
6 | No | Value
Each data line in the QUADOBJ section defines one or optionally two nonzero elements $H_{ij}$ of the matrix $H$. Each element $H_{ij}$ is given as a triplet of row index $i$, column index $j$ and a value. The Column names (as defined in the COLUMNS section) are used to link the names of the variables and the indices $i$ and $j$. More precisely, the matrix $H$ on output will have a nonzero element $H_{ij} = \text{Value}$ where index $j$ belongs to HColumn Index and index $i$ to one of the HRow Indices such that
- CRNAME($j$) = Column name (HColumn Index) and
- CRNAME($i$) = Column name (HRow Index).
It is only necessary to define either the upper or lower triangle of the matrix; either will suffice. Any elements that have been defined in the upper triangle of the matrix will be moved to the lower triangle of the matrix, then any repeated nonzeros will be summed.
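The normalization described here (mirror upper-triangle entries into the lower triangle, then sum repeated nonzeros) can be sketched in Python. The function `to_lower_triangle` and the triplet representation are assumptions chosen for illustration; they do not reflect the internal data structures of E04MXF.

```python
from collections import defaultdict


def to_lower_triangle(entries):
    """Normalize Hessian triplets (i, j, value) with 1-based indices.

    Entries in the upper triangle (i < j) are moved to position (j, i),
    repeated positions are summed, and the result is ordered by column
    index and then by row index.
    """
    summed = defaultdict(float)
    for i, j, value in entries:
        if i < j:  # upper triangle: mirror into the lower triangle
            i, j = j, i
        summed[(i, j)] += value
    triplets = [(i, j, v) for (i, j), v in summed.items()]
    return sorted(triplets, key=lambda t: (t[1], t[0]))


# H(1,2) = 3 and H(2,1) = 1 refer to the same lower-triangle position.
lower = to_lower_triangle([(1, 2, 3.0), (2, 1, 1.0), (1, 1, 2.0)])
```

In this small case the two off-diagonal entries collapse into a single element at row 2, column 1 with value 4.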
Note: it is much more efficient for E04NKF/E04NKA and E04NQF to have the $H$ matrix defined by the first NCOLH column names. If the nonzeros of $H$ are defined by any columns that are not in the first NCOLH of N then E04MXF will re-arrange the matrices $A$ and $H$ so that they are.
3.2 Query Mode
E04MXF offers a so-called ‘query mode’ that quickly returns upper estimates of the sizes of the user arrays. In this mode any expensive checks of the data and of the file format are skipped so that the number of variables, constraints and matrix nonzeros can be counted promptly. This is useful in the common case where the size of the problem is not known in advance.
Query mode can be triggered intentionally from the beginning by setting any of the following: MAXN < 1, MAXM < 1, MAXNNZ < 1, MAXNCOLH < 0, MAXNNZH < 0 or MAXLINTVAR < -1. If no major formatting error occurs in the file, IFAIL = 0 is returned and the upper estimates are given as stated in Table 1. Alternatively, the routine switches to query mode while the file is being read if it discovers that the provided space is insufficient (MAXN < $n$, MAXM < $m$, MAXNNZ < nnz($A$), MAXNCOLH < ncolh, MAXNNZH < nnz($H$) or MAXLINTVAR < lintvar). In this case IFAIL = 2 is returned.
The recommended practice is shown in Section 9, where the routine is called twice: the first call runs in query mode, the data arrays are then allocated, and E04MXF is called a second time to read the data.
4 References
IBM (1971) MPSX – Mathematical programming system, Program Number 5734 XM4, IBM Trade Corporation, New York
5 Parameters
- 1: INFILE – INTEGER (Input)
On entry: the identifier associated with the MPSX data file to read from.
Constraint: INFILE ≥ 0.
- 2: MAXN – INTEGER (Input)
On entry: an upper limit for the number of variables in the problem.
If MAXN < 1, E04MXF will start in query mode (see Section 3.2).
- 3: MAXM – INTEGER (Input)
On entry: an upper limit for the number of constraints (including the objective row) in the problem.
If MAXM < 1, E04MXF will start in query mode (see Section 3.2).
- 4: MAXNNZ – INTEGER (Input)
On entry: an upper limit for the number of nonzeros (including the objective row) in the problem.
If MAXNNZ < 1, E04MXF will start in query mode (see Section 3.2).
- 5: MAXNCOLH – INTEGER (Input)
On entry: an upper limit for the dimension of the matrix $H$.
If MAXNCOLH < 0, E04MXF will start in query mode (see Section 3.2).
- 6: MAXNNZH – INTEGER (Input)
On entry: an upper limit for the number of nonzeros of the matrix $H$.
If MAXNNZH < 0, E04MXF will start in query mode (see Section 3.2).
- 7: MAXLINTVAR – INTEGER (Input)
On entry: if MAXLINTVAR ≥ 0, an upper limit for the number of integer variables.
If MAXLINTVAR < -1, E04MXF will start in query mode (see Section 3.2).
If MAXLINTVAR = -1, E04MXF will treat all integer variables in the file as continuous variables.
- 8: MPSLST – INTEGER (Input)
On entry: if MPSLST ≠ 0, a summary of what E04MXF is doing is sent to the current advisory message unit (as defined by X04ABF). This can be useful for debugging the MPS data file. If MPSLST = 0, no summary is produced.
- 9: N – INTEGER (Output)
On exit: if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the number of variables of the problem. Otherwise, $n$, the actual number of variables in the problem.
- 10: M – INTEGER (Output)
On exit: if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the number of general linear constraints in the problem (including the objective row). Otherwise, $m$, the actual number of general linear constraints of the problem.
- 11: NNZ – INTEGER (Output)
On exit: if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the number of nonzeros in the problem (including the objective row). Otherwise, the actual number of nonzeros in the problem (including the objective row).
- 12: NCOLH – INTEGER (Output)
On exit: if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the value of NCOLH as used by E04NKF/E04NKA and E04NQF. In this context NCOLH is the number of leading nonzero columns of the Hessian matrix $H$. Otherwise, the actual dimension of the matrix $H$.
- 13: NNZH – INTEGER (Output)
On exit: if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the number of nonzeros of the matrix $H$. Otherwise, the actual number of nonzeros of the matrix $H$.
- 14: LINTVAR – INTEGER (Output)
On exit: if on entry MAXLINTVAR = -1, all integer variables are treated as continuous and LINTVAR = -1.
Otherwise, if E04MXF was run in query mode (see Section 3.2), or returned with IFAIL = 2, an upper estimate of the number of integer variables of the problem. Otherwise, the actual number of integer variables of the problem.
- 15: IOBJ – INTEGER (Output)
On exit: if IOBJ > 0, row IOBJ of $A$ is a free row containing the nonzero coefficients of the vector $c$.
If IOBJ = 0, the coefficients of $c$ are assumed to be zero.
- 16: A(MAXNNZ) – REAL (KIND=nag_wp) array (Output)
On exit: the nonzero elements of $A$, ordered by increasing column index.
- 17: IROWA(MAXNNZ) – INTEGER array (Output)
On exit: the row indices of the nonzero elements stored in A.
- 18: ICCOLA(MAXN+1) – INTEGER array (Output)
On exit: a set of pointers to the beginning of each column of $A$. More precisely, ICCOLA($i$) contains the index in A of the start of the $i$th column, for $i = 1, 2, \ldots,$ N. Note that ICCOLA(1) = 1 and ICCOLA(N+1) = NNZ + 1.
- 19: BL(MAXN+MAXM) – REAL (KIND=nag_wp) array (Output)
- 20: BU(MAXN+MAXM) – REAL (KIND=nag_wp) array (Output)
On exit: BL contains the vector $l$ (the lower bounds) and BU contains the vector $u$ (the upper bounds), for all the variables and constraints in the following order. The first N elements of each array contain the bounds on the variables $x$ and the next M elements contain the bounds for the linear objective term $c^T x$ and the general linear constraints $Ax$ (if any). Note that an ‘infinite’ lower bound is indicated by BL($j$) = -1.0E+20 and an ‘infinite’ upper bound by BU($j$) = +1.0E+20.
Note that E04MXF uses an ‘infinite’ bound size of $10^{20}$ in the definition of $l$ and $u$. In other words, any element of $u$ greater than or equal to $10^{20}$ will be regarded as $+\infty$ (and similarly any element of $l$ less than or equal to $-10^{20}$ will be regarded as $-\infty$). If this value is deemed to be ‘inappropriate’, you are recommended to reset the value of the optional parameter Infinite Bound Size and make any necessary changes to BL and/or BU before calling E04NKF/E04NKA and E04NQF.
- 21: PNAMES(5) – CHARACTER(8) array (Input/Output)
On entry: a set of names associated with the MPSX form of the problem.
- PNAMES(1): must either contain the name of the problem or be blank.
- PNAMES(2): must either be blank or contain the name of the objective row (in which case it overrides the OBJNAME section and the default choice of the first free row).
- PNAMES(3): must either contain the name of the RHS set to be used or be blank (in which case the first RHS set is used).
- PNAMES(4): must either contain the name of the RANGE set to be used or be blank (in which case the first RANGE set (if any) is used).
- PNAMES(5): must either contain the name of the BOUNDS set to be used or be blank (in which case the first BOUNDS set (if any) is used).
On exit: a set of names associated with the problem as defined in the MPSX data file as follows:
- PNAMES(1): contains the name of the problem (or blank if none).
- PNAMES(2): contains the name of the objective row (or blank if none).
- PNAMES(3): contains the name of the RHS set (or blank if none).
- PNAMES(4): contains the name of the RANGE set (or blank if none).
- PNAMES(5): contains the name of the BOUNDS set (or blank if none).
- 22: NNAME – INTEGER (Output)
On exit: $n + m$, the total number of variables and constraints in the problem (including the objective row).
- 23: CRNAME(MAXN+MAXM) – CHARACTER(8) array (Output)
Note: only the first eight characters of the rows of CRNAME are significant.
On exit: the MPS names of all the variables and constraints in the problem in the following order. The first N elements contain the MPS names for the variables and the next M elements contain the MPS names for the objective row and general linear constraints (if any). Note that the MPS name for the objective row is stored in CRNAME(N+IOBJ).
- 24: H(MAXNNZH) – REAL (KIND=nag_wp) array (Output)
On exit: the NNZH nonzero elements of $H$, arranged by increasing column index.
- 25: IROWH(MAXNNZH) – INTEGER array (Output)
On exit: the NNZH row indices of the elements stored in H.
- 26: ICCOLH(MAXNCOLH+1) – INTEGER array (Output)
On exit: a set of pointers to the beginning of each column of $H$. More precisely, ICCOLH($i$) contains the index in H of the start of the $i$th column, for $i = 1, 2, \ldots,$ NCOLH. Note that ICCOLH(1) = 1 and ICCOLH(NCOLH+1) = NNZH + 1.
- 27: MINMAX – INTEGER (Output)
On exit: MINMAX defines the direction of the optimization as read from the MPS file. By default the routine assumes the objective function should be minimized and will return MINMAX = -1. If the routine discovers in the OBJSENSE section that the objective function should be maximized it will return MINMAX = 1. If the routine discovers that there is neither the linear objective term $c$ (the objective row) nor the Hessian matrix $H$, the problem is considered to be a feasible point problem and MINMAX = 0 is returned.
- 28: INTVAR(MAXLINTVAR) – INTEGER array (Output)
On exit: INTVAR contains pointers to the columns that are defined as integer variables. More precisely, INTVAR($i$) = $k$, where $k$ is the index of a column that is defined as an integer variable, for $i = 1, 2, \ldots,$ LINTVAR.
- 29: IFAIL – INTEGER (Input/Output)
On entry: IFAIL must be set to 0, -1 or 1. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value -1 or 1 is recommended. If the output of error messages is undesirable, then the value 1 is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is 0.
When the value -1 or 1 is used it is essential to test the value of IFAIL on exit.
On exit: IFAIL = 0 unless the routine detects an error or a warning has been flagged (see Section 6).
Please note that if any of the relevant parameters are accidentally set to zero, or are not set and assume zero values, then the routine will have executed in query mode. In this case only the size of the problem is returned and other parameters are not set. Please see Section 3.2.
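The compressed column storage used for A, IROWA and ICCOLA (and likewise for H, IROWH and ICCOLH) can be illustrated with a small Python sketch. The helper names `to_ccs` and `column_entries` are hypothetical; the lists use Python's 0-based positions internally but hold the 1-based row indices and column pointers described above.

```python
def to_ccs(dense):
    """Convert a dense matrix (list of rows) to 1-based compressed column
    storage: values a, row indices irowa and column pointers iccola."""
    m, n = len(dense), len(dense[0])
    a, irowa, iccola = [], [], []
    for j in range(n):
        iccola.append(len(a) + 1)  # 1-based start of column j+1
        for i in range(m):
            if dense[i][j] != 0.0:
                a.append(dense[i][j])
                irowa.append(i + 1)  # 1-based row index
    iccola.append(len(a) + 1)  # final pointer is nnz + 1
    return a, irowa, iccola


def column_entries(a, irowa, iccola, j):
    """Return the (row index, value) pairs of column j (1-based)."""
    start, stop = iccola[j - 1], iccola[j]
    return [(irowa[k - 1], a[k - 1]) for k in range(start, stop)]


dense = [[1.0, 0.0, 2.0],
         [0.0, 3.0, 0.0]]
a, irowa, iccola = to_ccs(dense)
third_column = column_entries(a, irowa, iccola, 3)
```

The difference between consecutive pointers gives the number of nonzeros in each column, which is why the pointer array needs one more element than there are columns.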
6 Error Indicators and Warnings
If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Errors or warnings detected by the routine:
-
Warning: MPS file not strictly fixed format, although the problem was read anyway. The data may have been read incorrectly, therefore you may wish to run the program with MPSLST = 1 for more details.
-
At least one of
MAXM,
MAXN,
MAXNNZ,
MAXNCOLH,
MAXNNZH or
MAXLINTVAR is too small. Suggested values are returned in
M,
N,
NNZ,
NCOLH,
NNZH and
LINTVAR respectively.
-
Incorrect ordering of indicator lines.
OBJNAME indicator line found after ROWS indicator line.
-
Incorrect ordering of indicator lines.
COLUMNS indicator line found before ROWS indicator line.
-
Incorrect ordering of indicator lines.
RHS indicator line found before COLUMNS indicator line.
-
Incorrect ordering of indicator lines.
RANGES indicator line found before RHS indicator line.
-
Incorrect ordering of indicator lines.
BOUNDS indicator line found before COLUMNS indicator line.
-
Incorrect ordering of indicator lines.
QUADOBJ indicator line found before BOUNDS indicator line.
-
Incorrect ordering of indicator lines.
QUADOBJ indicator line found before COLUMNS indicator line.
-
Unknown indicator line ‘’.
-
Indicator line ‘’ has been found more than once in the MPS file.
-
End of file found before ENDATA indicator line.
-
No indicator line found in file. It may be an empty file.
-
At least one mandatory section not found in MPS file.
-
An illegal line was detected in ‘’ section.
This is neither a comment nor a valid data line.
-
Unknown inequality key ‘’ in ROWS section.
Expected ‘N’, ‘G’, ‘L’ or ‘E’
-
Empty ROWS section.
Neither the objective row or the constraints were defined.
-
The supplied name, in PNAMES(2) or in OBJNAME, of the objective row was not found amongst the free rows in the ROWS section.
-
The supplied name, in PNAMES(5), of the BOUNDS set to be used was not found in the BOUNDS section.
-
The supplied name, in PNAMES(3), of the RHS set to be used was not found in the RHS section.
-
The supplied name, in PNAMES(4), of the RANGES set to be used was not found in the RANGES section.
-
Illegal row name.
Row names must consist of printable characters only.
-
Illegal column name.
Column names must consist of printable characters only.
-
Row name ‘’ has been defined more than once in the ROWS section.
-
Column ‘’ has been defined more than once in the COLUMNS section. Column definitions must be continuous. (See Section 3.1.5).
-
Found another ‘INTORG’ marker within ‘INTORG’ to ‘INTEND’ range.
-
Found ‘INTEND’ marker without previous marker being ‘INTORG’.
-
Found ‘INTORG’ but not ‘INTEND’ before the end of the COLUMNS section.
-
Illegal marker type ‘’.
Should be either ‘INTORG’ or ‘INTEND’.
-
Unknown row name ‘’ in section.
All row names must be specified in the ROWS section.
-
Unknown column name ‘’ in section.
All column names must be specified in the COLUMNS section.
-
Unknown bound type ‘’ in BOUNDS section.
-
More than one nonzero of A has row name ‘’ and column name ‘’ in the COLUMNS section.
-
Field did not contain a number (see Section 3).
-
On entry, INFILE = ‘’.
Constraint: INFILE ≥ 0.
-
Dynamic memory allocation failed.
7 Accuracy
Not applicable.
8 Further Comments
None.
9 Example
This example solves the quadratic programming problem
where
The optimal solution (to five figures) is
Three bound constraints and two general linear constraints are active at the solution. Note that, although the Hessian matrix is only positive semidefinite, the point $x^*$ is unique.
The MPS representation of the problem is given in
Section 9.2.
9.1 Program Text
Program Text (e04mxfe.f90)
9.2 Program Data
Program Options (e04mxfe.opt)
9.3 Program Results
Program Results (e04mxfe.r)