e04mzc reads data for a sparse linear programming or quadratic programming problem from a file which is in standard or compatible MPSX input format.
Note that e04mzc is particularly suitable for use in conjunction with the quadratic programming function e04nkc. For reasons of efficiency, new users are recommended to use the function pair e04mxc/e04nqc instead.
The function may be called by the names: e04mzc, nag_opt_qpconvex1_sparse_mps or nag_opt_sparse_mps_read.
3 Description
e04mzc reads Linear Programming (LP) or Quadratic Programming (QP) problem data from a file which is prepared in standard or compatible MPSX input format and then initializes $n$ (the number of variables), $m$ (the number of general linear constraints), the matrix $A$, and the vectors $l$, $u$ and $c$ (stored in row iobj of $A$) for use with e04nkc, which is designed to solve problems of the form
$$\underset{x}{\text{minimize}}\; c^{T}x + \tfrac{1}{2}x^{T}Hx \quad \text{subject to} \quad l \le \begin{pmatrix} x \\ Ax \end{pmatrix} \le u. \tag{1}$$
For LP problems, $H = 0$. For QP problems, a function must be provided to e04nkc to compute $Hx$ for any given vector $x$. (This is illustrated in Section 10.) The optional parameter options.minimize may be used to specify whether the objective function is to be minimized or maximized. The document for e04nkc should be consulted for further details.
Since, in general, the exact size of the problem defined by an MPSX file may not be known in advance, the arrays returned by e04mzc are all allocated internally.
MPSX Input Format
The MPSX data file may only contain two types of line:
1. Indicator lines (specifying the type of data which is to follow).
2. Data lines (specifying the actual data).
The input file must not contain any blank lines. Any characters beyond column 80 are ignored. Indicator lines must not contain leading blank characters (in other words they must begin in column 1). The following displays the order in which the indicator lines must appear in the file:
NAME user-supplied name
ROWS
data line(s)
COLUMNS
data line(s)
RHS
data line(s)
RANGES (optional)
data line(s)
BOUNDS (optional)
data line(s)
ENDATA
The ‘user-supplied name’ specifies a name for the problem and must occupy columns 15–22. The name can either be blank or up to a maximum of 8 characters.
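The ordering rule for indicator lines can be checked mechanically. The following is a minimal sketch, not part of the NAG Library: the function name mps_indicator_order_ok and its interface are hypothetical, and it assumes that only RANGES and BOUNDS may be omitted, as the listing above indicates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Required order of indicator lines; only RANGES and BOUNDS are optional. */
static const char *const mps_sections[] = {
    "NAME", "ROWS", "COLUMNS", "RHS", "RANGES", "BOUNDS", "ENDATA"
};
static const int mps_section_optional[] = { 0, 0, 0, 0, 1, 1, 0 };
#define MPS_NSECTIONS 7

/* Returns 1 if the indicator keywords seen[0..count-1] appear in a legal
   order and end with ENDATA, 0 otherwise.
   Hypothetical helper, not part of the NAG Library. */
int mps_indicator_order_ok(const char *const seen[], size_t count)
{
    size_t next = 0;   /* index of the next section that may legally appear */
    size_t i;

    assert(seen != NULL || count == 0);
    for (i = 0; i < count; i++) {
        /* Step over optional sections that were omitted from the file. */
        while (next < MPS_NSECTIONS &&
               strcmp(seen[i], mps_sections[next]) != 0) {
            if (!mps_section_optional[next])
                return 0;   /* mandatory section missing or out of order */
            next++;
        }
        if (next == MPS_NSECTIONS)
            return 0;       /* keyword found after ENDATA */
        next++;
    }
    return next == MPS_NSECTIONS;   /* ENDATA must have been seen */
}
```

Called with the keywords NAME, ROWS, COLUMNS, RHS, ENDATA the function returns 1; swapping RANGES and BOUNDS, or omitting ROWS, makes it return 0.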
A data line follows the same fixed format made up of fields defined below. The contents of the fields may have different significance depending upon the section of data in which they appear.
          Field 1  Field 2  Field 3  Field 4  Field 5  Field 6
Columns   2–3      5–12     15–22    25–36    40–47    50–61
Contents  Code     Name     Name     Value    Name     Value
The names and codes consist of ‘alphanumeric’ characters (i.e., a–z, A–Z, 0–9, +, −, asterisk (*), blank ( ), colon (:), dollar sign ($) or full stop (.) only) and the names must not contain leading blank characters. Values may be entered in several equivalent forms. For example, 1.23456, 1.23456e+0, 123.456E-2 and 12345.6E-04 all represent the same number. It is safest to include an explicit decimal point. Note that the lower case ‘e’ exponential notation is not standard MPSX, and if compatibility with other MPSX readers is required then the upper case notation should be used. The lower case notation is supported by e04mzc since this is the natural notation in a C programming language environment.
It is recommended that numeric values be right-justified in the 12-character field, with no trailing blanks. This is to ensure compatibility with other MPSX readers, some of which may, in certain situations, interpret trailing blanks as zeros. This can dramatically affect the interpretation of the value and is relevant if the value contains an exponent, or if it contains neither an exponent nor an explicit decimal point.
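The equivalence of the different numeric forms can be illustrated with the standard C function strtod, which accepts both upper and lower case exponents. This is only a sketch of how a 12-character value field might be read; mps_parse_value is a hypothetical helper and makes no claim about how e04mzc itself parses numbers. Unlike the readers mentioned above, it treats trailing blanks purely as padding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reads a number from a value field of at most 12 characters.
   Leading blanks are skipped by strtod; trailing blanks are treated as
   padding. Returns 0 on success, -1 if the field is not a number.
   Note that strtod also accepts forms such as "inf" or hexadecimal
   constants, which a strict MPSX reader would probably reject.
   Hypothetical helper, not part of the NAG Library. */
int mps_parse_value(const char *field, double *value)
{
    char buf[13];
    char *end;
    size_t len;

    assert(field != NULL && value != NULL);
    len = strlen(field);
    if (len > 12)
        len = 12;               /* a value field is 12 characters wide */
    memcpy(buf, field, len);
    buf[len] = '\0';

    *value = strtod(buf, &end);
    if (end == buf)
        return -1;              /* no digits were consumed */
    while (*end == ' ')
        end++;                  /* skip trailing padding */
    return *end == '\0' ? 0 : -1;
}
```

With this helper the strings "1.23456", "1.23456e+0" and "123.456E-2" all yield the same value to within rounding, while a field such as "1.5X" is rejected.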
Comment lines are allowed in the data file. These must have an asterisk (*) in column 1 and any characters in columns 2–80. In any data line, a dollar sign ($) as the first character in Field 3 or 5 indicates that the information from that point through column 80 consists of comments.
Columns outside the six fields must be blank, except for columns 72–80, whose contents are ignored by the function. These columns may be used to enter a sequence number. A non-blank character outside the predefined six fields and columns 72–80 is considered to be a major error unless it is part of a comment.
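The fixed-column layout described above lends itself to a simple field splitter. The sketch below is illustrative only: MpsFields and mps_split_fields are hypothetical names, comment handling is deliberately left out, and nothing here is claimed to mirror the internals of e04mzc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 1-based first and last column of each of the six data-line fields. */
static const int mps_field_first[6] = { 2, 5, 15, 25, 40, 50 };
static const int mps_field_last[6]  = { 3, 12, 22, 36, 47, 61 };

typedef struct {
    char text[6][13];   /* widest field is 12 characters, plus terminator */
} MpsFields;

/* Copies the six fields of a data line into out, padding short lines with
   blanks. Returns 0 on success, or -1 if a non-blank character lies outside
   the six fields. Columns 72-80 and anything beyond column 80 are ignored.
   Comment handling ('*' in column 1, '$' in Field 3 or 5) is omitted.
   Hypothetical helper, not part of the NAG Library. */
int mps_split_fields(const char *line, MpsFields *out)
{
    char cols[81];              /* cols[1..80] hold the blank-padded line */
    int in_field[81] = { 0 };   /* in_field[c] is 1 if column c is in a field */
    size_t len;
    int f, c;

    assert(line != NULL && out != NULL);
    len = strcspn(line, "\r\n");        /* drop any line terminator */
    if (len > 80)
        len = 80;
    memset(cols, ' ', sizeof cols);
    memcpy(cols + 1, line, len);

    for (f = 0; f < 6; f++) {
        int width = mps_field_last[f] - mps_field_first[f] + 1;
        memcpy(out->text[f], cols + mps_field_first[f], (size_t)width);
        out->text[f][width] = '\0';
        for (c = mps_field_first[f]; c <= mps_field_last[f]; c++)
            in_field[c] = 1;
    }
    for (c = 1; c < 72; c++)            /* columns 72-80 are not checked */
        if (!in_field[c] && cols[c] != ' ')
            return -1;
    return 0;
}
```

A caller would typically test the first character of the line for an indicator keyword or comment before passing it to such a splitter, then interpret the fields according to the current section.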
ROWS Data Lines
These lines specify row (constraint) names and their inequality types (i.e., $=$, $\ge$ or $\le$).
Field 1:
defines the constraint type as follows (may be in column 2 or column 3):
N
free row, i.e., no constraint. It may be used to define the objective row.
G
greater than or equal to (i.e., $\ge$).
L
less than or equal to (i.e., $\le$).
E
exactly equal to (i.e., $=$).
Field 2:
defines the row name.
Row type N stands for ‘Not binding’, also known as ‘Free’. It can be used to define the objective row. The objective row is a free row that specifies the vector $c$ in the linear objective term $c^{T}x$. It is taken to be the first free row, unless some other free row name is specified by the optional parameter options.obj_name (see Section 11.2). Note that $c$ is assumed to be zero if (for example) the line
%N%%DUMMYROW
(where % denotes a blank) appears in the ROWS section of the MPSX data file, and the row name DUMMYROW is omitted from the COLUMNS section.
COLUMNS Data Lines
These lines specify the names to be assigned to the variables (columns) in the general linear constraint matrix $A$, and define, in terms of column vectors, the actual values of the corresponding matrix elements.
Field 1:
blank (ignored).
Field 2:
gives the name of the column associated with the elements specified in the following fields.
Field 3:
contains the name of a row.
Field 4:
used in conjunction with Field 3; contains the value of the matrix element.
Field 5:
is optional (may be used like Field 3).
Field 6:
is optional (may be used like Field 4).
Note that only the nonzero elements of $A$ and $c$ need to be specified in the COLUMNS section, as any zero elements of $A$ are removed and any unspecified elements of $c$ are assumed to be zero. In addition, any nonzero elements in the $j$th column of $A$ must be grouped together before those in the $(j+1)$th column, for $j = 1,2,\ldots,n-1$. Nonzero elements within a column may however appear in any order.
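When generating an MPSX file programmatically, a COLUMNS data line can be produced with a single formatted write. The following sketch places the column name in Field 2, the row name in Field 3 and the value right-justified in Field 4, with an upper case exponent and an explicit decimal point. The name mps_format_column_entry is hypothetical, and the choice of five decimal places is an assumption made so that the value fits the 12-character field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes one COLUMNS data line holding a single (row, value) pair into buf.
   Layout: columns 1-4 blank, column name in columns 5-12, row name in
   columns 15-22, value right-justified in columns 25-36.
   Returns 0 on success, -1 if a name is empty or longer than 8 characters,
   if the formatted value does not fit in 12 characters, or if buf is too
   small. Names with leading blanks are not detected here.
   Hypothetical helper, not part of the NAG Library. */
int mps_format_column_entry(char *buf, size_t size, const char *col_name,
                            const char *row_name, double value)
{
    int written;

    assert(buf != NULL && col_name != NULL && row_name != NULL);
    if (strlen(col_name) == 0 || strlen(col_name) > 8 ||
        strlen(row_name) == 0 || strlen(row_name) > 8)
        return -1;

    /* 4 blanks + 8-char name + 2 blanks + 8-char name + 2 blanks +
       12-char number = 36 characters. */
    written = snprintf(buf, size, "    %-8s  %-8s  %12.5E",
                       col_name, row_name, value);
    return (written == 36 && (size_t)written < size) ? 0 : -1;
}
```

A second (row, value) pair for the same column could be appended in Fields 5 and 6 in the same way; this sketch writes only the mandatory pair to keep the layout easy to verify.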
RHS Data Lines
This section specifies the right-hand side values of the general linear constraint matrix (if any). The lines specify the name to be given to the right-hand side (RHS) vector along with the numerical values of the elements of the vector, which may appear in any order. The data lines have exactly the same format as the COLUMNS data lines, except that the column name is replaced by the RHS name. Only the nonzero elements need be specified. Note that this section may be empty, in which case the RHS vector is assumed to be zero.
RANGES Data Lines (optional)
Ranges are used for constraints of the form $l \le Ax \le u$, where both $l$ and $u$ are finite. The effect of specifying a range for constraint $i$ depends on the type of the constraint (i.e., G, L or E), the sign of the range value $R_i$, and the bound associated with the constraint in the RHS section. (Recall that this bound is taken to be zero if the constraint has no entry in the RHS section.) The various possibilities may be summarised as follows.
Row Type  Sign of $R_i$  Bound from RHS  Resultant $l_i$  Resultant $u_i$
G         + or −         $l_i$           $l_i$            $l_i + |R_i|$
L         + or −         $u_i$           $u_i - |R_i|$    $u_i$
E         +              $l_i$           $l_i$            $l_i + |R_i|$
E         −              $u_i$           $u_i - |R_i|$    $u_i$
The data lines have exactly the same format as the COLUMNS data lines, except that the column name is replaced by the RANGE name.
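The effect of a range on a single constraint can be written as a small function. The sketch below follows the usual MPSX convention: for G rows the RHS value is the lower bound, for L rows it is the upper bound, and for E rows the sign of the range value decides on which side the interval extends. The name mps_apply_range is hypothetical, and treating a zero range on an E row as non-negative is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Applies a RANGES entry to one constraint.
   type  : row type 'G', 'L' or 'E'
   rhs   : bound from the RHS section (0.0 if the row has no RHS entry)
   range : range value R from the RANGES section
   On return *lower and *upper hold the resulting bounds.
   Returns 0 on success, -1 for any other row type.
   Hypothetical helper, not part of the NAG Library. */
int mps_apply_range(char type, double rhs, double range,
                    double *lower, double *upper)
{
    double width = range < 0.0 ? -range : range;   /* |R| */

    assert(lower != NULL && upper != NULL);
    switch (type) {
    case 'G':                   /* RHS is the lower bound */
        *lower = rhs;
        *upper = rhs + width;
        return 0;
    case 'L':                   /* RHS is the upper bound */
        *lower = rhs - width;
        *upper = rhs;
        return 0;
    case 'E':                   /* sign of R selects the side */
        if (range >= 0.0) {
            *lower = rhs;
            *upper = rhs + width;
        } else {
            *lower = rhs - width;
            *upper = rhs;
        }
        return 0;
    default:
        return -1;
    }
}
```

For example, an L row with RHS value 4.0 and range 2.5 becomes the two-sided constraint with lower bound 1.5 and upper bound 4.0.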
BOUNDS Data Lines (optional)
These lines specify limits on the values of the variables ($l$ and $u$ in $l \le x \le u$). If the variable is not specified in the bound set then it is automatically assumed to lie between default lower and upper bounds (usually 0 and $+\infty$). (These default bounds may be reset to the values specified by the optional parameters options.col_lo_default and options.col_up_default; see Section 11.2.) Like an RHS column which is given a name, the set of variables in one bound set is also given a name.
Field 1:
specifies the type of bound or defines the variable type as follows:
LO
lower bound.
UP
upper bound.
FX
fixed variable.
FR
free variable ($-\infty$ to $+\infty$).
MI
lower bound is $-\infty$.
PL
upper bound is $+\infty$. This is the default variable type.
Field 2:
identifies a name for the bound set.
Field 3:
identifies the column name of the variable belonging to this set.
Field 4:
identifies the value of the bound; this has a numerical value only in association with LO, UP, FX in Field 1, otherwise it is blank.
Field 5:
is blank and ignored.
Field 6:
is blank and ignored.
Note that if RANGES and BOUNDS sections are both present, the RANGES section must appear first.
MPSX and Integer Programming Problems
The MPSX input format allows the specification of integer programming (IP) problems in which some or all of the variables are constrained to take integer values within a specified range. e04mzc can read MPSX files defining IP problems in either the ‘compatible’ or ‘standard’ formats. However, any integer restrictions are ignored: any variable upon which such restrictions are defined by the file is simply treated as a continuous variable with upper and lower bounds as specified. The facility to read such files is offered to allow users to solve IP problems in their ‘relaxed’ LP or QP form using e04nkc. The compatible and standard MPSX forms are described below. If you are not interested in this facility you may skip the remainder of this section.
In the compatible MPSX format, the type of an integer variable is defined in Field 1 of the BOUNDS section, that is:
Field 1:
specifies the type of the integer variable as follows:
BV
0–1 integer variable (bound value is 1.0).
UI
general integer variable (bound value is in Field 4).
In the standard MPSX format, the integer variables are treated the same as ‘ordinary’ bounded variables, in the BOUNDS section. Integer markers are, however, introduced in the COLUMNS section to specify the integer variables. The indicator lines for these markers are:
          Field 1  Field 2  Field 3   Field 4  Field 5   Field 6
Columns   2–3      5–12     15–22     25–36    40–47     50–61
Contents           name     'MARKER'           'INTORG'
to mark the beginning of the integer variables and
          Field 1  Field 2  Field 3   Field 4  Field 5   Field 6
Columns   2–3      5–12     15–22     25–36    40–47     50–61
Contents           name     'MARKER'           'INTEND'
to mark the end. That is, any variables between these markers are treated as integer variables. The name in Field 2 may be any name different from the preceding and following column names; the other entries in the marker lines must be exactly as described above (including quotation marks). Note that if the INTEND marker line is not specified then all columns between the INTORG marker line and the end of the COLUMNS section are assumed to be integer variables. e04mzc accepts the standard format, the compatible format, or a mixture of both as a means of specifying integer variables.
4 References
IBM (1971) MPSX – Mathematical programming system Program Number 5734 XM4 IBM Trade Corporation, New York
5 Arguments
1: mps_file – const char * Input
On entry: the name of the MPSX data file. If mps_file is a null pointer or null string, then the data is assumed to come from stdin.
2: n – Integer * Output
On exit: the number of columns (variables) specified by the data file.
3: m – Integer * Output
On exit: the number of rows specified by the data file. This is the number of general linear constraints in the problem, including the objective row.
4: nnz – Integer * Output
On exit: the number of nonzeros in the problem (including the objective row).
5: iobj – Integer * Output
On exit: if iobj > 0, row iobj of $A$ is a free row containing the nonzero coefficients of the vector $c$ (the rows of $A$ are indexed $1,2,\ldots,m$). If iobj = 0, the coefficients of $c$ are assumed to be zero.
6: a – double ** Output
On exit: the nnz nonzero elements of $A$, ordered by increasing column index.
Sufficient memory is allocated internally by e04mzc and may be freed by the utility function e04myc.
7: ha – Integer ** Output
On exit: the nnz row indices of the nonzero elements of $A$.
Sufficient memory is allocated internally by e04mzc and may be freed by the utility function e04myc.
8: ka – Integer ** Output
On exit: the indices indicating the beginning of each column of $A$ in a. More precisely, ka[j−1] contains the index in a of the start of the $j$th column, for $j = 1,2,\ldots,n$. Note that ka[0] = 0 and ka[n] = nnz.
Sufficient memory is allocated internally by e04mzc and may be freed by the utility function e04myc.
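The three arrays a, ha and ka together form a compressed column representation of the sparse matrix. The sketch below shows how such a representation can be traversed. It is written without the NAG headers, so long stands in for Integer, and it assumes that the row indices held in ha are zero-based; that indexing convention is an assumption of this sketch and should be checked against the e04nkc document before the code is applied to arrays returned by e04mzc. The names ccs_get and ccs_matvec are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns element (row, col) of the sparse matrix held in a, ha, ka,
   or 0.0 if that element is not stored.
   ASSUMPTION: row, col and the entries of ha are all zero-based.
   Hypothetical helper, not part of the NAG Library. */
double ccs_get(const double a[], const long ha[], const long ka[],
               long row, long col)
{
    long k;

    assert(a != NULL && ha != NULL && ka != NULL);
    /* Column col occupies positions ka[col] .. ka[col+1]-1 of a and ha. */
    for (k = ka[col]; k < ka[col + 1]; k++)
        if (ha[k] == row)
            return a[k];
    return 0.0;
}

/* Computes y = A x for an m-by-n matrix stored in the same scheme,
   under the same zero-based assumption. */
void ccs_matvec(long m, long n, const double a[], const long ha[],
                const long ka[], const double x[], double y[])
{
    long i, j, k;

    assert(a != NULL && ha != NULL && ka != NULL && x != NULL && y != NULL);
    for (i = 0; i < m; i++)
        y[i] = 0.0;
    for (j = 0; j < n; j++)                 /* loop over columns */
        for (k = ka[j]; k < ka[j + 1]; k++) /* nonzeros of column j */
            y[ha[k]] += a[k] * x[j];
}
```

Because ka has one more entry than there are columns, the number of nonzeros in column j is simply ka[j+1] − ka[j], and no separate length array is needed.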
9: bl – double ** Output
10: bu – double ** Output
On exit: bl and bu hold the lower bounds and upper bounds, respectively, for all the variables and constraints, in the following order. The first n elements contain the bounds on the variables and the next m elements contain the bounds for the linear objective term $c^{T}x$ and the general linear constraints (if any). Note that an ‘infinite’ lower bound is indicated by bl[j] = −options.inf_bound, an ‘infinite’ upper bound by bu[j] = +options.inf_bound, and an equality constraint by bl[j] = bu[j]. (The lower bound for $c^{T}x$, stored in bl[n+iobj−1], is set to −options.inf_bound, and the upper bound, stored in bu[n+iobj−1], is set to options.inf_bound; the optional parameter options.inf_bound has a default value of $10^{20}$; see Section 11.)
Sufficient memory is allocated internally by e04mzc and may be freed by the utility function e04myc.
11: xs – double ** Output
On exit: a set of initial values for the n variables and m constraints in the problem. More precisely, xs[j] = min(max(0.0, bl[j]), bu[j]), for $j = 0,1,\ldots,n+m-1$.
Sufficient memory is allocated internally by e04mzc and may be freed by the utility function e04myc.
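The initial values in xs are derived from the bounds. Assuming that each value is simply zero projected onto the interval between its lower and upper bound (an interpretation of the description above, not a statement about the library's internals), the computation can be sketched as follows. The name initial_point is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fills xs[0..count-1] with min(max(0.0, bl[j]), bu[j]), that is, zero
   projected onto the interval [bl[j], bu[j]]. For a problem with n
   variables and m rows, count would be n + m.
   Hypothetical helper, not part of the NAG Library. */
void initial_point(const double bl[], const double bu[], double xs[],
                   size_t count)
{
    size_t j;

    assert((bl != NULL && bu != NULL && xs != NULL) || count == 0);
    for (j = 0; j < count; j++) {
        double v = bl[j] > 0.0 ? bl[j] : 0.0;   /* max(0.0, bl[j]) */
        xs[j] = v < bu[j] ? v : bu[j];          /* min(v, bu[j])   */
    }
}
```

A free variable therefore starts at zero, a variable with a positive lower bound starts at that bound, and a variable whose upper bound is negative starts at the upper bound.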
12: options – Nag_E04_Opt * Input/Output
On entry/exit: a pointer to a structure of type Nag_E04_Opt whose members are optional parameters for e04mzc. These structure members offer the means of adjusting the argument values used when reading in the MPSX file and on output will supply further details of the results. A description of the members of options is given below in Section 11.2.
If any of these optional parameters are required then the structure options should be declared and initialized by a call to e04xxc and supplied as an argument to e04mzc. However, if the optional parameters are not required the NAG defined null pointer, E04_DEFAULT, can be used in the function call.
13: fail – NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6 Error Indicators and Warnings
NE_2_REAL_EE_OPT_ARG_CONS
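On entry, options.col_lo_default = ⟨value⟩ while options.col_up_default = ⟨value⟩. Constraint: options.col_lo_default ≤ options.col_up_default.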
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument ⟨value⟩ had an illegal value.
NE_INT_OPT_ARG_LT
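On entry, options.ncol_approx = ⟨value⟩. Constraint: options.ncol_approx ≥ 1.
On entry, options.nrow_approx = ⟨value⟩. Constraint: options.nrow_approx ≥ 1.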
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_INVALID_REAL_RANGE_F
Value ⟨value⟩ given to options.est_density is not valid. Correct range is options.est_density > 0.0.
NE_MPS_ENDATA_NOT_FOUND
The file does not contain an ENDATA indicator.
NE_MPS_ILLEGAL_DATA_LINE
An illegal data line has been read from the MPSX file. This is neither a comment nor a legal data line.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_ILLEGAL_NAME
An illegal row or column name has been detected. Names must contain only alphanumeric characters with no leading blanks.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_ILLEGAL_NUMBER
Number expected but value could not be read. Check numerical fields.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_ILLEGAL_SETNAME
An illegal name has been detected in Field 2 of the RHS, RANGES or BOUNDS section.
Names must contain only alphanumeric characters with no leading blanks.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_INVALID_BND_TYPE
An invalid bound type appears in the BOUNDS section.
Expect: LO, UP, FX, FR, MI, PL, BV or UI.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_INVALID_BND_VAL
Invalid numeric field in bound data. Value expected for types: LO, UP, FX, UI. Blank field expected for types: FR, MI, PL, BV.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_INVALID_INDICATOR
Unknown, unexpected or invalid indicator line read. Expect: NAME, ROWS, COLUMNS, RHS, RANGES, BOUNDS or ENDATA, starting in column 1 of file, and in that order. RANGES and/or BOUNDS may be omitted. Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_INVALID_INTORG_INTEND
An INTORG or INTEND marker is not correctly specified or is unexpected (e.g., INTEND has no matching INTORG).
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_INVALID_ROW_TYPE
An invalid row type appears in the ROWS section. Expect: N, G, L or E.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_NO_COLS
There were no columns specified in the COLUMNS section.
Last MPSX line read (⟨value⟩): ⟨value⟩.
NE_MPS_NO_NEWLINE
New line expected but not found.
Last MPSX line read (⟨value⟩): ⟨value⟩.
NE_MPS_NO_OBJ
The objective row was not found. There must be at least one row of type N in the ROWS section and, if an objective name was specified, there must be a type N row with this name. Last MPSX line read (⟨value⟩): ⟨value⟩.
NE_MPS_NO_ROWS
There were no rows specified in the ROWS section.
Last MPSX line read (⟨value⟩): ⟨value⟩.
NE_MPS_PROB_NOT_FOUND
The specified problem has not been found in the MPSX file.
NE_MPS_REPEAT_ROW
A row has been specified more than once.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_RHS_RANGE_BND_NOT_FOUND
The name of the RHS, RANGES or BOUNDS set to be used was not found in the file.
NE_MPS_SPLIT_COL
Column data is not contiguous. All entries for a given column must appear together in the COLUMNS section.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_UNKNOWN_COLNAME
An unknown column name appears in the BOUNDS section. All the column names must be specified in the COLUMNS section.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_MPS_UNKNOWN_ROWNAME
An unknown row name appears in the ⟨value⟩ section. All the row names must be specified in the ROWS section.
Error at MPSX line ⟨value⟩: ⟨value⟩.
NE_NAMES_NOT_NAG_MEM
options.crnames is not null but does not point to memory allocated by an earlier call to this function. This function does not accept user-allocated memory assigned to options.crnames.
NE_NOT_APPEND_FILE
Cannot open file ⟨string⟩ for appending.
NE_NOT_CLOSE_FILE
Cannot close file ⟨string⟩.
NE_NOT_READ_FILE
Cannot open file ⟨string⟩ for reading.
NE_NULL_ARGUMENT
Argument a is a null pointer. It should contain the address of a variable of type double *.
Argument bl is a null pointer. It should contain the address of a variable of type double *.
Argument bu is a null pointer. It should contain the address of a variable of type double *.
Argument ha is a null pointer. It should contain the address of a variable of type Integer *.
Argument iobj is a null pointer. It should contain the address of a variable of type Integer.
Argument ka is a null pointer. It should contain the address of a variable of type Integer *.
Argument m is a null pointer. It should contain the address of a variable of type Integer.
Argument n is a null pointer. It should contain the address of a variable of type Integer.
Argument nnz is a null pointer. It should contain the address of a variable of type Integer.
Argument xs is a null pointer. It should contain the address of a variable of type double *.
NE_OPT_NOT_INIT
Options structure not initialized.
NE_WRITE_ERROR
Error occurred when writing to file ⟨string⟩.
7 Accuracy
Not applicable.
8 Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
e04mzc is not threaded in any implementation.
9 Further Comments
None.
10 Example
There is one example program file, the main program of which calls both examples ex1 and ex2. Example 1 (ex1) shows the simple use of e04mzc where default values are used for all optional parameters, in conjunction with e04nkc. An example showing the use of optional parameters is given in ex2 and is described in Section 12.
Example 1 (ex1)
To solve the quadratic programming problem
where
The optimal solution (to five significant figures) is
Three bound constraints and two general linear constraints are active at the solution. Note that, although the Hessian is positive semidefinite, the point $x^{*}$ is the unique solution.
The function to calculate $Hx$ (required by e04nkc) is qphess for this example.
Note the use of e04myc in this example to free the memory returned by e04mzc, once the problem has been solved.
Note also the memory freeing function e04xzc is used to free the memory assigned to the pointers in the options structure. You must not use the standard C function free() for this purpose.
The MPSX representation of the problem is given in Section 10.2.
11 Optional Parameters
A number of optional input and output arguments to e04mzc are available through the structure argument options, type Nag_E04_Opt. An argument may be selected by assigning an appropriate value to the relevant structure member; those arguments not selected will be assigned default values. If no use is to be made of any of the optional parameters you should use the NAG defined null pointer, E04_DEFAULT, in place of options when calling e04mzc; the default settings will then be used for all arguments.
Before assigning values to options directly the structure must be initialized by a call to the function e04xxc. Values may then be assigned to the structure members in the normal C manner.
After return from e04mzc, the options structure may only be re-used for future calls of e04mzc if the dimensions of the new problem are the same. Otherwise, the structure must be cleared by a call of e04xzc and re-initialized by a call of e04xxc before future calls. Failure to do this will result in unpredictable behaviour.
Option settings may also be read from a text file using the function e04xyc in which case initialization of the options structure will be performed automatically if not already done. Any subsequent direct assignment to the options structure must not be preceded by initialization.
11.1 Optional Parameter Checklist and Default Values
For easy reference, the following list shows the members of options which are valid for e04mzc together with their default values where relevant.
Nag_Boolean list               Nag_TRUE
Nag_OutputType output_level    Nag_MPS_Summary
char outfile[512]              stdout
char prob_name[9]              '\0'
char obj_name[9]               '\0'
char rhs_name[9]               '\0'
char range_name[9]             '\0'
char bnd_name[9]               '\0'
double col_lo_default          0.0
double col_up_default          $10^{20}$
Integer ncol_approx            500
Integer nrow_approx            500
double est_density             0.05
char **crnames                 size $n+m$
11.2 Description of the Optional Parameters
list – Nag_Boolean
Default = Nag_TRUE
On entry: if options.list = Nag_TRUE the argument settings in the call to e04mzc will be printed.
output_level – Nag_OutputType
Default = Nag_MPS_Summary
On entry: the level of printout produced by e04mzc. The following values are available:
Nag_NoOutput
No output.
Nag_MPS_Summary
A summary of the dimensions of the problem read and a list of the ‘MPSX names’ (problem name, objective row name, etc.).
Nag_MPS_List
As Nag_MPS_Summary but each line of the MPSX file is echoed as it is read. This can be useful for debugging the file.
Constraint:
options.output_level = Nag_NoOutput, Nag_MPS_Summary or Nag_MPS_List.
outfile – const char[512]
Default = stdout
On entry: the name of the file to which results should be printed. If options.outfile[0] = '\0' then the stdout stream is used.
prob_name – char
Default: options.prob_name[0] = '\0'
obj_name – char
Default: options.obj_name[0] = '\0'
rhs_name – char
Default: options.rhs_name[0] = '\0'
range_name – char
Default: options.range_name[0] = '\0'
bnd_name – char
Default: options.bnd_name[0] = '\0'
On entry: these options contain the names associated with the MPSX form of the problem. These names must be specified as follows:
options.prob_name must contain the name of the problem to be read or be blank. The problem name is specified in the NAME indicator line (see Section 3) and if options.prob_name is not blank, then e04mzc will search the MPSX file for the specified problem. If options.prob_name is blank, then the first problem encountered will be read.
options.obj_name must contain the name of the objective row or be blank (in which case the first free row is used as the objective row).
options.rhs_name must contain the name of the RHS set to be used or be blank (in which case the first RHS set is used).
options.range_name must contain the name of the RANGES set to be used or be blank (in which case the first RANGES set, if any, is used).
options.bnd_name must contain the name of the BOUNDS set to be used or be blank (in which case the first BOUNDS set, if any, is used).
Constraint:
the names must be valid MPSX names, i.e., they must consist only of the ‘alphanumeric’ characters as specified in Section 3 and must not contain leading blank characters.
On exit: the members contain the appropriate names as read from the MPSX file. Any names specified on input which are not found in the MPSX file are unchanged on exit but will give rise to an error exit from e04mzc (see Section 6).
If the MPSX file is successfully read, the options structure can be passed on to e04nkc, which will solve the problem specified by the file and which can make use of these structure members in its solution output.
col_lo_default – double
Default = 0.0
On entry: the default lower bound to be used for the variables in the problem when none is specified in the BOUNDS section of the MPSX data file.
col_up_default – double
Default = $10^{20}$
On entry: the default upper bound to be used for the variables in the problem when none is specified in the BOUNDS section of the MPSX data file.
Constraint:
options.col_lo_default ≤ options.col_up_default.
ncol_approx – Integer
Default = 500
nrow_approx – Integer
Default = 500
On entry: an estimate of the number of columns and rows in the problem. e04mzc is designed so that the problem size does not have to be known in advance, and allocates memory according to the data contained in the MPSX file. However, for very large problems, an advance estimate of the problem size might allow slightly more efficient memory usage to be achieved. See also the description of optional parameter options.est_density.
Constraints:
options.ncol_approx ≥ 1;
options.nrow_approx ≥ 1.
est_density – double
Default = 0.05
On entry: an estimate of the density of the nonzeros in the sparse matrix $A$, i.e., an estimate of nnz$/(n \times m)$. As with the optional parameters options.ncol_approx and options.nrow_approx, if this is known to be significantly larger or smaller than the default, then you should specify an appropriate value to aid e04mzc in its memory management.
Constraint:
options.est_density > 0.0.
crnames – char **
Default memory: an array of $n+m$ elements of type char *
On exit: the MPSX names of all the variables and constraints in the problem in the following order. options.crnames[j−1] contains the name of the $j$th column, for $j = 1,2,\ldots,n$. options.crnames[n+i−1] contains the name of the $i$th row, for $i = 1,2,\ldots,m$. Each name is eight characters long, and includes any trailing blank characters which appear in the appropriate name field of the MPSX file.
Sufficient memory to hold the names is allocated internally by e04mzc. The memory freeing function e04xzc should be used to free this memory. You should not use the standard C function free() for this purpose.
If, on return from e04mzc, e04nkc is called with options as an argument, and the memory pointed to by options.crnames has not been freed, e04nkc will use the row and column names stored in options.crnames in its solution output.
11.3 Description of Printed Output
Results are printed out by default. The level of printed output can be controlled with the structure members options.list and options.output_level (see Section 11.2). If options.list = Nag_TRUE then the argument values to e04mzc are listed, whereas the printout of results is governed by options.output_level. The default, options.output_level = Nag_MPS_Summary, gives the following information if the MPSX file has been read successfully:
(a) the number of lines read.
(b) the number of columns specified by the data. If any of these are specified as integer variables, the number of such variables is also reported. (However, recall that e04mzc will nevertheless regard such variables as continuous variables; see Section 3.)
(c) the number of rows specified by the data. The objective row is counted amongst these.
In addition, the names of the problem, the objective row, the RHS set, the RANGES set, and the BOUNDS set read are listed. Unless specified otherwise by the optional parameters options.prob_name, options.obj_name, options.rhs_name, options.range_name and/or options.bnd_name (see Section 11), these names will correspond to the first problem, objective row, etc., encountered in the MPSX file. Where no set was encountered (RANGES and BOUNDS are optional), a ‘blank’ is output.
Additionally, when options.output_level = Nag_MPS_List, each line of the MPSX file is echoed as it is read. This may be useful as a debugging aid.
If options.output_level = Nag_NoOutput then printout will be suppressed; you can print the information contained in (b) and (c) when e04mzc returns to the calling program.
12 Example 2 (ex2)
Example 2 (ex2) solves the same problem as Example 1 (ex1), described in Section 10, but shows the use of the options structure. Although the problem is the same, it is defined by a slightly modified MPSX file. The same qphess function is used as in ex1 (see Section 10).
The options structure is initialized by a call to e04xxc and two of the optional parameters are set: options.prob_name is set to "..QP 2.." so that e04mzc will attempt to read a problem of this name; and options.obj_name is set to "..COST..". The MPSX file (see Section 10.2) contains an additional free row, named "FREE ROW". Since this is the first free row in the ROWS section of the MPSX file, by default it would be read as the objective row. However, since options.obj_name is specified, e04mzc takes the second free row ("..COST..") as the objective row.
e04mzc is called to read the MPSX file, and this is followed by a call to e04nkc to solve the problem. As the options structure is passed as an argument, the row and column names read from the file are stored in options.crnames and used in the solution output (see Section 10.3).
Finally, e04myc is called to free the problem arrays, and e04xzc is called to free the memory in options.