REMOVE Statement
Successively extracts substrings from a source string, which may be a dynamic array.
Syntax
REMOVE variable FROM source-string {TO delimiter} SETTING setting-var
Syntax Elements
variable The name of a variable to which the substring is assigned.
source-string The source string from which the substrings are extracted.
setting-var The name of a variable that receives a value corresponding to the delimiter encountered.
delimiter This can be any single character, including system delimiters, or a multi-character string.
If the specified delimiter is a single system delimiter, defined as a character from ASCII 249 through ASCII 255, it is treated as the minimum system delimiter, and all higher-valued system delimiters are automatically included.
System delimiter range:
CHAR(255) X’FF’ Segment Mark
CHAR(254) X’FE’ Attribute Mark @AM
CHAR(253) X’FD’ Value Mark @VM
CHAR(252) X’FC’ Subvalue Mark @SVM
CHAR(251) X’FB’ Start Buffer
CHAR(250) X’FA’ System Delimiter
CHAR(249) X’F9’ System Delimiter
If the TO delimiter is not specified, the delimiter defaults to X’F9’.
Operation
The default operation of the REMOVE statement, when the TO delimiter is not specified, is to treat the source string as a dynamic array and to extract one substring from it each time the statement is executed. The first extraction starts at the beginning of the array and continues until the first system delimiter is encountered. The substring is assigned to variable, a value corresponding to the system delimiter is assigned to setting-var, and the internal REMOVE pointer is set to the beginning of the next substring.
The important feature of the REMOVE statement, which separates it from a simple extract function, is the internal REMOVE pointer. This allows successive extract operations on a string without having to provide a location reference and without the program having to search the string from the beginning with each extraction.
Each time the REMOVE statement is executed with a particular source string, the internal REMOVE pointer is moved to the next system delimiter.
You can execute the REMOVE statement again until all of the substrings in the source string have been extracted.
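The pointer behavior described above can be illustrated with a rough model. The following Python sketch is not Reality code: the class name RemoveCursor and its method are hypothetical, and it assumes that the final substring is reported with a setting value of 1 (an implicit terminating segment mark) before a further call returns 0, as the TO examples later in this topic describe.

```python
# Rough model of the default REMOVE operation (no TO clause).
# RemoveCursor and remove() are hypothetical names, not Reality APIs.

class RemoveCursor:
    def __init__(self, source):
        self.source = source   # never modified, as with REMOVE
        self.pos = 0           # models the internal REMOVE pointer
        self.done = False      # True once the final substring has been returned

    def remove(self):
        """Return (substring, setting) for the next substring."""
        if self.done:
            return "", 0                       # end of array
        start = self.pos
        i = start
        while i < len(self.source):
            code = ord(self.source[i])
            if 249 <= code <= 255:             # any system delimiter
                self.pos = i + 1               # beginning of the next substring
                return self.source[start:i], 256 - code
            i += 1
        # Assumption: no delimiter left, so act as if a terminating
        # segment mark had been found.
        self.pos = len(self.source)
        self.done = True
        return self.source[start:], 1


VM, AM = chr(253), chr(254)
cursor = RemoveCursor("First Order" + VM + "Second Order" + AM + "Third Order")
print(cursor.remove())  # ('First Order', 3)
print(cursor.remove())  # ('Second Order', 2)
print(cursor.remove())  # ('Third Order', 1)
print(cursor.remove())  # ('', 0)
```

Because the position is kept between calls, each extraction scans only the part of the string that has not yet been read.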
If the delimiter specified is a single system delimiter character, it is treated as the minimum system delimiter and all higher system delimiters are automatically included; for example, if a VM delimiter is specified, fields delimited by a VM, AM, or SM will be returned.
The TO clause can specify a user-defined delimiter character or string in addition to the Reality system delimiters. This enables efficient parsing of external data; for example, specifying a comma character to parse comma-separated data, or a CR LF string to extract text lines.
If the delimiter specified is a single non-SYSTEM delimiter character, only fields delimited by the specified character or the 'end of item' will be returned.
If the delimiter specified is any other string, only fields delimited by the specified string or the 'end of item' will be returned, and the returned fields will not contain the specified character or string delimiter.
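The delimiter rules above can be summarised in a small model. This Python sketch is illustrative only; the function name terminating_delimiters is hypothetical, and the 'end of item' condition, which always ends the final field, is deliberately left out.

```python
# Hypothetical helper that models which delimiters end a substring
# for a given TO delimiter. It is a sketch, not part of Reality.

def terminating_delimiters(to_delimiter=None):
    """Return the set of strings that end a substring."""
    if to_delimiter is None:
        to_delimiter = chr(249)            # documented default, X'F9'
    if len(to_delimiter) == 1 and 249 <= ord(to_delimiter) <= 255:
        # A single system delimiter is a minimum: include all higher ones.
        return {chr(code) for code in range(ord(to_delimiter), 256)}
    # Any other character or string is matched exactly as given.
    return {to_delimiter}


VM = chr(253)
print(sorted(ord(d) for d in terminating_delimiters(VM)))  # [253, 254, 255]
print(terminating_delimiters(","))                         # {','}
print(len(terminating_delimiters()))                       # 7
```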
Note that both the source string and the TO string may contain any system delimiters, including an SM (X'FF').
The REMOVE statement does not alter the source string in any way.
You can reset the internal REMOVE pointer back to the beginning of an array by:
- Assigning the array to itself; for example:
ARRAY.X = ARRAY.X
- Using the REMOVE.POS() statement; for example:
REMOVE.POS(ARRAY.X) = 0
In addition, you can save the current position and restore it later by using REMOVE.POS(); for example:
LAST.GOOD = REMOVE.POS(SOURCE.STR)
REMOVE SUBS FROM SOURCE.STR SETTING DELIM.TYPE
IF DELIM.TYPE THEN IF SUBS = "INVALID" THEN GOTO RECOVER
~snip~
RECOVER:
REMOVE.POS(SOURCE.STR) = LAST.GOOD
~snip~
The values assigned to setting-var are as follows:
Value | System delimiter |
---|---|
0 | End of array |
1 | SM: segment mark (255) |
2 | AM: attribute mark (254) |
3 | VM: value mark (253) |
4 | SVM: subvalue mark (252) |
5 | SB: start buffer (251) |
6 | 250 |
7 | 249 |
8 | User character/string |
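For the system delimiters, the values in the table follow a simple pattern: the setting value is 256 minus the character value of the delimiter. The Python sketch below models this; the names SETTING_MEANINGS and setting_for are hypothetical and exist only for illustration.

```python
# Lookup built from the table above; names are illustrative only.
SETTING_MEANINGS = {
    0: "End of array",
    1: "SM: segment mark (255)",
    2: "AM: attribute mark (254)",
    3: "VM: value mark (253)",
    4: "SVM: subvalue mark (252)",
    5: "SB: start buffer (251)",
    6: "250",
    7: "249",
    8: "User character/string",
}

def setting_for(delimiter):
    """Return the setting-var value for the delimiter that ended a substring."""
    if len(delimiter) == 1 and 249 <= ord(delimiter) <= 255:
        return 256 - ord(delimiter)    # 255 -> 1, 254 -> 2, ... 249 -> 7
    return 8                           # user-defined character or string


print(setting_for(chr(253)), SETTING_MEANINGS[setting_for(chr(253))])
# 3 VM: value mark (253)
print(setting_for(","))  # 8
```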
MultiValue Compatibility
See also REMOVE.POS Function, REMOVE Statement (Multivalue).
Example 1
Y = "First Order":@VM:"Second Order":@AM:"Third Order"
REMOVE REC1 FROM Y SETTING VAL1
REMOVE REC2 FROM Y SETTING VAL2
The substring "First Order" is assigned to variable REC1. The value 3 is assigned to VAL1 (signifying the Value Mark encountered). The substring "Second Order" is assigned to variable REC2. The value 2 is assigned to VAL2 (signifying the attribute mark encountered).
The pointer is positioned at the attribute mark. A subsequent REMOVE statement would extract the substring "Third Order".
Example 2
LOOP REMOVE VAR FROM ARRAY.X SETTING VAL WHILE VAL DO
GOSUB EVAL
REPEAT
ARRAY.X=ARRAY.X
Successive substrings of ARRAY.X are assigned to VAR, and the program calls the subroutine EVAL, where the data is used. The loop continues until the last substring has been read; VAL is then assigned a value of zero and the program exits from the loop. Finally, ARRAY.X is assigned to itself, which places the pointer back at the beginning of the array.
Example 3
Y = "First Order":@VM:"Second Order":@AM:"Third Order"
REMOVE REC1 FROM Y TO @AM SETTING VAL1
REMOVE REC2 FROM Y TO @AM SETTING VAL2
This will return only AM and SM delimited fields, as follows:
The substring "First Order]Second Order" is assigned to variable REC1. The value 2 is assigned to VAL1 (signifying the Attribute Mark encountered).
The substring "Third Order" is assigned to variable REC2. The value 1 is assigned to VAL2 (signifying the end of string system segment mark encountered).
The pointer is positioned at the system segment mark. A subsequent REMOVE statement would extract a NULL substring with a setting value of zero.
Example 4
Y = "Alpha,Beta"
REMOVE REC1 FROM Y TO "," SETTING VAL1
REMOVE REC2 FROM Y TO "," SETTING VAL2
This will return only comma delimited fields, as follows:
The substring "Alpha" is assigned to variable REC1. The value 8 is assigned to VAL1 (signifying user defined delimiter encountered).
The substring "Beta" is assigned to variable REC2. The value 1 is assigned to VAL2 (signifying the end of string system segment mark encountered).
The pointer is positioned at the system segment mark. A subsequent REMOVE statement would extract a NULL substring with a setting value of zero.
Example 5
CRLF = CHAR(13):CHAR(10)
Y = "Line 1":CRLF:"Line 2"
REMOVE REC1 FROM Y TO CRLF SETTING VAL1
REMOVE REC2 FROM Y TO CRLF SETTING VAL2
This will return individual lines delimited by a CR LF sequence, as follows:
The substring "Line 1" is assigned to variable REC1. The value 8 is assigned to VAL1 (signifying user defined delimiter encountered).
The substring "Line 2" is assigned to variable REC2. The value 1 is assigned to VAL2 (signifying the end of string system segment mark encountered).
The pointer is positioned at the system segment mark. A subsequent REMOVE statement would extract a NULL substring with a setting value of zero.