Manipulating Strings with ObjectPAL

© 2002 Al Breveleri

Previous Section: Part 3: Parsing Grammatically

4. Replacing Parts

One of the most common string operations is to replace every instance of a specified substring with a substitute substring. Here are several approaches to this delightful practice.

4.1. Working In a String Variable

This algorithm is the fastest, but it will crash if the string represented by 'XXX' is a substring of that represented by 'YYY': each replacement then reintroduces a match, so the loop never terminates.

Listing 10: Replacing all occurrences of the constant string 'XXX' with the constant string 'YYY'

    proc REPL_XXX_YYY( const asSUBJ string ) string
    var
       psTEMP, psBGN, psEND string
    endvar
       psTEMP = asSUBJ
       ; rescan the whole string after every replacement
       while psTEMP'advMatch("^(..)XXX(..)$",psBGN,psEND)
          psTEMP = psBGN + "YYY" + psEND
       endwhile
       return psTEMP
    endproc

This algorithm works correctly even if 'X' is a substring of 'YYY', but the match string can be only one character long.

Listing 11: Replacing all occurrences of the constant string 'X' with the constant string 'YYY', where 'X' is one character

    proc REPL_X_YYY( const asSUBJ string ) string
    var
       prTOK array [] string
       psTEMP string
       II longint
    endvar
       ; split on the match character, then rejoin with the replacement
       breakApart(asSUBJ+"X",prTOK,"X")
       psTEMP = prTOK[1]
       for II from 2 to size(prTOK)
          psTEMP = psTEMP + "YYY" + prTOK[II]
       endfor
       return psTEMP
    endproc

The fastest way to replace all matches in the general case is with a recursive procedure.

Listing 12: Replacing all occurrences of the constant string 'XXX' with the constant string 'YYYXXXZZZ' for the general case, where 'XXX' is more than one character and 'XXX' is a substring of 'YYYXXXZZZ'

    proc REPL_XXX_YYYXXXZZZ( const asSUBJ string ) string
    var
       psBGN, psEND string
    endvar
       if asSUBJ'advMatch("^(..)XXX(..)$",psBGN,psEND) then
          ; recurse only into the text on either side of the match,
          ; so the replacement text is never rescanned
          return REPL_XXX_YYYXXXZZZ(psBGN) + "YYYXXXZZZ" + REPL_XXX_YYYXXXZZZ(psEND)
       else
          return asSUBJ
       endif
    endproc

Only the general case can safely accept arbitrary strings as match and replacement arguments.
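Why only the recursive form is safe for arbitrary arguments can be checked outside Paradox. The following is a minimal sketch in Python, not ObjectPAL: `replace_rescan` mimics the rescanning loop of Listing 10 and `replace_recursive` mimics Listing 12. Both function names and the `max_passes` guard are hypothetical, and `str.partition` splits at the first occurrence, which may not be the occurrence `advMatch` picks.

```python
def replace_rescan(subject, old, new, max_passes=1000):
    """Rescan the whole string after every replacement (Listing 10 style)."""
    if not old:
        raise ValueError("match string must not be empty")
    passes = 0
    while old in subject:
        if passes >= max_passes:
            # Hypothetical guard; the ObjectPAL loop has no such limit
            # and would simply keep growing the string.
            raise RuntimeError("replacement keeps reintroducing the match")
        head, _, tail = subject.partition(old)
        subject = head + new + tail
        passes += 1
    return subject


def replace_recursive(subject, old, new):
    """Recurse into the text on either side of a match (Listing 12 style)."""
    if not old:
        raise ValueError("match string must not be empty")
    head, found, tail = subject.partition(old)
    if not found:
        return subject
    # The replacement text itself is never scanned again.
    return (replace_recursive(head, old, new)
            + new
            + replace_recursive(tail, old, new))
```

When the replacement contains the match string, `replace_rescan` stops only because of its guard, while `replace_recursive` returns normally. Like the ObjectPAL original, the recursive version uses stack space in proportion to the number of matches.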
Listing 13: Replacing all occurrences of <asOLD> with <asNEW> in the general case, where <asOLD> might be more than one character and <asOLD> might be a substring of <asNEW>

    proc REPL_STR_1( const asSUBJ string, const asOLD string, const asNEW string ) string
    var
       psBGN, psEND string
    endvar
       if asSUBJ'advMatch("^(..)"+asOLD+"(..)$", psBGN,psEND) then
          return REPL_STR_1(psBGN,asOLD,asNEW) + asNEW + REPL_STR_1(psEND,asOLD,asNEW)
       else
          return asSUBJ
       endif
    endproc

Both execution time and stack space can be saved by not passing the unvarying 'asOLD' and 'asNEW' parameters to each recursive call.

Listing 14: Slightly faster version of above

4.2. Working In a TextStream File

Replacement in a textstream file can only be effected by copying the file, except in the very special case (not discussed here) where the match string and the replacement string are exactly the same length.

Listing 15: General case substring replacer for textstream (copies gtsSRC to gtsDST)

    ; assuming gtsSRC has been opened globally
    ; as the input textstream and gtsDST has been created
    ; globally as the output textstream
    proc REPL_FILE( const asOLD string, const asNEW string )
    var
       psBFFR string
       piANCHOR, piFNDBGN, piFNDEND longint
       piREMAINING longint
    endvar
       piFNDEND = 1 ; start from beginning of input file
       while true
          piANCHOR = piFNDEND
          piFNDBGN = piANCHOR
          if not gtsSRC'advMatch(piFNDBGN,piFNDEND,asOLD) then
             ; If a match is found, advMatch sets piFNDBGN to
             ; point to the first char of the match and piFNDEND
             ; to point to the first char after it. If a match
             ; is not found, the next two statements set piFNDBGN
             ; and piFNDEND to point to the first char after the
             ; end of the file.
             piFNDBGN = size(gtsSRC)+1
             piFNDEND = piFNDBGN
          endif
          ; Now, piANCHOR, piFNDBGN, and piFNDEND can be
          ; compared to determine what was found, and what
          ; action should be taken:
          ; case               piFNDBGN     piFNDEND      action
          ; -----------------  -----------  ------------  ----------------------
          ; no text, no match  = piANCHOR   = piFNDBGN    quit (end of file)
          ; no text but match  = piANCHOR   end of match  process match
          ; text but no match  end of text  = piFNDBGN    process unmatched text
          ; text and match     end of text  end of match  process text and match
          ; -----------------  -----------  ------------  ----------------------
          ; quit if piFNDBGN=piANCHOR and piFNDEND=piFNDBGN
          if piFNDEND=piANCHOR then
             quitloop
          endif
          if piFNDBGN<>piANCHOR then ; unmatched text found
             ; copy the unmatched text in chunks of at most 32767 chars
             gtsSRC'setPosition(piANCHOR)
             piREMAINING = piFNDBGN-piANCHOR
             while piREMAINING>0
                gtsSRC'readChars(psBFFR, int(min(piREMAINING,32767)))
                gtsDST'writeString(psBFFR)
                piREMAINING = piREMAINING-32767
             endwhile
          endif
          if piFNDEND<>piFNDBGN then ; match found
             ; replace the matched text
             gtsDST'writeString(asNEW)
          endif
       endwhile
    endproc

5. Building Long Strings

5.1. Building Into a String Variable

Listing 16: Concatenating the contents of a string array gasTOKENS[]

    ; assuming gasTOKENS is a string array to be
    ; concatenated together
    proc CONCAT_TOKENS_1() string
    var
       psRESULT string
       II longint
    endvar
       psRESULT = blank()
       for II from 1 to size(gasTOKENS)
          psRESULT = psRESULT + gasTOKENS[II]
       endfor
       return psRESULT
    endproc

The ObjectPAL RTL string package apparently creates or keeps a pool of 4KB string buffers for operations such as substring extraction and string concatenation. Operations involving a string longer than 4KB involve one or more extra trips to the Windows global dynamic memory manager which IS WRITTEN IN 16 BIT BASIC and is not very efficient. The actual limit seems to be 4095 chars, probably to allow for an empty string.
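What that limit costs can be estimated with a rough model. The Python sketch below (not ObjectPAL) assumes, as a simplification of the behaviour described above, that a concatenation statement is "slow" whenever its result is longer than 4095 chars. It counts slow statements for the simple loop of Listing 16 and for the batching approach taken by Listing 17, which collects short pieces in a buffer that stays under the limit. The function names and the cost rule are illustrative assumptions, not measurements.

```python
LIMIT = 4095  # apparent buffer limit reported in the article


def count_slow_simple(token_sizes, limit=LIMIT):
    """Listing 16 model: every token is appended straight onto the result."""
    total = 0
    slow = 0
    for size in token_sizes:
        total += size
        if total > limit:
            slow += 1
    return slow


def count_slow_batched(token_sizes, limit=LIMIT):
    """Listing 17 model: short tokens collect in a buffer that stays
    within the limit; the long result is touched only on a flush."""
    result = 0   # length of the long result string
    buffer = 0   # length of the short buffer
    slow = 0
    for size in token_sizes:
        if buffer + size > limit:
            if size > limit:
                # Oversized token: flush buffer and token together.
                result += buffer + size
                buffer = 0
            else:
                # Flush the buffer, then start a new one with this token.
                result += buffer
                buffer = size
            if result > limit:
                slow += 1
        else:
            buffer += size
    # Final statement: result + buffer.
    if result + buffer > limit:
        slow += 1
    return slow
```

With 1000 tokens of 100 chars each, this model counts 960 slow concatenations for the simple loop and 24 for the batched one.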
Listing 17 concatenates smaller strings into a temporary buffer, and concatenates the buffer onto the longer string only when the next buffer concatenation would have exceeded 4095 chars anyway.

Listing 17: This is faster in the general case, where the ultimate result may be longer than 4095 chars

    ; assuming gasTOKENS is a string array to be
    ; concatenated together
    proc CONCAT_TOKENS_2() string
    var
       psRESULT, psBUFFER string
       piBFFRSIZE, piITEMSIZE, II longint
    endvar
       psRESULT = blank()
       psBUFFER = blank()
       piBFFRSIZE = 0
       for II from 1 to size(gasTOKENS)
          piITEMSIZE = size(gasTOKENS[II])
          if (piBFFRSIZE+piITEMSIZE)>4095 then
             if piITEMSIZE>4095 then
                ; the item alone is too long for the buffer
                psRESULT = psRESULT + psBUFFER + gasTOKENS[II]
                psBUFFER = blank()
                piBFFRSIZE = 0
             else
                ; flush the buffer and restart it with this item
                psRESULT = psRESULT + psBUFFER
                psBUFFER = gasTOKENS[II]
                piBFFRSIZE = piITEMSIZE
             endif
          else
             psBUFFER = psBUFFER + gasTOKENS[II]
             piBFFRSIZE = piBFFRSIZE+piITEMSIZE
          endif
       endfor
       return psRESULT + psBUFFER
    endproc

Listing 18: Similar to above but adds a specified separator string

    ; assuming gasTOKENS is a string array to be
    ; concatenated together
    proc MENDTOGETHER( const asSEP string ) string
    var
       psRESULT, psBUFFER string
       piBFFRSIZE, piSEPSIZE, piITEMSIZE, II longint
    endvar
       psRESULT = blank()
       psBUFFER = blank()
       piBFFRSIZE = 0
       piSEPSIZE = size(asSEP)
       if size(gasTOKENS)>0 then
          psBUFFER = gasTOKENS[1]
          ; keep the size counter in step with the buffer
          piBFFRSIZE = size(psBUFFER)
       endif
       for II from 2 to size(gasTOKENS)
          piITEMSIZE = size(gasTOKENS[II])
          if (piBFFRSIZE+piSEPSIZE+piITEMSIZE)>4095 then
             if (piSEPSIZE+piITEMSIZE)>4095 then
                if piITEMSIZE>4095 then
                   psRESULT = psRESULT+psBUFFER+asSEP+gasTOKENS[II]
                   psBUFFER = blank()
                   piBFFRSIZE = 0
                else
                   psRESULT = psRESULT + psBUFFER + asSEP
                   psBUFFER = gasTOKENS[II]
                   piBFFRSIZE = piITEMSIZE
                endif
             else
                psRESULT = psRESULT + psBUFFER
                psBUFFER = asSEP + gasTOKENS[II]
                piBFFRSIZE = piSEPSIZE+piITEMSIZE
             endif
          else
             psBUFFER = psBUFFER + asSEP + gasTOKENS[II]
             piBFFRSIZE = piBFFRSIZE+piSEPSIZE+piITEMSIZE
          endif
       endfor
       return psRESULT + psBUFFER
    endproc

5.2. Building Into a Textstream File

If your processor speed, RAM size, and disk size are roughly in balance, or if you have increased your file buffer pool size, it is faster to build a long string in a text file using textstream than it is to build it in memory using a string variable. If you are constructing a long string that will end up in a file anyway, try to design your program to create the file directly. This is the usual case when emitting dynamic web pages.

The syntax for writeString() allows you to pass an array in the argument list. This will write out all the elements in the array, but each one will be written on a separate line. To fully control the concatenation of the elements, you must iterate over the array.

Listing 19: Concatenating the contents of a string array gasTOKENS[] directly into a text file

    ; assuming gasTOKENS is a string array to be
    ; concatenated together, and assuming gtsDST has been
    ; opened globally as the output textstream
    proc CONCAT_TOKENS_WRITE()
    var
       II longint
    endvar
       for II from 1 to size(gasTOKENS)
          gtsDST'writeString( gasTOKENS[II] )
       endfor
    endproc

Use this code if you want to add a separator string between the elements.

Listing 20: Similar to above but adds a specified separator string

    ; assuming gasTOKENS is a string array to be
    ; concatenated together, and assuming gtsDST has been
    ; opened globally as the output textstream
    proc MENDTOGETHER_WRITE( const asSEP string )
    var
       II longint
    endvar
       ; write nothing for an empty array
       if size(gasTOKENS)>0 then
          gtsDST'writeString( gasTOKENS[1] )
       endif
       for II from 2 to size(gasTOKENS)
          gtsDST'writeString( asSEP, gasTOKENS[II] )
       endfor
    endproc
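The separator logic used when writing straight to a file, including the empty-array case, can be exercised in isolation. This is a minimal sketch in Python, not ObjectPAL, with `io.StringIO` standing in for the output textstream; the function name `mend_together_write` is hypothetical.

```python
import io


def mend_together_write(tokens, sep, dst):
    """Write tokens to dst with sep between them, one piece at a time,
    without ever building the joined string in memory."""
    first = True
    for token in tokens:
        if not first:
            dst.write(sep)   # separator goes before every token but the first
        dst.write(token)
        first = False


# Usage: an in-memory stream stands in for the output file.
out = io.StringIO()
mend_together_write(["a", "b", "c"], ", ", out)

empty_out = io.StringIO()
mend_together_write([], ", ", empty_out)   # writes nothing
```

Writing the separator before each element after the first avoids both a trailing separator and any special handling of the last element.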
Modified: 15 May 2003
Copyright © 2001-2003 Paradox Community. All rights reserved. Company and product names are trademarks or registered trademarks of their respective companies. Authors hold the copyrights to their own works. Please contact the author of any article for details.