Paradox Community

Base64 Encoding
© 2003 Rick Kelly
www.crooit.com

Preface

A library and test script of all OPAL methods presented is available here.


Overview

Base64 is a reversible encoding method that converts 8-bit data into 7-bit ASCII text. Every three bytes of the original data are divided into four 6-bit blocks, each represented by one 7-bit ASCII character. This enlarges the data by about one third, before any padding and line breaks are counted. It is commonly used to transmit non-text files over Internet email, typically as mail attachments.

The base64 alphabet contains 64 characters plus an equal sign ("=") that is used as padding when the last encoding block is incomplete.

Value Encoding   Value Encoding   Value Encoding   Value Encoding
    0 A             17 R             34 i             51 z
    1 B             18 S             35 j             52 0
    2 C             19 T             36 k             53 1
    3 D             20 U             37 l             54 2
    4 E             21 V             38 m             55 3
    5 F             22 W             39 n             56 4
    6 G             23 X             40 o             57 5
    7 H             24 Y             41 p             58 6
    8 I             25 Z             42 q             59 7
    9 J             26 a             43 r             60 8
   10 K             27 b             44 s             61 9
   11 L             28 c             45 t             62 +
   12 M             29 d             46 u             63 /
   13 N             30 e             47 v          (pad) =
   14 O             31 f             48 w
   15 P             32 g             49 x
   16 Q             33 h             50 y
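
The table can also be built programmatically. The following is a minimal sketch in Python rather than OPAL, intended only as a way to cross-check the values; the names ALPHABET, to_char and to_value are hypothetical and are not part of the library presented in this article.

```python
import string

# The 64-character alphabet in index order: A-Z, a-z, 0-9, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
PAD = "="  # not part of the alphabet; it only marks padding

def to_char(value: int) -> str:
    """Return the base64 character for a 6-bit value (0..63)."""
    return ALPHABET[value]

def to_value(char: str) -> int:
    """Return the 6-bit value of a base64 character, or -1 if it is not in the alphabet."""
    return ALPHABET.find(char)

print(to_char(0), to_char(26), to_char(52), to_char(62), to_char(63))
print(to_value("T"), to_value("6"), to_value(PAD))
```

Returning -1 for a character outside the alphabet is a convenient way to recognize the "=" pad character during decoding.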

As an example of the encoding process, let's take a word familiar to all serious OPAL adherents.

Liz

ANSI codes: 76 (L), 105 (i), 122 (z), or hex codes 4C, 69, 7A

In binary form this looks like:
01001100 01101001 01111010

Re-grouping these 24 bits into 4 groups of 6 bits yields:

010011 000110 100101 111010

or, in hex (padding each 6 bit group with two extra zeros on the left)

13, 6, 25, 3A

or decimal

19, 6, 37, 58

Using the decimal values as indices into our base64 alphabet gives us:

TGl6
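
The regrouping just shown can be reproduced with a few lines of Python. This is only an illustrative sketch, assuming exactly three input bytes; the function name regroup is hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def regroup(three_bytes: bytes) -> list:
    """Split exactly 3 bytes (24 bits) into four 6-bit values, left to right."""
    bits = "".join(format(b, "08b") for b in three_bytes)
    return [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]

values = regroup(b"Liz")
print(values)                                # [19, 6, 37, 58]
print("".join(ALPHABET[v] for v in values))  # TGl6
```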

Special processing is performed if fewer than 24 bits are available at the end of the String Type being encoded, because base64 encoding always produces groups of 4 characters. When fewer than 24 input bits are available, zero bits are added (on the right) to form an integral number of 6-bit groups. Padding at the end of the encoding result is performed using the '=' character. Since base64 encoding always works on groups of three 8-bit characters, only the following cases can result:
  1. The final input encoding group is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of four characters with no "=" padding.
  2. The final input encoding group is exactly 8 bits or one character; here, the final unit of encoded output will be two characters followed by two "=" padding characters.
  3. The final input encoding group is exactly 16 bits or two characters; here, the final unit of encoded output will be three characters followed by one "=" padding character.
Any characters outside of the base64 alphabet are ignored in base64-encoded data. The same applies to any illegal sequence of characters in the base64 encoding, such as "=====".
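
The three cases can be summarized by two small formulas. The sketch below is in Python, uses the standard base64 module only to display the results, and the names padding_for and encoded_length are hypothetical.

```python
import base64

def padding_for(length: int) -> int:
    """Number of "=" characters appended for an input of `length` bytes."""
    return (3 - length % 3) % 3

def encoded_length(length: int) -> int:
    """Encoded size in characters, before any line breaks are inserted."""
    return 4 * ((length + 2) // 3)

for text in (b"Liz", b"y", b"To"):
    encoded = base64.b64encode(text).decode("ascii")
    print(text, encoded, padding_for(len(text)), encoded_length(len(text)))
```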

As an example, let us take another well known word such as Tony, and walk through its base64 encoding.

Using the same steps previously presented, the first input encoding group is:

Ton which encodes to VG9u

The second input encoding group is the single character:

y

Following the rule in (2) above:

y = x'79' or

01111001

Re-grouping these 8 bits into two six bit groups and extending or padding the second group with zeros looks like:

011110 010000

or, in hex (padding each 6 bit group with two extra zeros on the left)

1E, 10

or decimal

30, 16

Using the decimal values as indices into our base64 alphabet and padding with two "=" characters yields:

eQ==

The total base64 encoded value for Tony is:

VG9ueQ==
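
The zero-bit extension and "=" padding can be checked with a short Python sketch. It works on bit strings for readability rather than speed, and the names encode_group and encode are hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_group(group: bytes) -> str:
    """Encode a group of 1 to 3 bytes into exactly 4 base64 characters."""
    bits = "".join(format(b, "08b") for b in group)
    # Add zero bits on the right until the bits split evenly into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Fill the rest of the 4-character unit with "=".
    return "".join(chars).ljust(4, "=")

def encode(data: bytes) -> str:
    """Encode data three bytes at a time; the last group may be short."""
    return "".join(encode_group(data[i:i + 3]) for i in range(0, len(data), 3))

print(encode(b"Ton"), encode(b"y"), encode(b"Tony"))  # VG9u eQ== VG9ueQ==
```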

As one can notice, the output of base64 encoding is always 4 characters for every 3 input characters or approximately a 33% increase.

Decoding is straightforward and is the exact reverse of the encoding process, taking care to ignore any characters not contained in our base64 alphabet.

One final note about the encoding process: the encoded output must be broken into lines of no more than 76 characters each, with a CR/LF inserted in the encoded output stream at the end of each line.
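
One possible way to apply the 76 character limit to an already encoded string is sketched below in Python. The name wrap_lines is hypothetical, and this version puts CR/LF only between lines, not after the last one.

```python
CRLF = "\r\n"

def wrap_lines(encoded: str, width: int = 76) -> str:
    """Break an encoded string into lines of at most `width` characters."""
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return CRLF.join(lines)

wrapped = wrap_lines("A" * 200)
print([len(line) for line in wrapped.split(CRLF)])  # [76, 76, 48]
```

Since 76 is a multiple of 4, a line break never falls inside a 4-character encoding unit.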


OPAL Methods and Procedures

Encoding

Breaking three 8-bit groups into four 6-bit groups involves shifting bits around. The basic steps are:
  1. Use first 6 bits of first character
  2. Use last 2 bits of first character + first 4 bits of second character
  3. Use last 4 bits of second character + first 2 bits of third character
  4. Use last 6 bits of third character
One way of isolating bit groups is to multiply and divide by 2^n, where n is equal to the number of positions to shift left or right respectively.
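
The equivalence between shifting and multiplying or dividing by a power of two can be verified quickly. The Python sketch below assumes 8-bit values and integer division; the names shift_left and shift_right are hypothetical.

```python
def shift_left(value: int, n: int) -> int:
    """Shift left n positions by multiplying by 2**n, keeping only the low 8 bits."""
    return (value * 2 ** n) % 256

def shift_right(value: int, n: int) -> int:
    """Shift right n positions using integer division by 2**n."""
    return value // 2 ** n

print(shift_right(76, 2), shift_left(37, 6))  # 19 64
```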

To isolate the first 6 bits of the first character, we need to shift the bits to the right two positions or divide by 4 and then force the two left most bits to zero just to be safe. For example (assume integer arithmetic) using Liz for the input block as previously discussed:

Input:

01001100 or decimal 76

76 / 4 = 19 or 00010011

To use the last 2 bits of first character plus first 4 bits of second character involves:
  1. Shift the first character left 4 bit positions using multiplication by 16 after clearing the first 6 bits
  2. Shift the second character right 4 bit positions using division by 16
  3. Add the results of steps 1 and 2 together
For example:

Input:

01001100 01101001 or decimal 76 and 105

Clearing the first 6 bits of the first character gives us 00000000 or 0, which multiplied by 16 is still 0

105 / 16 = 6; ignore remainder

0 + 6 = 6 or 00000110

To use the last 4 bits of the second character plus the first 2 bits of the third character involves:
  1. Turn off the first 4 bits of the second character and shift the result left 2 positions using multiplication by 4
  2. Shift the third character right 6 bit positions using division by 64
  3. Add the results of step 1 and 2 together
For example:

Input:

01001100 01101001 01111010 or decimal 76, 105 and 122

Turn off the first 4 bits of the second character and shift left by 2 positions yields:

00001001 or 9 * 4 = 36

122 / 64 = 1; ignore remainder

36 + 1 = 37 or 00100101

To use the last 6 bits of the third character involves only turning off the two left most bits.

For example:

Input:

01001100 01101001 01111010 or decimal 76, 105 and 122

Turning off the first two bits of the third character yields:

00111010 or 58

Using the four decimal values (19,6,37,58) as indices into our base64 alphabet gives us:

TGl6
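
The four index calculations walked through above fit in a handful of lines. The following Python sketch uses integer division and masks in the same way as the worked examples; the name encode_block is hypothetical and the inputs are assumed to be in the range 0 to 255.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_block(c1: int, c2: int, c3: int) -> str:
    """Encode three 8-bit values as four base64 characters."""
    index1 = (c1 // 4) & 63              # first 6 bits of c1
    index2 = (c1 & 3) * 16 + c2 // 16    # last 2 bits of c1 + first 4 bits of c2
    index3 = (c2 & 15) * 4 + c3 // 64    # last 4 bits of c2 + first 2 bits of c3
    index4 = c3 & 63                     # last 6 bits of c3
    return ALPHABET[index1] + ALPHABET[index2] + ALPHABET[index3] + ALPHABET[index4]

print(encode_block(76, 105, 122))  # TGl6
```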


Decoding

Base64 decoding is done by essentially reversing the encoding procedure.

Step 1

Look up each base64 character in our alphabet. Using TGl6 from our encoding example above:

T = 19, G = 6, l = 37, 6 = 58

Converted to binary, these 4 groups are:

00010011 00000110 00100101 00111010

Combining these 4 groups into 3 8-bit groups involves shifting bits around. Each character is treated as being 6 bits in size; the first two bits of each character are ignored. For our example, assume the bits in each character are numbered 1-8 from left to right. The basic steps are:
  1. Take the last 6 bits of the first character and the 3rd/4th bits of the second character.
  2. Take the last 4 bits of the second character and bits 3-6 of the third character.
  3. Take the last 2 bits of the third character and bits 3-8 of the fourth character.
To get the last 6 bits of the first character and the 3rd/4th bits of second character involves:
  1. Shift the first character left two bits and clear bits 7-8
  2. Shift the second character right four bits and clear bits 1-6
  3. Add the results of step 1 and 2 together
Example:

Input:

00010011 00000110 00100101 00111010

Shifting 00010011 (x'13' or decimal 19) left two bits using multiplication by 4:

19 * 4 = 76

Shifting the 00000110 (x'06' or decimal 6) right four bits using division by 16:

6 / 16 = 0; ignore remainder

Add two results together:

76 + 0 = 76

To combine the last 4 bits of the second character and bits 3-6 of the third character involves:
  1. Shift the second character left four bits
  2. Shift the third character right two bits
  3. Add the results of step 1 and 2 together
Example:

Input:

00010011 00000110 00100101 00111010

Shifting 00000110 (x'06' or decimal 6) left four bits using multiplication by 16:

6 * 16 = 96

Shifting 00100101 (x'25' or decimal 37) right two bits using division by 4:

37 / 4 = 9; ignore remainder

Add two results together:

96 + 9 = 105

To get the last two bits of the third character and bits 3-8 of the fourth character involves:
  1. Shift the third character left 6 bits
  2. Add the fourth character
Example:

Input:

00010011 00000110 00100101 00111010

Shifting 00100101 (x'25' or decimal 37) left six bits using multiplication by 64:

37 * 64 = 2368 (x'0940') and dropping the high order byte (x'09') yields x'40' or 64

Add in the fourth character (00111010 x'3A' or decimal 58):

64 + 58 = 122

Taking our three results of 76, 105 and 122 and using the OPAL chr() function, we end up with a final result chr(76) + chr(105) + chr(122) or Liz.
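
The three decoding calculations can be cross-checked with the short Python sketch below. It assumes a full 4-character block with no "=" padding, and the name decode_block is hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode_block(block: str) -> bytes:
    """Decode 4 base64 characters (no padding) into 3 bytes."""
    v1, v2, v3, v4 = (ALPHABET.index(ch) for ch in block)
    byte1 = v1 * 4 + v2 // 16           # 6 bits of v1 + upper 2 of v2's 6 bits
    byte2 = (v2 & 15) * 16 + v3 // 4    # lower 4 bits of v2 + upper 4 of v3's 6 bits
    byte3 = (v3 * 64) % 256 + v4        # lower 2 bits of v3 (high byte dropped) + v4
    return bytes([byte1, byte2, byte3])

print(decode_block("TGl6"))  # b'Liz'
```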

Following are the OPAL procs that implement the core base64 encode and decode as just reviewed. Assume that stBase64EncodingTable = cnBase64EncodingTable.

Const
  cnBase64Null      = "="
  cnBase64Invalid   = -1
  cnBase64EncodingTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
endConst

Proc cmBase64DecodeBlock(
  stBase64Block String,
  var siChar1 SmallInt,
  var siChar2 SmallInt,
  var siChar3 SmallInt)
;
; Given a 4 character base64 block, return 3 character decode
;
var
  siBase64Char1 SmallInt
  siBase64Char2 SmallInt
  siBase64Char3 SmallInt
  siBase64Char4 SmallInt
endVar
  siChar1 = cnBase64Invalid
  siChar2 = cnBase64Invalid
  siChar3 = cnBase64Invalid
  siBase64Char1 = stBase64EncodingTable.search(stBase64Block.substr(1,1)) - 1
  siBase64Char2 = stBase64EncodingTable.search(stBase64Block.substr(2,1)) - 1
  siBase64Char3 = stBase64EncodingTable.search(stBase64Block.substr(3,1)) - 1
  siBase64Char4 = stBase64EncodingTable.search(stBase64Block.substr(4,1)) - 1
;
; A decoded character is produced only when both base64 characters
; that contribute bits to it are valid, so "=" padding yields no
; extra character.
;
  switch
    case (siBase64Char1 <> cnBase64Invalid) and
         (siBase64Char2 <> cnBase64Invalid) :
      siChar1 = cmShiftLeft2(siBase64Char1)
                + cmBitAnd(cmShiftRight4(siBase64Char2),3)
  endSwitch
  switch
    case (siBase64Char2 <> cnBase64Invalid) and
         (siBase64Char3 <> cnBase64Invalid) :
      siChar2 = cmShiftLeft4(cmBitAnd(siBase64Char2,15))
                + cmBitAnd(cmShiftRight2(siBase64Char3),15)
  endSwitch
  switch
    case (siBase64Char3 <> cnBase64Invalid) and
         (siBase64Char4 <> cnBase64Invalid) :
      siChar3 = cmShiftLeft6(cmBitAnd(siBase64Char3,3))
                + siBase64Char4
  endSwitch
endProc

Proc cmBase64EncodeBlock(
  siChar1 SmallInt,
  siChar2 SmallInt,
  siChar3 SmallInt) String
;
; Given a three character block, return 4 character base64 encode
;
var
  siIndex1  SmallInt
  siIndex2  SmallInt
  siIndex3  SmallInt
  siIndex4  SmallInt
endVar
;
; Use first 6 bits of first character
;
  siIndex1 = cmBitAnd(cmShiftRight2(siChar1),63)  ;63 = 0x3f (0011 1111)
;
; Use last 2 bits of first character +
; first 4 bits of second character
;
  siIndex2 = cmShiftLeft4(cmBitAnd(siChar1,3))
             + cmBitAnd(cmShiftRight4(siChar2),15)  ;15 = 0x0f (0000 1111)
;
; Use last 4 bits of second character +
; first 2 bits of third character
;
  siIndex3 = cmShiftLeft2(cmBitAnd(siChar2,15))
             + cmBitAnd(cmShiftRight6(siChar3),3)
;
; Use last 6 bits of third character
;
  siIndex4 = cmBitAnd(siChar3,63)
  return stBase64EncodingTable.substr(siIndex1 + 1,1) +
         stBase64EncodingTable.substr(siIndex2 + 1,1) +
         stBase64EncodingTable.substr(siIndex3 + 1,1) +
         stBase64EncodingTable.substr(siIndex4 + 1,1)
endProc

Proc cmBitAnd(
  siAny SmallInt,
  siMask SmallInt) SmallInt
;
; Apply bitAnd mask siMask to siAny
;
  return siAny.bitAnd(siMask)
endProc

Proc cmShiftRight2(siAny SmallInt) SmallInt
;
; Shift value right two bits
;
  return siAny / 4
endProc

Proc cmShiftRight4(siAny SmallInt) SmallInt
;
; Shift value right four bits
;
  return siAny / 16
endProc

Proc cmShiftRight6(siAny SmallInt) SmallInt
;
; Shift value right six bits
;
  return siAny / 64
endProc

Proc cmShiftLeft2(siAny SmallInt) SmallInt
;
; Shift value left two bits
;
  return siAny * 4
endProc

Proc cmShiftLeft4(siAny SmallInt) SmallInt
;
; Shift value left four bits
;
  return siAny * 16
endProc

Proc cmShiftLeft6(siAny SmallInt) SmallInt
;
; Shift value left six bits
;
  return siAny * 64
endProc
To use these basic, low-level procedures, we need two methods that feed encoding and decoding blocks. Assume that stCRLF = chr(13) + chr(10).

method Base64Encode(stAny String) String
;
; Base 64 Encode a String Type
;
var
  stEncoded       String
  liTotalBlocks   LongInt
  liInputSize     LongInt
  siOddSize       SmallInt
  liIndex         LongInt
  stLastBlock     String
  siChar1         SmallInt
  siChar2         SmallInt
  siChar3         SmallInt
  liBlocksWritten LongInt
endVar
;
; Initialize local variables
;
  liInputSize = stAny.sizeEx()
  stEncoded = blank()
  liBlocksWritten = 0
;
; Calculate number of encoding blocks to process
; 
  liTotalBlocks = liInputSize / 3
;
; Determine if last block is < 3 characters in size
;
  siOddSize = smallInt(liInputSize.mod(3))
;
; Encode each 3 byte block to 4 base64 characters adding
; a CR/LF after each 19 blocks or 76 base64 characters.
;
  for liIndex from 1 to liInputSize - siOddSize step 3
    liBlocksWritten = liBlocksWritten + 1
    stEncoded = stEncoded
                + cmBase64EncodeBlock(
                  ansiCode(stAny.substr(liIndex,1)),
                  ansiCode(stAny.substr(liIndex + 1,1)),
                  ansiCode(stAny.substr(liIndex + 2,1)))
                + iif(liBlocksWritten.mod(19) = 0,stCRLF,"")
  endFor
;
; Check for odd size last block
;
  liBlocksWritten = liBlocksWritten + 1
  switch
    case siOddSize = 0 :
    case siOddSize = 1 :
      stLastBlock = cmBase64EncodeBlock(
                      ansiCode(stAny.substr(liInputSize,1)),
                      0,
                      0)
      stEncoded = stEncoded
                  + stLastBlock.substr(1,2)
                  + cnBase64Null
                  + cnBase64Null
                  + iif(liBlocksWritten.mod(19) = 0,stCRLF,"")
    otherwise :
      stLastBlock = cmBase64EncodeBlock(
                      ansiCode(stAny.substr(liInputSize - 1,1)),
                      ansiCode(stAny.substr(liInputSize,1)),
                      0)
      stEncoded = stEncoded
                    + stLastBlock.substr(1,3)
                    + cnBase64Null
                    + iif(liBlocksWritten.mod(19) = 0,stCRLF,"")
  endSwitch
  return stEncoded
endMethod
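
As a cross-check of the encoding flow (full blocks, a CR/LF after every 19th block, then a padded final block), here is a rough Python counterpart. It is a sketch, not a translation of the OPAL library; the names encode_block, base64_encode and BLOCKS_PER_LINE are hypothetical, and the standard base64 module is used only for comparison.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CRLF = "\r\n"
BLOCKS_PER_LINE = 19  # 19 blocks * 4 characters = 76 characters per line

def encode_block(c1: int, c2: int, c3: int) -> str:
    """Encode three byte values as four base64 characters."""
    indexes = (c1 >> 2, ((c1 & 3) << 4) | (c2 >> 4), ((c2 & 15) << 2) | (c3 >> 6), c3 & 63)
    return "".join(ALPHABET[i] for i in indexes)

def base64_encode(data: bytes) -> str:
    """Encode data, inserting a CR/LF after every 19th block."""
    pieces = []
    blocks_written = 0
    odd_size = len(data) % 3
    # Full 3-byte blocks first.
    for i in range(0, len(data) - odd_size, 3):
        blocks_written += 1
        pieces.append(encode_block(data[i], data[i + 1], data[i + 2]))
        if blocks_written % BLOCKS_PER_LINE == 0:
            pieces.append(CRLF)
    # Short final block: extend with zero bytes, then replace the unused tail with "=".
    if odd_size:
        blocks_written += 1
        tail = data[-odd_size:] + bytes(3 - odd_size)
        pieces.append(encode_block(*tail)[:odd_size + 1] + "=" * (3 - odd_size))
        if blocks_written % BLOCKS_PER_LINE == 0:
            pieces.append(CRLF)
    return "".join(pieces)

sample = bytes(range(100))
encoded = base64_encode(sample)
# Apart from the line breaks, the result should match the standard library.
print(encoded.replace(CRLF, "") == base64.b64encode(sample).decode("ascii"))
```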

method Base64Decode(stEncoded String) String
;
; Decode a Base 64 to String Type
;
var
  stDecoded     String
  liIndex       LongInt
  liLine        LongInt
  siChar1       SmallInt
  siChar2       SmallInt
  siChar3       SmallInt
  arEncodeBlock Array[] String
endVar
;
; Initialize local variables
;
  stDecoded = blank()
;
; Break into substrings based on CR/LF
;
  stEncoded.breakApart(arEncodeBlock,stCRLF)
;
; Loop through substrings and decode
;
  for liLine from 1 to arEncodeBlock.size()
    stEncoded = arEncodeBlock[liLine]
    switch
      case isBlank(stEncoded) = True :
        loop
    endSwitch
    for liIndex from 1 to stEncoded.sizeEx() step 4
      cmBase64DecodeBlock(
        stEncoded.substr(liIndex,4),
        siChar1,
        siChar2,
        siChar3)
      stDecoded = stDecoded
                  + iif(siChar1 <> cnBase64Invalid,chr(siChar1),"")
                  + iif(siChar2 <> cnBase64Invalid,chr(siChar2),"")
                  + iif(siChar3 <> cnBase64Invalid,chr(siChar3),"")
    endFor
  endFor
  return stDecoded
endMethod
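
The overview noted that characters outside the base64 alphabet are ignored when decoding. One way to apply that rule to a whole stream is to filter the input before grouping it into blocks, as in the Python sketch below. This is an alternative approach shown for illustration, not the method used by the OPAL code; the name base64_decode is hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def base64_decode(encoded: str) -> bytes:
    """Decode base64 text, skipping CR/LF, "=" padding and any other stray characters."""
    values = [ALPHABET.index(ch) for ch in encoded if ch in ALPHABET]
    out = bytearray()
    for i in range(0, len(values), 4):
        group = values[i:i + 4]
        # Each output byte needs two neighbouring 6-bit values.
        if len(group) >= 2:
            out.append(group[0] * 4 + group[1] // 16)
        if len(group) >= 3:
            out.append((group[1] & 15) * 16 + group[2] // 4)
        if len(group) == 4:
            out.append((group[2] & 3) * 64 + group[3])
    return bytes(out)

print(base64_decode("VG9u\r\neQ=="))  # b'Tony'
```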

Conclusion

We now have methods that support encoding and decoding using base64. The presented methods and procedures deal only with String Types; a nice project would be to develop supporting methods for other kinds of input, including files. I'll leave this to you, the reader, to explore and have fun with.



 The information provided on this Web site is not in any way sponsored or endorsed by Corel Corporation.
 Paradox is a registered trademark of Corel Corporation.


 Modified: 19 Jul 2003


 Copyright © 2001- 2003 Paradox Community. All rights reserved. 
 Company and product names are trademarks or registered trademarks of their respective companies. 
 Authors hold the copyrights to their own works. Please contact the author of any article for details.