Package oracle.sql

Class CharacterSet

java.lang.Object
oracle.sql.CharacterSet
Direct Known Subclasses:
CharacterSetWithConverter

public abstract class CharacterSet extends Object
This class encapsulates methods and attributes of the character sets defined by Oracle. It also defines a set of character set IDs whose character conversions are supported by Oracle JDBC.

Most methods convert between character representations.

There are no public constructors. To create a CharacterSet, use oracle.sql.CharacterSetFactory. There is no notion of an "unsupported" character set: a CharacterSet can be created with any oracleId. However, there is a notion of unsupported conversions, and the current implementation is limited to the small number of character sets for which constants are defined in the class.

Some operations come in two variants (e.g. convert vs. convertUnshared); the plain version is the fast (but possibly unsafe) one.

The descriptions of methods in this class use the phrase "bytes in oracleId representation". What this means is that the bytes can be interpreted as a sequence of characters in the character set defined by oracleId. Both what characters are available and how they are represented as sequences of bytes is determined by oracleId.
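As a rough illustration of "bytes in oracleId representation", the sketch below uses the JDK's java.nio.charset classes as stand-ins for Oracle character sets. That substitution is an assumption made for the example; nothing here calls oracle.sql, and RepresentationDemo and hex are hypothetical names. It shows that the same character yields different byte sequences depending on the character set.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: JDK charsets stand in for Oracle character sets.
class RepresentationDemo {
    // Encode text under the given charset and render the bytes as hex.
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(hex(text, StandardCharsets.ISO_8859_1)); // E9
        System.out.println(hex(text, StandardCharsets.UTF_8));      // C3A9
        System.out.println(hex(text, StandardCharsets.UTF_16BE));   // 00E9
    }
}
```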

  • Field Details

  • Method Details

    • make

      public static CharacterSet make(int oracleId)
      Factory. A factory is used rather than a constructor because CharacterSet is abstract.
      Parameters:
      oracleId - the number of the Oracle character set. A list of official Oracle character sets is maintained by ...
      Returns:
      CharacterSet for oracleId.
    • toString

      public String toString()
      The official name of the character set.
      Overrides:
      toString in class Object
      Returns:
      the name of the character set
    • isLossyFrom

      public abstract boolean isLossyFrom(CharacterSet from)
      A conversion loses information if the mapping is not invertible. (A mathematician would say that the map of characters in from to this is not injective.)
      Parameters:
      from - a CharacterSet being tested for compatibility with this CharacterSet.
      Returns:
      true if characters in the from character set can be mapped uniquely to characters in oracleId representation.
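The notion of a lossy conversion can be sketched with a round trip. This is a minimal illustration that uses JDK charsets as stand-ins for Oracle character sets (an assumption); roundTrips is a hypothetical helper, not part of this class. When encoding collapses an unmappable character into a replacement, decoding cannot restore the original text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: JDK charsets stand in for Oracle character sets.
class LossyDemo {
    // True when encoding to 'target' and decoding back restores the text.
    static boolean roundTrips(String text, Charset target) {
        // String.getBytes substitutes the charset's replacement for unmappable chars.
        byte[] encoded = text.getBytes(target);
        return new String(encoded, target).equals(text);
    }

    public static void main(String[] args) {
        String text = "caf\u00E9";
        System.out.println(roundTrips(text, StandardCharsets.UTF_8));    // true
        System.out.println(roundTrips(text, StandardCharsets.US_ASCII)); // false
    }
}
```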
    • isConvertibleFrom

      public abstract boolean isConvertibleFrom(CharacterSet source)
      Tests whether conversion from the source character set is supported.
      Parameters:
      source - a CharacterSet to inquire about
      Returns:
      true if conversion from source to oracleId is supported. If it isn't supported, attempts to convert will always throw exceptions.
    • isUnicode

      public boolean isUnicode()
      Tests whether this is a Unicode character set.
      Returns:
      true if this CharacterSet is an encoding of Unicode
    • getOracleId

      public int getOracleId()
      The integer that identifies the character set.
      Returns:
      Oracle character set ID
    • equals

      public boolean equals(Object rhs)
      Two CharacterSets are equal when their oracleIds are equal.
      Overrides:
      equals in class Object
      Parameters:
      rhs - the target character set
      Returns:
      true if the given CharacterSet object is equal to this object
    • hashCode

      public int hashCode()
      Implements a hash based on oracleId
      Overrides:
      hashCode in class Object
      Returns:
      a hash code
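The equals and hashCode contract described above (identity determined solely by an integer ID) can be sketched as follows. IdKeyed is a hypothetical stand-in class, not the real oracle.sql.CharacterSet, and the IDs used are arbitrary example values.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in: equality and hashing depend only on an int id.
class IdKeyed {
    private final int id;

    IdKeyed(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object rhs) {
        return rhs instanceof IdKeyed && ((IdKeyed) rhs).id == id;
    }

    @Override
    public int hashCode() {
        return id; // equal ids always give equal hash codes
    }

    public static void main(String[] args) {
        Set<IdKeyed> seen = new HashSet<>();
        seen.add(new IdKeyed(100));
        System.out.println(seen.contains(new IdKeyed(100))); // true
        System.out.println(seen.contains(new IdKeyed(200))); // false
    }
}
```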
    • toStringWithReplacement

      public abstract String toStringWithReplacement(byte[] bytes, int offset, int count)
      Convert bytes in oracleId representation to a String. If a character has no Unicode representation the effect is unspecified. The conversion might omit it, or replace it with a special character. The preferred result is replacement by a single character, but it is not guaranteed. If the conversion isn't supported at all, the result may be a fixed string.
      Parameters:
      bytes - an array containing characters represented in this character set.
      offset - the index of the first byte of the characters
      count - the number of bytes to be converted.
      Returns:
      the String resulting from converting to UCS-2.
    • toString

      public String toString(byte[] bytes, int offset, int count) throws SQLException
      Convert bytes in oracleId representation to a String. The difference between toStringInvertible and plain toString is that toStringInvertible will throw an exception when toString would make some replacement.
      Parameters:
      bytes - an array containing characters represented in this character set.
      offset - the index of the first byte of the characters
      count - the number of bytes to be converted.
      Returns:
      the String resulting from converting to UCS-2.
      Throws:
      SQLException - when conversion is not supported.
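The contrast between a replacing conversion (as in toStringWithReplacement) and one that fails instead of substituting can be sketched with the JDK's CharsetDecoder. This is an analogy under the assumption that UTF-8 stands in for the character set; it does not show how this class is implemented, and DecodeModesDemo is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustration only: a JDK decoder in "replace" and "report" modes.
class DecodeModesDemo {
    // Decode UTF-8 bytes; 'replace' selects substitution instead of failure.
    static String decode(byte[] bytes, int offset, int count, boolean replace)
            throws CharacterCodingException {
        CodingErrorAction action =
                replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .decode(ByteBuffer.wrap(bytes, offset, count))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bad = { 'o', 'k', (byte) 0xFF }; // 0xFF never occurs in valid UTF-8
        String replaced = decode(bad, 0, bad.length, true);
        System.out.println(replaced.length()); // 3: "ok" plus one replacement char
        try {
            decode(bad, 0, bad.length, false);
        } catch (CharacterCodingException e) {
            System.out.println("strict decode failed");
        }
    }
}
```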
    • convert

      public abstract byte[] convert(String s) throws SQLException
      Convert a String to bytes in oracleId representation.
      Returns:
      an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in String.
      Throws:
      SQLException - when the oracleId does not support conversion from Unicode.
      SQLException - when s contains a character that cannot be converted.
    • convertWithReplacement

      public abstract byte[] convertWithReplacement(String s)
      Convert a String to bytes in oracleId representation. A result is always produced, even when the conversion isn't supported or the String contains characters that have no representation in oracleId. The usual conversion replaces characters that have no representation with some fixed character, but that is not guaranteed.
      Returns:
      an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in String.
    • convertWithReplacement

      public byte[] convertWithReplacement(char[] chars, int charOffset, byte[] bytes, int byteOffset, int[] nchars)
      Similar to convertWithReplacement(String s). Instead of a String, the chars in a char[] starting at charOffset are converted; the number of chars to convert is passed in nchars[0].
      Returns:
      an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in the char[]. On return, nchars[0] holds the length in bytes.
    • convert

      public abstract byte[] convert(CharacterSet from, byte[] source, int offset, int count) throws SQLException
      Converts bytes in some representation to oracleId representation. Note that the input is not guaranteed to be different from the output. If a copy is always wanted then use convertUnshared.
      Parameters:
      from - the character set of the input bytes
      source - an array of bytes containing the bytes to be converted
      offset - the index of the first byte to be converted
      count - the number of bytes to be converted
      Returns:
      a byte array in the Oracle character set
      Throws:
      SQLException - if the conversion is not supported
      SQLException - if some character cannot be converted. This exception is not guaranteed to be thrown. For some conversions a replacement character may be used instead.
    • convertUnshared

      public byte[] convertUnshared(CharacterSet from, byte[] source, int offset, int count) throws SQLException
      Converts bytes in some representation to oracleId representation. This is identical to convert except that it always returns a copy of its input.
      Parameters:
      from - the character set of the input bytes
      source - an array of bytes containing the bytes to be converted
      offset - the index of the first byte to be converted
      count - the number of bytes to be converted
      Returns:
      an array containing the characters of source in oracleId representation.
      Throws:
      SQLException - if the conversion is not supported.
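Why a shared result can be unsafe is sketched below. SharingDemo is a hypothetical converter built on JDK charsets; when the real convert returns its input array is not specified here beyond "not guaranteed to be different", so the fast-path condition is an assumption made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical converter; it only illustrates the sharing hazard.
class SharingDemo {
    // Fast variant: may hand back the caller's own array when no work is needed.
    static byte[] convert(Charset from, Charset to, byte[] source, int offset, int count) {
        if (from.equals(to) && offset == 0 && count == source.length) {
            return source; // result is shared with the caller
        }
        return new String(source, offset, count, from).getBytes(to);
    }

    // Safe variant: the result never aliases the input.
    static byte[] convertUnshared(Charset from, Charset to, byte[] source, int offset, int count) {
        byte[] result = convert(from, to, source, offset, count);
        return result == source ? Arrays.copyOf(result, result.length) : result;
    }

    public static void main(String[] args) {
        Charset ascii = StandardCharsets.US_ASCII;
        byte[] in = "abc".getBytes(ascii);
        byte[] shared = convert(ascii, ascii, in, 0, in.length);
        byte[] copy = convertUnshared(ascii, ascii, in, 0, in.length);
        in[0] = 'X'; // mutate the input after converting
        System.out.println((char) shared[0]); // X: the shared result changed too
        System.out.println((char) copy[0]);   // a: the copy is unaffected
    }
}
```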
    • UTFToString

      public static final String UTFToString(byte[] bytes, int offset, int nbytes, boolean useReplacementChar) throws SQLException
      Convert a sequence of bytes in UTF8 to a String. This function allocates the chars array.
      Parameters:
      bytes - containing the UTF8 string
      nbytes - the number of bytes
      useReplacementChar - if true invalid characters are replaced by replacement characters.
      Returns:
      the converted String
      Throws:
      SQLException
    • UTFToString

      public static final String UTFToString(byte[] bytes, int offset, int nbytes) throws SQLException
      Convert a sequence of bytes in UTF8 to a String. This function allocates the chars array.
      Parameters:
      bytes - containing the UTF8 string
      nbytes - the number of bytes
      Returns:
      the converted String
      Throws:
      SQLException
    • UTFToJavaChar

      public static final char[] UTFToJavaChar(byte[] bytes, int offset, int count) throws SQLException
      Convert a sequence of bytes in UTF8 to an array of chars. This differs from official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8; that is, in the input a null may appear encoded as itself.
      Parameters:
      bytes - the array holding the UTF8 bytes
      offset - the index of the first byte
      count - the number of bytes in the UTF8 sequence.
      Returns:
      an array of char's equivalent to the UTF8 sequence.
      Throws:
      SQLException - if any error occurs
    • UTFToJavaChar

      public static final char[] UTFToJavaChar(byte[] bytes, int offset, int count, boolean useReplacementChar) throws SQLException
      Convert a sequence of bytes in UTF8 to an array of chars. This differs from official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8; that is, in the input a null may appear encoded as itself.
      Parameters:
      bytes - the array holding the UTF8 bytes
      offset - the index of the first byte
      count - the number of bytes in the UTF8 sequence.
      useReplacementChar - if true invalid characters are replaced by replacement characters.
      Returns:
      an array of char's equivalent to the UTF8 sequence.
      Throws:
      SQLException - if any error occurs
    • UTFToJavaCharWithReplacement

      public static final char[] UTFToJavaCharWithReplacement(byte[] bytes, int offset, int count)
      Convert a sequence of bytes in UTF8 to an array of chars. This differs from official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8; that is, in the input a null may appear encoded as itself.
      Parameters:
      bytes - the array holding the UTF8 bytes
      offset - the index of the first byte
      count - the number of bytes in the UTF8 sequence.
      Returns:
      an array of char's equivalent to the UTF8 sequence.
      Throws:
      IllegalStateException - if any error occurs
    • convertUTFBytesToJavaChars

      public static final int convertUTFBytesToJavaChars(byte[] bytes, int offset, char[] chars, int chars_offset, int[] countArr, boolean convertWithReplacement) throws SQLException
      Convert a sequence of bytes in UTF8 to an array of chars. This differs from official UTF-8 in that the maximum length of a character is 3 bytes, so a surrogate pair is represented as 6 bytes (2 times 3 bytes). To support surrogate pairs as 4 bytes, use AL32UTF8. Note that this method is "true" UTF8; that is, in the input a null may appear encoded as itself.
      Parameters:
      bytes - the array holding the UTF8 bytes
      offset - the index of the first byte
      chars - the array that receives the UTF-16 chars
      chars_offset - the index of the first char that will be written
      countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
      convertWithReplacement - set to true to use replacement character for illegal sequences
      Returns:
      the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 3) or because the char[] was too short (can be more than 3).
      Throws:
      SQLException - if invalid, illegal UTF data is given
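The IN/OUT role of countArr can be sketched with a simplified decoder for sequences of at most 3 bytes. TrailingBytesDemo is a hypothetical illustration, not the driver's implementation: it performs almost no validation, has no replacement mode, and throws IllegalArgumentException where the real method is documented to throw SQLException.

```java
// Hypothetical sketch of the countArr contract for 1- to 3-byte sequences.
class TrailingBytesDemo {
    // On entry countArr[0] is the number of bytes to convert; on return it is
    // the number of bytes left unconverted. Returns the number of chars written.
    static int decode(byte[] bytes, int offset, char[] chars, int charsOffset, int[] countArr) {
        int end = offset + countArr[0];
        int in = offset;
        int out = charsOffset;
        while (in < end && out < chars.length) { // stop when the char[] is full
            int b = bytes[in] & 0xFF;
            int len = b < 0x80 ? 1 : (b & 0xE0) == 0xC0 ? 2 : (b & 0xF0) == 0xE0 ? 3 : -1;
            if (len < 0) {
                throw new IllegalArgumentException("illegal lead byte at index " + in);
            }
            if (in + len > end) {
                break; // the rest of this sequence is missing
            }
            int c;
            if (len == 1) {
                c = b;
            } else if (len == 2) {
                c = ((b & 0x1F) << 6) | (bytes[in + 1] & 0x3F);
            } else {
                c = ((b & 0x0F) << 12) | ((bytes[in + 1] & 0x3F) << 6) | (bytes[in + 2] & 0x3F);
            }
            chars[out++] = (char) c;
            in += len;
        }
        countArr[0] = end - in;
        return out - charsOffset;
    }

    public static void main(String[] args) {
        byte[] data = { 0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC }; // 'a' then U+20AC
        char[] chars = new char[4];
        int[] countArr = { 3 }; // deliberately cut off the last byte
        int written = decode(data, 0, chars, 0, countArr);
        System.out.println(written + " char written, " + countArr[0] + " bytes left over");
    }
}
```

A caller would typically carry the leftover bytes into the next call once more input has arrived.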
    • convertUTFBytesToJavaChars

      public static final int convertUTFBytesToJavaChars(byte[] bytes, int offset, char[] chars, int chars_offset, int[] countArr, boolean convertWithReplacement, int charSize) throws SQLException
      Convert a sequence of bytes in UTF8 to an array of chars. This differs from official UTF-8 in that the maximum length of a character is 3 bytes, so a surrogate pair is represented as 6 bytes (2 times 3 bytes). To support surrogate pairs as 4 bytes, use AL32UTF8. Note that this method is "true" UTF8; that is, in the input a null may appear encoded as itself. Same as convertUTFBytesToJavaChars(byte[],int,char[],int,int[],boolean) with an additional argument charSize, which is the number of chars available in the char array. Note that if chars_offset+charSize>chars.length, an ArrayIndexOutOfBoundsException will be thrown. This method has been optimized for speed: 1) the switch-case has been replaced with a series of logical shifts and zero comparisons; 2) an internal loop has been added that attempts to let array bounds checks be optimized away.
      Parameters:
      bytes - the array holding the UTF8 bytes
      offset - the index of the first byte
      chars - the array that receives the UTF-16 chars
      chars_offset - the index of the first char that will be written
      countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
      convertWithReplacement - set to true to use replacement character for illegal sequences
      Returns:
      the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 3) or because the char[] was too short (can be more than 3).
      Throws:
      SQLException - if invalid, illegal UTF data is given
    • stringToUTF

      public static final byte[] stringToUTF(String str)
      Convert str to a byte array in UTF8 representation.
    • convertJavaCharsToUTFBytes

      public static int convertJavaCharsToUTFBytes(char[] chars, int chars_offset, byte[] bytes, int bytes_begin, int chars_count)
      Convert chars to the UTF8 representation. No validation is performed. An invocation of this method is equivalent to calling convertJavaCharsToUTFBytes(char[], int, byte[], int, int, int[]) with a null codePointCount argument.
      Parameters:
      chars - a source string in an array of chars
      chars_offset - an offset to start copying in the source string
      chars_count - a length to copy from the source string
      bytes - a destination byte array
      bytes_begin - an offset to start copying in the destination byte array
      Returns:
      the length of the copy operation performed
    • convertJavaCharsToUTFBytes

      public static int convertJavaCharsToUTFBytes(char[] chars, int chars_offset, byte[] bytes, int bytes_begin, int chars_count, int[] codePointCount)
      Convert chars to the UTF8 representation. No validation is performed.
      Parameters:
      chars - a source string in an array of chars
      chars_offset - an offset to start copying in the source string
      chars_count - a length to copy from the source string
      bytes - a destination byte array
      bytes_begin - an offset to start copying in the destination byte array
      codePointCount - An array into which this method stores the number of code points it has converted. If this array is null or length 0, it is ignored. Unpaired surrogates are counted as one code point each.
      Returns:
      the length of the copy operation performed
    • stringUTFLength

      public static final int stringUTFLength(String s)
      Returns the number of bytes in the UTF8 representation of a String
      Parameters:
      s - a Java string
      Returns:
      The number of bytes in the UTF8 encoding
    • AL32UTF8ToString

      public static final String AL32UTF8ToString(byte[] bytes, int offset, int nbytes)
      Convert a sequence of bytes in AL32UTF8 format to a String. This function allocates the memory for the returned String.
      Parameters:
      bytes - containing the AL32UTF8 string
      offset - an offset to start conversion
      nbytes - the number of bytes
      Returns:
      the converted String
    • AL32UTF8ToString

      public static final String AL32UTF8ToString(byte[] bytes, int offset, int nbytes, boolean useReplacementCharacter)
    • AL32UTF8ToJavaChar

      public static final char[] AL32UTF8ToJavaChar(byte[] bytes, int offset, int count, boolean useReplacementCharacter) throws SQLException
      Converts an AL32UTF8 byte array to an array of char. This function will allocate a char array for holding the returning result.
      Parameters:
      bytes - an AL32UTF8 byte array
      offset - an offset to start conversion
      count - number of bytes to be converted.
      Returns:
      an array of char data
      Throws:
      SQLException
    • convertAL32UTF8BytesToJavaChars

      public static final int convertAL32UTF8BytesToJavaChars(byte[] bytes, int offsetBytes, char[] chars, int offsetChars, int[] countArr, boolean convertWithReplacement) throws SQLException
      Convert a sequence of bytes in AL32UTF8 to an array of char's. A char in AL32UTF8 can be represented with up to 4 bytes. The only difference between UTF8 (Oracle's) and AL32UTF8 is the representation of a surrogate pair. In AL32UTF8 a surrogate pair is represented with 4 bytes instead of 6 bytes in UTF8.
      Parameters:
      bytes - the array holding the AL32UTF8 bytes
      offsetBytes - the index of the first byte
      chars - the array that receives the UTF-16 chars
      offsetChars - the index of the first char that will be written
      countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
      convertWithReplacement - set to true to use replacement character for illegal sequences
      Returns:
      the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 4) or because the char[] was too short (can be more than 4).
      Throws:
      SQLException - if invalid, illegal UTF data is given
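The 6-byte versus 4-byte difference for a surrogate pair can be reproduced with a small encoder. SurrogateEncodingDemo and encodePerCodeUnit are hypothetical names; the per-code-unit encoder follows the description of Oracle's UTF8 given above, and the JDK's standard UTF-8 encoder stands in for AL32UTF8 (an assumption made for illustration).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustration only: per-code-unit encoding versus standard UTF-8.
class SurrogateEncodingDemo {
    // Encodes each UTF-16 code unit on its own, using at most 3 bytes per unit,
    // so a surrogate pair takes 2 x 3 = 6 bytes. No validation is performed.
    static byte[] encodePerCodeUnit(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x1F600)); // one supplementary code point
        System.out.println(encodePerCodeUnit(s).length);               // 6
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}
```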
    • convertAL32UTF8BytesToJavaChars

      public static final int convertAL32UTF8BytesToJavaChars(byte[] bytes, int offsetBytes, char[] chars, int offsetChars, int[] countArr, boolean convertWithReplacement, int charSize) throws SQLException
      Same as convertAL32UTF8BytesToJavaChars(byte[],int,char[],int,int[],boolean) with an additional argument charSize, which is the number of chars available in the char array. Note that if offsetChars+charSize>chars.length, an ArrayIndexOutOfBoundsException will be thrown. This method has been optimized for speed: 1) the switch-case has been replaced with a series of logical shifts and zero comparisons; 2) an internal loop has been added that attempts to let array bounds checks be optimized away.
      Throws:
      SQLException
    • stringToAL32UTF8

      public static final byte[] stringToAL32UTF8(String str)
    • convertJavaCharsToAL32UTF8Bytes

      public static int convertJavaCharsToAL32UTF8Bytes(char[] chars, int chars_offset, byte[] bytes, int bytes_begin, int chars_count)
      Convert chars to the UTF-8 representation. No validation is performed except for surrogate pairs. An invocation of this method is equivalent to calling convertJavaCharsToAL32UTF8Bytes(char[], int, byte[], int, int, int[]) with a null codePointCount argument.
      Parameters:
      chars - a source string in an array of chars
      chars_offset - an offset to start copying in the source string
      bytes - a destination byte array
      bytes_begin - an offset to start copying in the destination byte array
      chars_count - a length to copy from the source string
      Returns:
      the length of the copy operation performed
    • convertJavaCharsToAL32UTF8Bytes

      public static int convertJavaCharsToAL32UTF8Bytes(char[] chars, int chars_offset, byte[] bytes, int bytes_begin, int chars_count, int[] codePointCount)
      Convert chars to the UTF-8 representation. No validation is performed except for surrogate pairs.
      Parameters:
      chars - a source string in an array of chars
      chars_offset - an offset to start copying in the source string
      bytes - a destination byte array
      bytes_begin - an offset to start copying in the destination byte array
      chars_count - a length to copy from the source string
      codePointCount - An array into which this method stores the number of code points it has converted. If this array is null or length 0, it is ignored. Unpaired surrogates are counted as one code point each.
      Returns:
      the length of the copy operation performed
    • string32UTF8Length

      public static final int string32UTF8Length(String s)
      Returns the number of bytes in the UTF-8 representation of a String

      This method checks for neither invalid nor illegal UTF sequences.

      Parameters:
      s - a UTF-16 string to count the number of bytes in UTF8 format
      Returns:
      The number of bytes in the UTF-8 representation of a string
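Counting the encoded length without allocating a byte array can be sketched as below. Utf8LengthDemo is a hypothetical helper that follows standard UTF-8 rules; whether the real method treats unpaired surrogates the same way is not stated here, so that branch is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: byte length of standard UTF-8 computed char by char.
class Utf8LengthDemo {
    // A valid surrogate pair counts as 4 bytes; any other char as 1 to 3 bytes.
    static int utf8Length(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                n += 1;
            } else if (c < 0x800) {
                n += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                n += 4;
                i++; // skip the low surrogate of the pair
            } else {
                n += 3; // assumption: an unpaired surrogate is counted as 3 bytes
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "a\u00E9\u20AC" + new String(Character.toChars(0x1F600));
        System.out.println(utf8Length(s));                             // 10
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 10
    }
}
```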
    • AL16UTF16BytesToString

      public static final String AL16UTF16BytesToString(byte[] bytes, int nbytes)
      Convert a sequence of bytes in AL16UTF16 to a String. This function allocates a chars array.
      Parameters:
      bytes - containing the AL16UTF16 string
      nbytes - the number of bytes
      Returns:
      a newly generated String
    • AL16UTF16BytesToJavaChars

      public static final int AL16UTF16BytesToJavaChars(byte[] bytes, int nbytes, char[] chars)
      Convert a sequence of bytes in AL16UTF16 to an array of chars. The caller needs to allocate the chars array.
      Parameters:
      bytes - containing the AL16UTF16 string
      nbytes - the number of bytes
      chars - char array in which the UCS2 string will be returned
      Returns:
      the number of chars written to the chars array
    • convertAL16UTF16BytesToJavaChars

      public static final int convertAL16UTF16BytesToJavaChars(byte[] bytes, int offset, char[] chars, int chars_offset, int count, boolean convertWithReplacement) throws SQLException
      Converts a sequence of bytes in AL16UTF16 to an array of char's.
      Parameters:
      bytes - the array holding the AL16UTF16 bytes
      offset - the index of the first byte
      chars - the array that receives the UTF-16 chars
      chars_offset - the index of the first char
      count - the number of bytes in the AL16UTF16 sequence.
      convertWithReplacement - set to true to use replacement character for illegal sequences
      Returns:
      the number of chars written
      Throws:
      SQLException - if invalid, illegal UTF data is given
    • convertAL16UTF16LEBytesToJavaChars

      public static final int convertAL16UTF16LEBytesToJavaChars(byte[] bytes, int offset, char[] chars, int chars_offset, int count, boolean convertWithReplacement) throws SQLException
      Converts a sequence of bytes in AL16UTF16LE to an array of char's.
      Parameters:
      bytes - the array holding the AL16UTF16LE bytes
      offset - the index of the first byte
      chars - the array that receives the converted UTF-16 chars
      chars_offset - the index of the first char
      count - the number of bytes in the AL16UTF16LE sequence.
      convertWithReplacement - set to true to use replacement character for illegal sequences
      Returns:
      the number of chars written
      Throws:
      SQLException - if invalid, illegal UTF data is given
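The difference between the two byte orders can be sketched as follows, assuming that AL16UTF16 is big-endian UTF-16 and AL16UTF16LE is its little-endian counterpart. Utf16ByteOrderDemo is a hypothetical helper that performs no surrogate validation and has no replacement mode.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: combining byte pairs into chars in either byte order.
class Utf16ByteOrderDemo {
    // 'littleEndian' selects which byte of each pair is the low half.
    // Returns the number of chars written.
    static int bytesToChars(byte[] bytes, int offset, char[] chars, int charsOffset,
                            int count, boolean littleEndian) {
        int pairs = count / 2; // a trailing odd byte is ignored in this sketch
        for (int i = 0; i < pairs; i++) {
            int first = bytes[offset + 2 * i] & 0xFF;
            int second = bytes[offset + 2 * i + 1] & 0xFF;
            int unit = littleEndian ? (second << 8) | first : (first << 8) | second;
            chars[charsOffset + i] = (char) unit;
        }
        return pairs;
    }

    public static void main(String[] args) {
        byte[] be = "A\u20AC".getBytes(StandardCharsets.UTF_16BE);
        byte[] le = "A\u20AC".getBytes(StandardCharsets.UTF_16LE);
        char[] fromBe = new char[2];
        char[] fromLe = new char[2];
        bytesToChars(be, 0, fromBe, 0, be.length, false);
        bytesToChars(le, 0, fromLe, 0, le.length, true);
        System.out.println(new String(fromBe).equals(new String(fromLe))); // true
    }
}
```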
    • stringToAL16UTF16Bytes

      public static final byte[] stringToAL16UTF16Bytes(String str)
      Convert a String to an array of bytes. This function allocates the bytes array.
      Parameters:
      str - containing the UCS2 string
      Returns:
      the AL16UTF16 byte array
    • javaCharsToAL16UTF16Bytes

      public static final int javaCharsToAL16UTF16Bytes(char[] chars, int nchars, byte[] bytes)
      Convert a sequence of chars in UCS2 to an array of bytes. The caller needs to allocate the bytes array.
      Parameters:
      chars - containing the UCS2 string
      nchars - the number of chars
      bytes - byte array in which the AL16UTF16 string will be returned
      Returns:
      the number of bytes written to the bytes array
    • convertJavaCharsToAL16UTF16Bytes

      public static final int convertJavaCharsToAL16UTF16Bytes(char[] chars, int chars_offset, byte[] bytes, int bytes_offset, int nchars)
    • stringToAL16UTF16LEBytes

      public static final byte[] stringToAL16UTF16LEBytes(String str)
      Convert a String to an array of bytes. This function allocates the bytes array.
      Parameters:
      str - containing the UCS2 string
      Returns:
      the AL16UTF16LE byte array
    • javaCharsToAL16UTF16LEBytes

      public static final int javaCharsToAL16UTF16LEBytes(char[] chars, int nchars, byte[] bytes)
      Convert a sequence of chars in UCS2 to an array of bytes. The caller needs to allocate the bytes array.
      Parameters:
      chars - containing the UCS2 string
      nchars - the number of chars
      bytes - byte array in which the AL16UTF16LE string will be returned
      Returns:
      the number of bytes written to the bytes array
    • convertJavaCharsToAL16UTF16LEBytes

      public static final int convertJavaCharsToAL16UTF16LEBytes(char[] chars, int chars_offset, byte[] bytes, int bytes_offset, int nchars)
    • convertASCIIBytesToJavaChars

      public static final int convertASCIIBytesToJavaChars(byte[] bytes, int bytes_offset, char[] chars, int chars_offset, int count) throws SQLException
      Convert a byte array in ASCII to a Java char array. The caller needs to allocate the buffer of chars.
      Parameters:
      bytes - input bytes
      bytes_offset - the starting position to convert
      chars - output Java char array (buffer is allocated by the caller)
      chars_offset - starting position to store the Java char array
      count - number of characters in chars
      Returns:
      the number of Java characters written into chars[]
      Throws:
      SQLException - if errors occurred
    • convertJavaCharsToASCIIBytes

      public static final int convertJavaCharsToASCIIBytes(char[] chars, int chars_offset, byte[] bytes, int bytes_offset, int nchars) throws SQLException
      Convert a Java char array to a byte array in ASCII. The caller needs to allocate the buffer of bytes.
      Parameters:
      chars - input Java char array
      chars_offset - the starting position in chars to convert from
      bytes - output byte array that receives the ASCII bytes
      bytes_offset - the starting position in bytes at which to store the result
      Returns:
      the number of chars converted
      Throws:
      SQLException - if errors occurred
    • convertJavaCharsToASCIIBytes

      public static final int convertJavaCharsToASCIIBytes(char[] chars, int chars_offset, byte[] bytes, int bytes_offset, int nchars, boolean strictConversion) throws SQLException
      Throws:
      SQLException
    • convertJavaCharsToISOLATIN1Bytes

      public static final int convertJavaCharsToISOLATIN1Bytes(char[] chars, int chars_offset, byte[] bytes, int bytes_offset, int nchars) throws SQLException
      Throws:
      SQLException
    • stringToASCII

      public static final byte[] stringToASCII(String str)
      Convert a String to a byte array in ASCII. This method allocates the byte array.
      Parameters:
      str - input the String to be converted
      Returns:
      the byte array in ascii
    • convertUTF32toUTF16

      public static final long convertUTF32toUTF16(long ucs4ch)
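This method carries no description here. As background only, the standard arithmetic that maps a supplementary UTF-32 code point to a UTF-16 surrogate pair is sketched below. Utf32ToUtf16Demo is a hypothetical helper; how the real method packs its result into the returned long is not documented in this page and is not assumed.

```java
// Illustration only: standard UTF-32 to UTF-16 surrogate arithmetic.
class Utf32ToUtf16Demo {
    // Splits a supplementary code point (U+10000..U+10FFFF) into a surrogate pair.
    static char[] toSurrogates(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int v = codePoint - 0x10000;               // 20 significant bits remain
        char high = (char) (0xD800 + (v >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (v & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x1F600);
        System.out.println(Integer.toHexString(pair[0])); // d83d
        System.out.println(Integer.toHexString(pair[1])); // de00
    }
}
```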
    • encodedByteLength

      public int encodedByteLength(String s) throws SQLException
      Return the length of the byte array that would result if the String were encoded in this character set.
      Parameters:
      s - is a Java String
      Returns:
      the length of the encoded bytes
      Throws:
      SQLException
    • encodedByteLength

      public int encodedByteLength(char[] carray)
      Return the length of the byte array that would result if the char array were encoded in this character set.
      Parameters:
      carray - is a char array
      Returns:
      the length of the encoded bytes
    • getConnectionDuringExceptionHandling

      protected oracle.jdbc.internal.OracleConnection getConnectionDuringExceptionHandling()
    • isUnknown

      public boolean isUnknown()