DOC: This method needs documented or nodoc’d
Sanitize a single string.
List of dependencies that are used for development
Duplicates array_attributes
from other_spec
so state isn’t shared.
Checks if this specification meets the requirement of dependency
.
Uninstalls gem spec
Constructs the default Hash
of patterns.
Constructs the default Hash
of Regexp’s.
Constructs the default Hash
of patterns.
Constructs the default Hash
of Regexp’s.
Returns the number of threads waiting on the queue.
Returns the number of threads waiting on the queue.
Returns a conversion path.
p Encoding::Converter.search_convpath("ISO-8859-1", "EUC-JP") #=> [[#<Encoding:ISO-8859-1>, #<Encoding:UTF-8>], # [#<Encoding:UTF-8>, #<Encoding:EUC-JP>]] p Encoding::Converter.search_convpath("ISO-8859-1", "EUC-JP", universal_newline: true) or p Encoding::Converter.search_convpath("ISO-8859-1", "EUC-JP", newline: :universal) #=> [[#<Encoding:ISO-8859-1>, #<Encoding:UTF-8>], # [#<Encoding:UTF-8>, #<Encoding:EUC-JP>], # "universal_newline"] p Encoding::Converter.search_convpath("ISO-8859-1", "UTF-32BE", universal_newline: true) or p Encoding::Converter.search_convpath("ISO-8859-1", "UTF-32BE", newline: :universal) #=> [[#<Encoding:ISO-8859-1>, #<Encoding:UTF-8>], # "universal_newline", # [#<Encoding:UTF-8>, #<Encoding:UTF-32BE>]]
primitive_errinfo
returns important information regarding the last error as a 5-element array:
[result, enc1, enc2, error_bytes, readagain_bytes]
result is the last result of primitive_convert.
Other elements are only meaningful when result is :invalid_byte_sequence, :incomplete_input or :undefined_conversion.
enc1 and enc2 indicate a conversion step as a pair of strings. For example, a converter from EUC-JP to ISO-8859-1 converts a string as follows: EUC-JP -> UTF-8 -> ISO-8859-1. So [enc1, enc2] is either [“EUC-JP”, “UTF-8”] or [“UTF-8”, “ISO-8859-1”].
error_bytes and readagain_bytes indicate the byte sequences which caused the error. error_bytes is discarded portion. readagain_bytes is buffered portion which is read again on next conversion.
Example:
# \xff is invalid as EUC-JP. ec = Encoding::Converter.new("EUC-JP", "Shift_JIS") ec.primitive_convert(src="\xff", dst="", nil, 10) p ec.primitive_errinfo #=> [:invalid_byte_sequence, "EUC-JP", "Shift_JIS", "\xFF", ""] # HIRAGANA LETTER A (\xa4\xa2 in EUC-JP) is not representable in ISO-8859-1. # Since this error is occur in UTF-8 to ISO-8859-1 conversion, # error_bytes is HIRAGANA LETTER A in UTF-8 (\xE3\x81\x82). ec = Encoding::Converter.new("EUC-JP", "ISO-8859-1") ec.primitive_convert(src="\xa4\xa2", dst="", nil, 10) p ec.primitive_errinfo #=> [:undefined_conversion, "UTF-8", "ISO-8859-1", "\xE3\x81\x82", ""] # partial character is invalid ec = Encoding::Converter.new("EUC-JP", "ISO-8859-1") ec.primitive_convert(src="\xa4", dst="", nil, 10) p ec.primitive_errinfo #=> [:incomplete_input, "EUC-JP", "UTF-8", "\xA4", ""] # Encoding::Converter::PARTIAL_INPUT prevents invalid errors by # partial characters. ec = Encoding::Converter.new("EUC-JP", "ISO-8859-1") ec.primitive_convert(src="\xa4", dst="", nil, 10, Encoding::Converter::PARTIAL_INPUT) p ec.primitive_errinfo #=> [:source_buffer_empty, nil, nil, nil, nil] # \xd8\x00\x00@ is invalid as UTF-16BE because # no low surrogate after high surrogate (\xd8\x00). # It is detected by 3rd byte (\00) which is part of next character. # So the high surrogate (\xd8\x00) is discarded and # the 3rd byte is read again later. # Since the byte is buffered in ec, it is dropped from src. ec = Encoding::Converter.new("UTF-16BE", "UTF-8") ec.primitive_convert(src="\xd8\x00\x00@", dst="", nil, 10) p ec.primitive_errinfo #=> [:invalid_byte_sequence, "UTF-16BE", "UTF-8", "\xD8\x00", "\x00"] p src #=> "@" # Similar to UTF-16BE, \x00\xd8@\x00 is invalid as UTF-16LE. # The problem is detected by 4th byte. ec = Encoding::Converter.new("UTF-16LE", "UTF-8") ec.primitive_convert(src="\x00\xd8@\x00", dst="", nil, 10) p ec.primitive_errinfo #=> [:invalid_byte_sequence, "UTF-16LE", "UTF-8", "\x00\xD8", "@\x00"] p src #=> ""
Synonym for CGI.unescapeHTML(str)