The first form returns a copy of str transcoded to encoding encoding. The second form returns a copy of str transcoded from src_encoding to dst_encoding. The last form returns a copy of str transcoded to Encoding.default_internal.
By default, the first and second forms raise Encoding::UndefinedConversionError for characters that are undefined in the destination encoding, and Encoding::InvalidByteSequenceError for invalid byte sequences in the source encoding. The last form by default does not raise exceptions but uses replacement strings.
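As a rough illustration of the default error behaviour, the sketch below rescues both exceptions. The sample strings are made up for this example and are not part of the original text.

```ruby
# U+2603 (snowman) has no US-ASCII equivalent, so the first form raises.
undefined_error =
  begin
    "\u2603".encode("US-ASCII")
    nil
  rescue Encoding::UndefinedConversionError => e
    e
  end

# The byte 0xFF is never valid in UTF-8, so transcoding from UTF-8 raises.
broken = "abc\xFF".dup.force_encoding("UTF-8")
invalid_error =
  begin
    broken.encode("UTF-16LE")
    nil
  rescue Encoding::InvalidByteSequenceError => e
    e
  end

puts undefined_error.class
puts invalid_error.class
```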
The options Hash gives details for conversion and can have the following keys:

:invalid
    If the value is :replace, encode replaces invalid byte sequences in str with the replacement character. The default is to raise the Encoding::InvalidByteSequenceError exception.
:undef
    If the value is :replace, encode replaces characters which are undefined in the destination encoding with the replacement character. The default is to raise the Encoding::UndefinedConversionError exception.
:replace
    Sets the replacement string to the given value. The default replacement string is "\uFFFD" for Unicode encoding forms, and "?" otherwise.
:fallback
    Sets the replacement string by the given object for an undefined character. The object should be a Hash, a Proc, a Method, or an object which has a [] method. Its key is an undefined character encoded in the source encoding of the current transcoder. Its value can be in any encoding, as long as it can be converted into the destination encoding of the transcoder.
:xml
    The value must be :text or :attr. If the value is :text, encode replaces undefined characters with their (upper-case hexadecimal) numeric character references. '&', '<', and '>' are converted to "&amp;", "&lt;", and "&gt;", respectively. If the value is :attr, encode also quotes the replacement result (using '"'), and replaces '"' with "&quot;".
:cr_newline
    Replaces LF ("\n") with CR ("\r") if value is true.
:crlf_newline
    Replaces LF ("\n") with CRLF ("\r\n") if value is true.
:universal_newline
    Replaces CRLF ("\r\n") and CR ("\r") with LF ("\n") if value is true.
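The options described above can be exercised with a short sketch like the following. The input strings and variable names are invented for illustration; only the option keys come from the description.

```ruby
text = "caf\u00E9 \u2603" # sample input, not from the original text

# :undef with the default replacement ("?" for a non-Unicode destination)
replaced = text.encode("US-ASCII", undef: :replace)

# :replace overrides the replacement string
starred = text.encode("US-ASCII", undef: :replace, replace: "*")

# :fallback supplies per-character substitutes (here via a Hash)
fallback = text.encode("US-ASCII",
                       fallback: { "\u00E9" => "e", "\u2603" => "[snowman]" })

# :invalid handles byte sequences that are malformed in the source encoding
broken = "abc\xFF".dup.force_encoding("UTF-8")
cleaned = broken.encode("US-ASCII", invalid: :replace)

# :xml escapes markup characters and emits numeric character references
xml_text = "a < \u00E9".encode("US-ASCII", xml: :text)
xml_attr = "say \"hi\"".encode("US-ASCII", xml: :attr)

# :universal_newline normalizes CRLF and CR to LF (last form, no encoding given)
unix = "a\r\nb\rc".encode(universal_newline: true)

p replaced, starred, fallback, cleaned, xml_text, xml_attr, unix
```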
The first form transcodes the contents of str from str.encoding to encoding. The second form transcodes the contents of str from src_encoding to dst_encoding. The options Hash gives details for conversion. See String#encode for details. Returns the string even if no changes were made.
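A minimal sketch of the in-place form, assuming a made-up sample string: the receiver itself is modified and returned.

```ruby
s = "caf\u00E9".dup              # unfrozen UTF-8 string (sample data)
result = s.encode!("ISO-8859-1") # transcodes s itself

p result.equal?(s) # the receiver is returned, not a copy
p s.encoding
p s.bytes          # "\u00E9" is now the single byte 0xE9
```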
Returns self.
If called on a subclass of String, converts the receiver to a String object.
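The difference between the two cases can be seen in a small sketch. Label is a hypothetical subclass introduced only for this example.

```ruby
class Label < String; end # hypothetical subclass for illustration

plain_string = "draft"
label = Label.new("draft")
converted = label.to_s

p plain_string.to_s.equal?(plain_string) # a plain String returns itself
p converted.class                        # a subclass instance becomes a String
p converted == label                     # the contents are unchanged
```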
Changes the encoding to encoding and returns self.
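The following sketch contrasts relabelling with transcoding: force_encoding leaves the bytes alone, whereas encode rewrites them. The byte value is sample data chosen for the example.

```ruby
raw = "\xE9".b                            # one byte, tagged ASCII-8BIT
latin1 = raw.force_encoding("ISO-8859-1") # same bytes, new encoding label

p latin1.equal?(raw) # force_encoding returns the receiver
p latin1.bytes       # still the single byte 0xE9

utf8 = latin1.encode("UTF-8") # encode actually converts the bytes
p utf8.bytes
```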
Returns true for a string which is encoded correctly.

    "\xc2\xa1".force_encoding("UTF-8").valid_encoding?  #=> true
    "\xc2".force_encoding("UTF-8").valid_encoding?      #=> false
    "\x80".force_encoding("UTF-8").valid_encoding?      #=> false
Processes a copy of str as described under String#tr, then removes duplicate characters in regions that were affected by the translation.

    "hello".tr_s('l', 'r')     #=> "hero"
    "hello".tr_s('el', '*')    #=> "h*o"
    "hello".tr_s('el', 'hx')   #=> "hhxo"
Tries to convert obj into a String, using the to_str method. Returns the converted string, or nil if obj cannot be converted for any reason.

    String.try_convert("str")     #=> "str"
    String.try_convert(/re/)      #=> nil
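To show the role of to_str, here is a small sketch with a hypothetical Greeting class that opts in to implicit conversion.

```ruby
class Greeting # hypothetical class for illustration
  def to_str
    "hello"
  end
end

from_object = String.try_convert(Greeting.new) # uses Greeting#to_str
from_integer = String.try_convert(42)          # Integer has no to_str

p from_object
p from_integer
```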
Replaces the contents and taintedness of str with the corresponding values in other_str.

    s = "hello"         #=> "hello"
    s.replace "world"   #=> "world"
Returns true if str starts with one of the prefixes given.

    "hello".start_with?("hell")               #=> true

    # returns true if one of the prefixes matches.
    "hello".start_with?("heaven", "hell")     #=> true
    "hello".start_with?("heaven", "paradise") #=> false
Performs String#tr_s processing on str in place, returning str, or nil if no changes were made.
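A brief sketch of both return values, using a made-up sample word:

```ruby
word = "hello".dup                 # unfrozen sample string
changed = word.tr_s!("l", "r")     # translates and squeezes in place
unchanged = word.tr_s!("xyz", "*") # no characters match

p word      # the receiver was modified
p changed   # the receiver itself
p unchanged # nil signals that nothing changed
```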
Splits str using the supplied parameter as the record separator ($/ by default), passing each substring in turn to the supplied block. If a zero-length record separator is supplied, the string is split into paragraphs delimited by multiple successive newlines.
If no block is given, an enumerator is returned instead.

    print "Example one\n"
    "hello\nworld".each_line {|s| p s}
    print "Example two\n"
    "hello\nworld".each_line('l') {|s| p s}
    print "Example three\n"
    "hello\n\n\nworld".each_line('') {|s| p s}

produces:

    Example one
    "hello\n"
    "world"
    Example two
    "hel"
    "l"
    "o\nworl"
    "d"
    Example three
    "hello\n\n\n"
    "world"
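The block-less form can be chained with Enumerable methods. The sketch below uses invented sample text and shows both the default separator and paragraph mode.

```ruby
text = "alpha\nbeta\n\n\ngamma\n" # sample text, not from the original

# Without a block, each_line returns an enumerator that can be chained.
lines = text.each_line.map(&:chomp)

# A zero-length separator splits on runs of blank lines (paragraph mode).
paragraphs = text.each_line("").to_a

p lines
p paragraphs.length
```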
Passes the Integer ordinal of each character in str, also known as a codepoint when applied to Unicode strings, to the given block.
If no block is given, an enumerator is returned instead.
"hello\u0639".each_codepoint {|c| print c, ' ' }
produces:
104 101 108 108 111 1593
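As a small sketch of the enumerator form, the codepoints can be collected into an array and packed back into a string. Array#pack with the "U*" directive is used here only to demonstrate the round trip.

```ruby
codepoints = "hello\u0639".each_codepoint.to_a # enumerator form, no block
rebuilt = codepoints.pack("U*")                # codepoints back to UTF-8 text

p codepoints
p rebuilt == "hello\u0639"
```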
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
A number is converted to a string as follows:
NaN is converted to the string NaN
positive zero is converted to the string 0
negative zero is converted to the string 0
positive infinity is converted to the string Infinity
negative infinity is converted to the string -Infinity
if the number is an integer, the number is represented in decimal form as a Number with no decimal point and no leading zeros, preceded by a minus sign (-) if the number is negative
otherwise, the number is represented in decimal form as a Number including a decimal point with at least one digit before the decimal point and at least one digit after the decimal point, preceded by a minus sign (-) if the number is negative; there must be no leading zeros before the decimal point apart possibly from the one required digit immediately before the decimal point; beyond the one required digit after the decimal point there must be as many, but only as many, more digits as are needed to uniquely distinguish the number from all other IEEE 754 numeric values.
The boolean false value is converted to the string false. The boolean true value is converted to the string true.
An object of a type other than the four basic types is converted to a string in a way that is dependent on that type.
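The number-to-string rules above can be sketched in Ruby. The helper name xpath_number_to_string is hypothetical, and the sketch assumes that Float#to_s yields the shortest digits that round-trip, which is then expanded into plain decimal form when Ruby would otherwise print an exponent.

```ruby
# Hypothetical helper: format a number following the rules listed above.
def xpath_number_to_string(number)
  x = Float(number)
  return "NaN" if x.nan?
  return x.positive? ? "Infinity" : "-Infinity" if x.infinite?
  return "0" if x.zero?              # covers both 0.0 and -0.0
  return x.to_i.to_s if x == x.floor # integers: no decimal point

  sign = x.negative? ? "-" : ""
  shortest = x.abs.to_s              # assumed shortest round-trip digits
  return sign + shortest unless shortest.include?("e")

  # Float#to_s uses exponent notation for small magnitudes (e.g. "1.5e-07");
  # expand it so the result is always in plain decimal form.
  mantissa, exponent = shortest.split("e")
  digits = mantissa.delete(".").sub(/0+\z/, "")
  sign + "0." + "0" * (-exponent.to_i - 1) + digits
end

[Float::NAN, -0.0, 2.0, -3.0, 0.5, -9.2, 0.00001].each do |value|
  puts xpath_number_to_string(value)
end
```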
Returns a complex which denotes the string form. The parser ignores leading whitespace and trailing garbage. Any digit sequences can be separated by an underscore. Returns zero for a null or garbage string.

    '9'.to_c           #=> (9+0i)
    '2.5'.to_c         #=> (2.5+0i)
    '2.5/1'.to_c       #=> ((5/2)+0i)
    '-3/2'.to_c        #=> ((-3/2)+0i)
    '-i'.to_c          #=> (0-1i)
    '45i'.to_c         #=> (0+45i)
    '3-4i'.to_c        #=> (3-4i)
    '-4e2-4e-2i'.to_c  #=> (-400.0-0.04i)
    '-0.0-0.0i'.to_c   #=> (-0.0-0.0i)
    '1/2+3/4i'.to_c    #=> ((1/2)+(3/4)*i)
    'ruby'.to_c        #=> (0+0i)

See Kernel.Complex.
Returns the result of interpreting leading characters in str as a BigDecimal.

    require 'bigdecimal'
    require 'bigdecimal/util'

    "0.5".to_d             # => 0.5e0
    "123.45e1".to_d        # => 0.12345e4
    "45.67 degrees".to_d   # => 0.4567e2

See also BigDecimal::new.
Scans the current string until the match is exhausted, yielding each match as it is encountered in the string. A block is not necessary, as the results will simply be aggregated into the final array.

    "123 456".block_scanf("%d")
    # => [123, 456]

If a block is given, the value returned from each yield is added to an output array.

    "123 456".block_scanf("%d") do |digit,| # the ',' unpacks the Array
      digit + 100
    end
    # => [223, 556]

See Scanf for details on creating a format string.

You will need to require 'scanf' to use String#block_scanf.
Returns a normalized form of str, using Unicode normalizations NFC, NFD, NFKC, or NFKD. The normalization form used is determined by form, which is any of the four values :nfc, :nfd, :nfkc, or :nfkd. The default is :nfc.

If the string is not in a Unicode Encoding, then an Exception is raised. In this context, 'Unicode Encoding' means any of UTF-8, UTF-16BE/LE, and UTF-32BE/LE, as well as GB18030, UCS_2BE, and UCS_4BE. Anything other than UTF-8 is implemented by converting to UTF-8, which makes it slower than UTF-8.

Examples

    "a\u0300".unicode_normalize        #=> 'à' (same as "\u00E0")
    "a\u0300".unicode_normalize(:nfc)  #=> 'à' (same as "\u00E0")
    "\u00E0".unicode_normalize(:nfd)   #=> 'à' (same as "a\u0300")
    "\xE0".force_encoding('ISO-8859-1').unicode_normalize(:nfd)
                                       #=> Encoding::CompatibilityError raised
Destructive version of String#unicode_normalize, doing Unicode normalization in place.
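A short sketch of the in-place form, with a compatibility normalization added for contrast. The sample strings are chosen for this example only.

```ruby
s = "a\u0300".dup              # "a" followed by a combining grave accent
result = s.unicode_normalize!  # composes to the single character U+00E0

p result.equal?(s) # the receiver is modified and returned
p s.length

# Compatibility forms also fold ligatures: U+FB01 becomes "fi".
p "\uFB01".unicode_normalize(:nfkc)
```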
Checks whether str is in Unicode normalization form form, which is any of the four values :nfc, :nfd, :nfkc, or :nfkd. The default is :nfc.

If the string is not in a Unicode Encoding, then an Exception is raised. For details, see String#unicode_normalize.

Examples

    "a\u0300".unicode_normalized?        #=> false
    "a\u0300".unicode_normalized?(:nfd)  #=> true
    "\u00E0".unicode_normalized?         #=> true
    "\u00E0".unicode_normalized?(:nfd)   #=> false
    "\xE0".force_encoding('ISO-8859-1').unicode_normalized?
                                         #=> Encoding::CompatibilityError raised
Returns a rational which denotes the string form. The parser ignores leading whitespace and trailing garbage. Any digit sequences can be separated by an underscore. Returns zero for a null or garbage string.

NOTE: '0.3'.to_r isn't the same as 0.3.to_r. The former is equivalent to '3/10'.to_r, but the latter isn't, because the Float 0.3 is only a binary approximation of 3/10.

    '  2  '.to_r       #=> (2/1)
    '300/2'.to_r       #=> (150/1)
    '-9.2'.to_r        #=> (-46/5)
    '-9.2e2'.to_r      #=> (-920/1)
    '1_234_567'.to_r   #=> (1234567/1)
    '21 june 09'.to_r  #=> (21/1)
    '21/06/09'.to_r    #=> (7/2)
    'bwv 1079'.to_r    #=> (0/1)

See Kernel.Rational.
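The difference between parsing the string '0.3' and converting the Float 0.3 can be checked with a short sketch. The variable names are illustrative only.

```ruby
from_string = '0.3'.to_r # parsed as the exact decimal 3/10
from_float = 0.3.to_r    # exact value of the binary Float nearest to 0.3

p from_string
p from_float.denominator == 2**54 # a power of two, not 10
p from_string == from_float
```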
Returns the result of interpreting leading characters in str as an integer base base (between 2 and 36). Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0 is returned. This method never raises an exception when base is valid.

    "12345".to_i             #=> 12345
    "99 red balloons".to_i   #=> 99
    "0a".to_i                #=> 0
    "0a".to_i(16)            #=> 10
    "hello".to_i             #=> 0
    "1100101".to_i(2)        #=> 101
    "1100101".to_i(8)        #=> 294977
    "1100101".to_i(10)       #=> 1100101
    "1100101".to_i(16)       #=> 17826049
Returns the result of interpreting leading characters in str as a floating point number. Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0.0 is returned. This method never raises an exception.

    "123.45e1".to_f        #=> 1234.5
    "45.67 degrees".to_f   #=> 45.67
    "thx1138".to_f         #=> 0.0
Returns self.
If called on a subclass of String, converts the receiver to a String object.
Returns the Symbol corresponding to str, creating the symbol if it did not previously exist. See Symbol#id2name.

    "Koala".intern         #=> :Koala
    s = 'cat'.to_sym       #=> :cat
    s == :cat              #=> true
    s = '@cat'.to_sym      #=> :@cat
    s == :@cat             #=> true

This can also be used to create symbols that cannot be represented using the :xxx notation.

    'cat and dog'.to_sym   #=> :"cat and dog"
Returns true if str ends with one of the suffixes given.

    "hello".end_with?("ello")               #=> true

    # returns true if one of the suffixes matches.
    "hello".end_with?("heaven", "ello")     #=> true
    "hello".end_with?("heaven", "paradise") #=> false