Scans the current string. If a block is given, it functions exactly like block_scanf.
arr = "123 456".scanf("%d%d") # => [123, 456] require 'pp' "this 123 read that 456 other".scanf("%s%d%s") {|m| pp m} # ["this", 123, "read"] # ["that", 456, "other"] # => [["this", 123, "read"], ["that", 456, "other"]]
See Scanf
for details on creating a format string.
You will need to require ‘scanf’ to use String#scanf
Splits str
into an array of tokens in the same way the UNIX Bourne shell does.
See Shellwords.shellsplit
for details.
Escapes str
so that it can be safely used in a Bourne shell command line.
See Shellwords.shellescape
for details.
Decodes str (which may contain binary data) according to the format string, returning an array of each value extracted. The format string consists of a sequence of single-character directives, summarized in the table at the end of this entry. Each directive may be followed by a number, indicating the number of times to repeat with this directive. An asterisk (“*
”) will use up all remaining elements. The directives sSiIlL
may each be followed by an underscore (“_
”) or exclamation mark (“!
”) to use the underlying platform’s native size for the specified type; otherwise, it uses a platform-independent consistent size. Spaces are ignored in the format string. See also String#unpack1
, Array#pack
.
"abc \0\0abc \0\0".unpack('A6Z6') #=> ["abc", "abc "] "abc \0\0".unpack('a3a3') #=> ["abc", " \000\000"] "abc \0abc \0".unpack('Z*Z*') #=> ["abc ", "abc "] "aa".unpack('b8B8') #=> ["10000110", "01100001"] "aaa".unpack('h2H2c') #=> ["16", "61", 97] "\xfe\xff\xfe\xff".unpack('sS') #=> [-2, 65534] "now=20is".unpack('M*') #=> ["now is"] "whole".unpack('xax2aX2aX1aX2a') #=> ["h", "e", "l", "l", "o"]
This table summarizes the various formats and the Ruby classes returned by each.
Integer | | Directive | Returns | Meaning ------------------------------------------------------------------ C | Integer | 8-bit unsigned (unsigned char) S | Integer | 16-bit unsigned, native endian (uint16_t) L | Integer | 32-bit unsigned, native endian (uint32_t) Q | Integer | 64-bit unsigned, native endian (uint64_t) J | Integer | pointer width unsigned, native endian (uintptr_t) | | c | Integer | 8-bit signed (signed char) s | Integer | 16-bit signed, native endian (int16_t) l | Integer | 32-bit signed, native endian (int32_t) q | Integer | 64-bit signed, native endian (int64_t) j | Integer | pointer width signed, native endian (intptr_t) | | S_ S! | Integer | unsigned short, native endian I I_ I! | Integer | unsigned int, native endian L_ L! | Integer | unsigned long, native endian Q_ Q! | Integer | unsigned long long, native endian (ArgumentError | | if the platform has no long long type.) J! | Integer | uintptr_t, native endian (same with J) | | s_ s! | Integer | signed short, native endian i i_ i! | Integer | signed int, native endian l_ l! | Integer | signed long, native endian q_ q! | Integer | signed long long, native endian (ArgumentError | | if the platform has no long long type.) j! | Integer | intptr_t, native endian (same with j) | | S> s> S!> s!> | Integer | same as the directives without ">" except L> l> L!> l!> | | big endian I!> i!> | | Q> q> Q!> q!> | | "S>" is same as "n" J> j> J!> j!> | | "L>" is same as "N" | | S< s< S!< s!< | Integer | same as the directives without "<" except L< l< L!< l!< | | little endian I!< i!< | | Q< q< Q!< q!< | | "S<" is same as "v" J< j< J!< j!< | | "L<" is same as "V" | | n | Integer | 16-bit unsigned, network (big-endian) byte order N | Integer | 32-bit unsigned, network (big-endian) byte order v | Integer | 16-bit unsigned, VAX (little-endian) byte order V | Integer | 32-bit unsigned, VAX (little-endian) byte order | | U | Integer | UTF-8 character w | Integer | BER-compressed integer (see Array.pack) Float | | Directive | Returns | Meaning ----------------------------------------------------------------- D d | Float | double-precision, native format F f | Float | single-precision, native format E | Float | double-precision, little-endian byte order e | Float | single-precision, little-endian byte order G | Float | double-precision, network (big-endian) byte order g | Float | single-precision, network (big-endian) byte order String | | Directive | Returns | Meaning ----------------------------------------------------------------- A | String | arbitrary binary string (remove trailing nulls and ASCII spaces) a | String | arbitrary binary string Z | String | null-terminated string B | String | bit string (MSB first) b | String | bit string (LSB first) H | String | hex string (high nibble first) h | String | hex string (low nibble first) u | String | UU-encoded string M | String | quoted-printable, MIME encoding (see RFC2045) m | String | base64 encoded string (RFC 2045) (default) | | base64 encoded string (RFC 4648) if followed by 0 P | String | pointer to a structure (fixed-length string) p | String | pointer to a null-terminated string Misc. | | Directive | Returns | Meaning ----------------------------------------------------------------- @ | --- | skip to the offset given by the length argument X | --- | skip backward one byte x | --- | skip forward one byte
HISTORY
J, J! j, and j! are available since Ruby 2.3.
Q_, Q!, q_, and q! are available since Ruby 2.1.
I!<, i!<, I!>, and i!> are available since Ruby 1.9.3.
Decodes str (which may contain binary data) according to the format string, returning the first value extracted. See also String#unpack
, Array#pack
.
Returns a new string object containing a copy of str.
The optional encoding keyword argument specifies the encoding of the new string. If not specified, the encoding of str is used (or ASCII-8BIT, if str is not specified).
The optional capacity keyword argument specifies the size of the internal buffer. This may improve performance, when the string will be concatenated many times (causing many realloc calls).
Comparison—Returns -1, 0, +1, or nil
depending on whether string
is less than, equal to, or greater than other_string
.
nil
is returned if the two values are incomparable.
If the strings are of different lengths, and the strings are equal when compared up to the shortest length, then the longer string is considered greater than the shorter one.
<=>
is the basis for the methods <
, <=
, >
, >=
, and between?
, included from module Comparable
. The method String#==
does not use Comparable#==
.
"abcdef" <=> "abcde" #=> 1 "abcdef" <=> "abcdef" #=> 0 "abcdef" <=> "abcdefg" #=> -1 "abcdef" <=> "ABCDEF" #=> 1 "abcdef" <=> 1 #=> nil
Equality—Returns whether str
== obj
, similar to Object#==
.
If obj
is not an instance of String but responds to to_str
, then the two strings are compared using obj.==
.
Otherwise, returns similarly to String#eql?
, comparing length and content.
Equality—Returns whether str
== obj
, similar to Object#==
.
If obj
is not an instance of String but responds to to_str
, then the two strings are compared using obj.==
.
Otherwise, returns similarly to String#eql?
, comparing length and content.
Two strings are equal if they have the same length and content.
Case-insensitive version of String#<=>
. Currently, case-insensitivity only works on characters A-Z/a-z, not all of Unicode. This is different from String#casecmp?
.
"aBcDeF".casecmp("abcde") #=> 1 "aBcDeF".casecmp("abcdef") #=> 0 "aBcDeF".casecmp("abcdefg") #=> -1 "abcdef".casecmp("ABCDEF") #=> 0
nil
is returned if the two strings have incompatible encodings, or if other_str
is not a string.
"foo".casecmp(2) #=> nil "\u{e4 f6 fc}".encode("ISO-8859-1").casecmp("\u{c4 d6 dc}") #=> nil
Returns true
if str
and other_str
are equal after Unicode case folding, false
if they are not equal.
"aBcDeF".casecmp?("abcde") #=> false "aBcDeF".casecmp?("abcdef") #=> true "aBcDeF".casecmp?("abcdefg") #=> false "abcdef".casecmp?("ABCDEF") #=> true "\u{e4 f6 fc}".casecmp?("\u{c4 d6 dc}") #=> true
nil
is returned if the two strings have incompatible encodings, or if other_str
is not a string.
"foo".casecmp?(2) #=> nil "\u{e4 f6 fc}".encode("ISO-8859-1").casecmp?("\u{c4 d6 dc}") #=> nil
Concatenation—Returns a new String
containing other_str concatenated to str.
"Hello from " + self.to_s #=> "Hello from main"
Copy — Returns a new String containing integer
copies of the receiver. integer
must be greater than or equal to 0.
"Ho! " * 3 #=> "Ho! Ho! Ho! " "Ho! " * 0 #=> ""
Format—Uses str as a format specification, and returns the result of applying it to arg. If the format specification contains more than one substitution, then arg must be an Array
or Hash
containing the values to be substituted. See Kernel::sprintf
for details of the format string.
"%05d" % 123 #=> "00123" "%-5s: %08x" % [ "ID", self.object_id ] #=> "ID : 200e14d6" "foo = %{foo}" % { :foo => 'bar' } #=> "foo = bar"
Returns the character length of str.
Returns the length of str
in bytes.
"\x80\u3042".bytesize #=> 4 "hello".bytesize #=> 5
Returns true
if str has a length of zero.
"hello".empty? #=> false " ".empty? #=> false "".empty? #=> true
Match—If obj is a Regexp
, use it as a pattern to match against str,and returns the position the match starts, or nil
if there is no match. Otherwise, invokes obj.=~, passing str as an argument. The default =~
in Object
returns nil
.
Note: str =~ regexp
is not the same as regexp =~ str
. Strings captured from named capture groups are assigned to local variables only in the second case.
"cat o' 9 tails" =~ /\d/ #=> 7 "cat o' 9 tails" =~ 9 #=> nil
Converts pattern to a Regexp
(if it isn’t already one), then invokes its match
method on str. If the second parameter is present, it specifies the position in the string to begin the search.
'hello'.match('(.)\1') #=> #<MatchData "ll" 1:"l"> 'hello'.match('(.)\1')[0] #=> "ll" 'hello'.match(/(.)\1/)[0] #=> "ll" 'hello'.match(/(.)\1/, 3) #=> nil 'hello'.match('xx') #=> nil
If a block is given, invoke the block with MatchData
if match succeed, so that you can write
str.match(pat) {|m| ...}
instead of
if m = str.match(pat) ... end
The return value is a value from block execution in this case.
Converts pattern to a Regexp
(if it isn’t already one), then returns a true
or false
indicates whether the regexp is matched str or not without updating $~
and other related variables. If the second parameter is present, it specifies the position in the string to begin the search.
"Ruby".match?(/R.../) #=> true "Ruby".match?(/R.../, 1) #=> false "Ruby".match?(/P.../) #=> false $& #=> nil
Returns the successor to str. The successor is calculated by incrementing characters starting from the rightmost alphanumeric (or the rightmost character if there are no alphanumerics) in the string. Incrementing a digit always results in another digit, and incrementing a letter results in another letter of the same case. Incrementing nonalphanumerics uses the underlying character set’s collating sequence.
If the increment generates a “carry,” the character to the left of it is incremented. This process repeats until there is no carry, adding an additional character if necessary.
"abcd".succ #=> "abce" "THX1138".succ #=> "THX1139" "<<koala>>".succ #=> "<<koalb>>" "1999zzz".succ #=> "2000aaa" "ZZZ9999".succ #=> "AAAA0000" "***".succ #=> "**+"
Equivalent to String#succ
, but modifies the receiver in place.
Returns the successor to str. The successor is calculated by incrementing characters starting from the rightmost alphanumeric (or the rightmost character if there are no alphanumerics) in the string. Incrementing a digit always results in another digit, and incrementing a letter results in another letter of the same case. Incrementing nonalphanumerics uses the underlying character set’s collating sequence.
If the increment generates a “carry,” the character to the left of it is incremented. This process repeats until there is no carry, adding an additional character if necessary.
"abcd".succ #=> "abce" "THX1138".succ #=> "THX1139" "<<koala>>".succ #=> "<<koalb>>" "1999zzz".succ #=> "2000aaa" "ZZZ9999".succ #=> "AAAA0000" "***".succ #=> "**+"