Splits str into an array of tokens in the same way the UNIX Bourne shell does. See Shellwords.shellsplit for details.
Escapes str so that it can be safely used in a Bourne shell command line. See Shellwords.shellescape for details.
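As a rough illustration of the two methods above, the following sketch splits a command line and then round-trips an escaped value. The command strings and the file name are made up for the example.

```ruby
require 'shellwords'

# Split a command line into tokens the way a Bourne shell would.
tokens = "grep -r 'hello world' src".shellsplit

# Escape a value (hypothetical file name) before embedding it in a command line.
filename = "my file's name.txt"
escaped  = filename.shellescape
command  = "cat #{escaped}"

# Splitting the escaped command recovers the original value as one token.
round_trip = command.shellsplit
```

Escaping first and splitting afterwards should always give back the original string as a single token, which makes a convenient sanity check.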
Decodes str (which may contain binary data) according to the format string, returning an array of each value extracted. The format string consists of a sequence of single-character directives, summarized in the table at the end of this entry. Each directive may be followed by a number, indicating the number of times to repeat the directive. An asterisk (“*”) will use up all remaining elements. The directives sSiIlL may each be followed by an underscore (“_”) or exclamation mark (“!”) to use the underlying platform’s native size for the specified type; otherwise, a platform-independent consistent size is used. Spaces are ignored in the format string. See also String#unpack1, Array#pack.
  "abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
  "abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
  "abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
  "aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
  "aaa".unpack('h2H2c')               #=> ["16", "61", 97]
  "\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
  "now=20is".unpack('M*')             #=> ["now is"]
  "whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]
This table summarizes the various formats and the Ruby classes returned by each.
  Integer       |         |
  Directive     | Returns | Meaning
  ------------------------------------------------------------------
  C             | Integer | 8-bit unsigned (unsigned char)
  S             | Integer | 16-bit unsigned, native endian (uint16_t)
  L             | Integer | 32-bit unsigned, native endian (uint32_t)
  Q             | Integer | 64-bit unsigned, native endian (uint64_t)
  J             | Integer | pointer width unsigned, native endian (uintptr_t)
                |         |
  c             | Integer | 8-bit signed (signed char)
  s             | Integer | 16-bit signed, native endian (int16_t)
  l             | Integer | 32-bit signed, native endian (int32_t)
  q             | Integer | 64-bit signed, native endian (int64_t)
  j             | Integer | pointer width signed, native endian (intptr_t)
                |         |
  S_ S!         | Integer | unsigned short, native endian
  I I_ I!       | Integer | unsigned int, native endian
  L_ L!         | Integer | unsigned long, native endian
  Q_ Q!         | Integer | unsigned long long, native endian (ArgumentError
                |         | if the platform has no long long type.)
  J!            | Integer | uintptr_t, native endian (same with J)
                |         |
  s_ s!         | Integer | signed short, native endian
  i i_ i!       | Integer | signed int, native endian
  l_ l!         | Integer | signed long, native endian
  q_ q!         | Integer | signed long long, native endian (ArgumentError
                |         | if the platform has no long long type.)
  j!            | Integer | intptr_t, native endian (same with j)
                |         |
  S> s> S!> s!> | Integer | same as the directives without ">" except
  L> l> L!> l!> |         | big endian
  I!> i!>       |         |
  Q> q> Q!> q!> |         | "S>" is same as "n"
  J> j> J!> j!> |         | "L>" is same as "N"
                |         |
  S< s< S!< s!< | Integer | same as the directives without "<" except
  L< l< L!< l!< |         | little endian
  I!< i!<       |         |
  Q< q< Q!< q!< |         | "S<" is same as "v"
  J< j< J!< j!< |         | "L<" is same as "V"
                |         |
  n             | Integer | 16-bit unsigned, network (big-endian) byte order
  N             | Integer | 32-bit unsigned, network (big-endian) byte order
  v             | Integer | 16-bit unsigned, VAX (little-endian) byte order
  V             | Integer | 32-bit unsigned, VAX (little-endian) byte order
                |         |
  U             | Integer | UTF-8 character
  w             | Integer | BER-compressed integer (see Array#pack)

  Float         |         |
  Directive     | Returns | Meaning
  -----------------------------------------------------------------
  D d           | Float   | double-precision, native format
  F f           | Float   | single-precision, native format
  E             | Float   | double-precision, little-endian byte order
  e             | Float   | single-precision, little-endian byte order
  G             | Float   | double-precision, network (big-endian) byte order
  g             | Float   | single-precision, network (big-endian) byte order

  String        |         |
  Directive     | Returns | Meaning
  -----------------------------------------------------------------
  A             | String  | arbitrary binary string (remove trailing nulls and ASCII spaces)
  a             | String  | arbitrary binary string
  Z             | String  | null-terminated string
  B             | String  | bit string (MSB first)
  b             | String  | bit string (LSB first)
  H             | String  | hex string (high nibble first)
  h             | String  | hex string (low nibble first)
  u             | String  | UU-encoded string
  M             | String  | quoted-printable, MIME encoding (see RFC2045)
  m             | String  | base64 encoded string (RFC 2045) (default)
                |         | base64 encoded string (RFC 4648) if followed by 0
  P             | String  | pointer to a structure (fixed-length string)
  p             | String  | pointer to a null-terminated string

  Misc.         |         |
  Directive     | Returns | Meaning
  -----------------------------------------------------------------
  @             | ---     | skip to the offset given by the length argument
  X             | ---     | skip backward one byte
  x             | ---     | skip forward one byte
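A minimal sketch of how a few of the directives above combine, assuming a made-up record layout (2-byte big-endian id, 4-byte big-endian length, then a payload). The layout and variable names are illustrative only.

```ruby
# Hypothetical record: id (n), length (N), payload (a*), built as a binary string.
record = "\x00\x01\x00\x00\x00\x03abc".b

# Spaces in the format string are ignored; "a*" takes all remaining bytes.
id, length, payload = record.unpack('n N a*')

# "*" makes C consume every remaining byte; a count repeats the directive.
bytes = payload.unpack('C*')
pair  = "\x01\x02\x03".b.unpack('C2')

# Array#pack with the same format rebuilds the original record.
rebuilt = [id, length, payload].pack('nNa*')
```

Using the fixed-endian directives (n, N, v, V) rather than the native-endian ones keeps the result independent of the platform the code runs on.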
HISTORY
J, J!, j, and j! are available since Ruby 2.3.
Q_, Q!, q_, and q! are available since Ruby 2.1.
I!<, i!<, I!>, and i!> are available since Ruby 1.9.3.
Decodes str (which may contain binary data) according to the format string, returning the first value extracted. See also String#unpack, Array#pack.
Contrast with String#unpack:
  "abc \0\0abc \0\0".unpack('A6Z6')  #=> ["abc", "abc "]
  "abc \0\0abc \0\0".unpack1('A6Z6') #=> "abc"
In that case data would be lost, but often the array holds only one value, especially when unpacking binary data. For instance:
  "\xff\x00\x00\x00".unpack("l")  #=> [255]
  "\xff\x00\x00\x00".unpack1("l") #=> 255
Thus unpack1 is convenient, makes the intention clear, and signals the expected return value to those reading the code.
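One place where this tends to read well is a length-prefixed message. The sketch below assumes a hypothetical format with a 2-byte big-endian length followed by the body.

```ruby
# Hypothetical message: 2-byte big-endian length, then the body bytes.
message = "\x00\x05hello".b

# unpack1 returns the Integer directly instead of a one-element Array.
length = message.unpack1('n')

# Skip the 2-byte prefix (x2), then read `length` bytes as a string.
body = message.unpack1("x2 a#{length}")

# The equivalent unpack call wraps the same value in an Array.
wrapped = message.unpack('n')
```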
Returns a new String that is a copy of string.
With no arguments, returns the empty string with the Encoding ASCII-8BIT:
  s = String.new
  s # => ""
  s.encoding # => #<Encoding:ASCII-8BIT>
With the single String argument string, returns a copy of string with the same encoding as string:
  s = String.new("Que veut dire \u{e7}a?")
  s # => "Que veut dire \u{e7}a?"
  s.encoding # => #<Encoding:UTF-8>
Literal strings like "" or here-documents always use script encoding, unlike String.new.
With keyword encoding, returns a copy of string with the specified encoding:
  s = String.new(encoding: 'ASCII')
  s.encoding # => #<Encoding:US-ASCII>
  s = String.new('foo', encoding: 'ASCII')
  s.encoding # => #<Encoding:US-ASCII>
Note that these are equivalent:
  s0 = String.new('foo', encoding: 'ASCII')
  s1 = 'foo'.force_encoding('ASCII')
  s0.encoding == s1.encoding # => true
With keyword capacity, returns a copy of string; the given capacity may set the size of the internal buffer, which may affect performance:
  String.new(capacity: 1) # => ""
  String.new(capacity: 4096) # => ""
The string, encoding, and capacity arguments may all be used together:
  String.new('hello', encoding: 'UTF-8', capacity: 25)
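The sketch below illustrates two points from this entry: the result is an independent copy, and an argument-less String.new differs in encoding from an empty literal. Variable names are arbitrary.

```ruby
original = 'hello'
copy = String.new(original)
copy << ' world'                 # mutating the copy leaves the original untouched

empty_encoding   = String.new.encoding   # binary (ASCII-8BIT)
literal_encoding = ''.encoding           # script encoding of this file

# capacity is only a sizing hint; the string still starts out empty.
buffer = String.new(capacity: 1024)
buffer << 'chunk'
```

A pre-sized buffer like this may help when a string is built up from many appended pieces, although any gain depends on the workload.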
Compares self and other_string, returning:
-1 if other_string is larger.
0 if the two are equal.
1 if other_string is smaller.
nil if the two are incomparable.
Examples:
  'foo' <=> 'foo' # => 0
  'foo' <=> 'food' # => -1
  'food' <=> 'foo' # => 1
  'FOO' <=> 'foo' # => -1
  'foo' <=> 'FOO' # => 1
  'foo' <=> 1 # => nil
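Sorting relies on this comparison, so the rules above show up directly in sort order. A small sketch with arbitrary sample data:

```ruby
words = %w[b A a B]

# Array#sort uses <=>, so uppercase letters sort before lowercase ones.
sorted  = words.sort
largest = words.max

# A nil comparison result means the values cannot be ordered together.
error = nil
begin
  ['a', 1].sort
rescue ArgumentError => e
  error = e.class
end
```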
Returns true if object has the same length and content as self; false otherwise:
  s = 'foo'
  s == 'foo' # => true
  s == 'food' # => false
  s == 'FOO' # => false
Returns false if the two strings’ encodings are not compatible:
  "\u{e4 f6 fc}".encode("ISO-8859-1") == ("\u{c4 d6 dc}") # => false
If object is not an instance of String but responds to to_str, then the two strings are compared using object.==.
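To make the to_str rule concrete, here is a sketch using a hypothetical Label class that is not a String but responds to to_str and defines its own ==.

```ruby
# Hypothetical wrapper class, defined only for this illustration.
class Label
  def initialize(text)
    @text = text
  end

  def to_str
    @text
  end

  # Reached when a String on the left-hand side delegates the comparison.
  def ==(other)
    @text == other.to_str
  end
end

label = Label.new('foo')

loose    = ('foo' == label)    # delegates to label == 'foo'
mismatch = ('bar' == label)
strict   = 'foo'.eql?(label)   # eql? requires an actual String
```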
Returns true if object has the same length and content as self; false otherwise:
  s = 'foo'
  s.eql?('foo') # => true
  s.eql?('food') # => false
  s.eql?('FOO') # => false
Returns false if the two strings’ encodings are not compatible:
  "\u{e4 f6 fc}".encode("ISO-8859-1").eql?("\u{c4 d6 dc}") # => false
Returns the integer hash value for self. The value is based on the length, content and encoding of self.
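Together, eql? and hash are what allow two distinct String objects with the same content to act as the same Hash key. A brief sketch:

```ruby
a = 'foo'
b = String.new('foo')        # distinct object, same content and encoding

same_object = a.equal?(b)
same_hash   = (a.hash == b.hash)

# Looking up with b finds the entry stored under a.
counts = { a => 1 }
counts[b] += 1
```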
Compares self and other_string, ignoring case, and returning:
-1 if other_string is larger.
0 if the two are equal.
1 if other_string is smaller.
nil if the two are incomparable.
Examples:
  'foo'.casecmp('foo') # => 0
  'foo'.casecmp('food') # => -1
  'food'.casecmp('foo') # => 1
  'FOO'.casecmp('foo') # => 0
  'foo'.casecmp('FOO') # => 0
  'foo'.casecmp(1) # => nil
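A common use is a case-insensitive sort. The sketch below contrasts it with the default ordering, using arbitrary sample words.

```ruby
fruits = %w[pear apple Fig]

# Default sort compares bytes, so the capitalized word comes first.
case_sensitive = fruits.sort

# Passing casecmp as the comparison ignores ASCII case.
case_insensitive = fruits.sort { |a, b| a.casecmp(b) }
```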
Returns true if self and other_string are equal after Unicode case folding, otherwise false:
  'foo'.casecmp?('foo') # => true
  'foo'.casecmp?('food') # => false
  'food'.casecmp?('foo') # => false
  'FOO'.casecmp?('foo') # => true
  'foo'.casecmp?('FOO') # => true
Returns nil if the two values are incomparable:
  'foo'.casecmp?(1) # => nil
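The Unicode case folding matters once non-ASCII letters are involved. This sketch compares the same pair of strings with casecmp and casecmp?:

```ruby
lower = "\u{e4 f6 fc}"   # a-umlaut, o-umlaut, u-umlaut in lowercase
upper = "\u{c4 d6 dc}"   # the same letters in uppercase

# casecmp folds only ASCII letters, so these still compare as different.
ascii_only = lower.casecmp(upper)

# casecmp? applies Unicode case folding and treats them as equal.
unicode_fold = lower.casecmp?(upper)
```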
Returns a new String containing other_string concatenated to self:
  "Hello from " + self.to_s # => "Hello from main"
Returns a new String containing integer copies of self:
  "Ho! " * 3 # => "Ho! Ho! Ho! "
  "Ho! " * 0 # => ""
Returns the result of formatting object into the format specification self (see Kernel#sprintf for formatting details):
  "%05d" % 123 # => "00123"
If self contains multiple substitutions, object must be an Array or Hash containing the values to be substituted:
  "%-5s: %016x" % [ "ID", self.object_id ] # => "ID   : 00002b054ec93168"
  "foo = %{foo}" % {foo: 'bar'} # => "foo = bar"
  "foo = %{foo}, baz = %{baz}" % {foo: 'bar', baz: 'bat'} # => "foo = bar, baz = bat"
Returns the count of characters (not bytes) in self:
  "\x80\u3042".length # => 2
  "hello".length # => 5
String#size is an alias for String#length.
Related: String#bytesize.
Returns the count of bytes in self:
  "\x80\u3042".bytesize # => 4
  "hello".bytesize # => 5
Related: String#length.
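The difference between the two counts is easiest to see with a multibyte character. A short sketch, assuming the default UTF-8 source encoding:

```ruby
word = "h\u00e9llo"          # "héllo": the accented letter takes two bytes in UTF-8

chars = word.length          # counts characters
bytes = word.bytesize        # counts bytes

# In binary encoding every byte is its own character, so the counts agree.
binary_chars = word.b.length
```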
Returns true if the length of self is zero, false otherwise:
  "hello".empty? # => false
  " ".empty? # => false
  "".empty? # => true
Returns the Integer index of the first substring that matches the given regexp, or nil if no match found:
  'foo' =~ /f/ # => 0
  'foo' =~ /o/ # => 1
  'foo' =~ /x/ # => nil
Note: also updates Regexp-related global variables.
If the given object is not a Regexp, returns the value returned by object =~ self.
Note that string =~ regexp is different from regexp =~ string (see Regexp#=~):
  number = nil
  "no. 9" =~ /(?<number>\d+)/
  number # => nil (not assigned)
  /(?<number>\d+)/ =~ "no. 9"
  number #=> "9"
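The following sketch shows the Regexp-related globals mentioned above being set by a successful match and reset by a failed one. The local variable names are arbitrary.

```ruby
index   = ('no. 9' =~ /\d+/)   # position of the first match
matched = $~[0]                # $~ holds the last MatchData
prefix  = $`                   # text before the match

missing    = ('no. 9' =~ /x/)  # no match
after_miss = $~                # reset to nil by the failed match
```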
Returns a MatchData object (or nil) based on self and the given pattern.
Note: also updates Regexp-related global variables.
Computes regexp by converting pattern (if not already a Regexp).
  regexp = Regexp.new(pattern)
Computes matchdata, which will be either a MatchData object or nil (see Regexp#match):
  matchdata = regexp.match(self)
With no block given, returns the computed matchdata:
  'foo'.match('f') # => #<MatchData "f">
  'foo'.match('o') # => #<MatchData "o">
  'foo'.match('x') # => nil
If Integer argument offset is given, the search begins at index offset:
  'foo'.match('f', 1) # => nil
  'foo'.match('o', 1) # => #<MatchData "o">
With a block given, calls the block with the computed matchdata if it is not nil and returns the block’s return value; returns nil otherwise:
  'foo'.match(/o/) {|matchdata| matchdata } # => #<MatchData "o">
  'foo'.match(/x/) {|matchdata| matchdata } # => nil
  'foo'.match(/f/, 1) {|matchdata| matchdata } # => nil
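A sketch of the three call styles described above (plain, with a block, and with an offset), using made-up input strings:

```ruby
# Named groups are available on the returned MatchData.
m = 'John Smith'.match(/(?<first>\w+) (?<last>\w+)/)
first = m[:first]
last  = m[:last]

# The block receives the MatchData; its value becomes the return value.
shout = 'foo'.match(/o+/) { |md| md[0].upcase }

# When nothing matches, the block is not called and nil is returned.
called = false
result = 'foo'.match(/x/) { |_md| called = true }

# With an offset, the search starts at that index.
later = 'foo bar foo'.match('foo', 1)
later_index = later.begin(0)
```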
Returns true or false based on whether a match is found for self and pattern.
Note: does not update Regexp-related global variables.
Computes regexp by converting pattern (if not already a Regexp).
  regexp = Regexp.new(pattern)
Returns true if self.match(regexp) returns a MatchData object, false otherwise:
  'foo'.match?(/o/) # => true
  'foo'.match?('o') # => true
  'foo'.match?(/x/) # => false
If Integer argument offset is given, the search begins at index offset:
  'foo'.match?('f', 1) # => false
  'foo'.match?('o', 1) # => true
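The note about global variables can be checked directly: a match? call leaves the result of an earlier match in place. A minimal sketch:

```ruby
'abc' =~ /b/                 # sets $~ to the match on "b"
before = $~[0]

found = 'abc'.match?(/c/)    # succeeds, but does not touch $~

after = $~[0]                # still the earlier match
```

Because no MatchData needs to be built, match? is a reasonable default when only a yes-or-no answer is required.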
Returns the successor to self. The successor is calculated by incrementing characters.
The first character to be incremented is the rightmost alphanumeric or, if there are no alphanumerics, the rightmost character:
  'THX1138'.succ # => "THX1139"
  '<<koala>>'.succ # => "<<koalb>>"
  '***'.succ # => '**+'
The successor to a digit is another digit, “carrying” to the next-left character for a “rollover” from 9 to 0, and prepending another digit if necessary:
  '00'.succ # => "01"
  '09'.succ # => "10"
  '99'.succ # => "100"
The successor to a letter is another letter of the same case, carrying to the next-left character for a rollover, and prepending another same-case letter if necessary:
  'aa'.succ # => "ab"
  'az'.succ # => "ba"
  'zz'.succ # => "aaa"
  'AA'.succ # => "AB"
  'AZ'.succ # => "BA"
  'ZZ'.succ # => "AAA"
The successor to a non-alphanumeric character is the next character in the underlying character set’s collating sequence, carrying to the next-left character for a rollover, and prepending another character if necessary:
  s = 0.chr * 3
  s # => "\x00\x00\x00"
  s.succ # => "\x00\x00\x01"
  s = 255.chr * 3
  s # => "\xFF\xFF\xFF"
  s.succ # => "\x01\x00\x00\x00"
Carrying can occur between and among mixtures of alphanumeric characters:
  s = 'zz99zz99'
  s.succ # => "aaa00aa00"
  s = '99zz99zz'
  s.succ # => "100aa00aa"
The successor to an empty String is a new empty String:
  ''.succ # => ""
String#next is an alias for String#succ.
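The carrying rules above are also what drive String ranges. The sketch below generates a few sequences; the starting values are arbitrary.

```ruby
# Carrying skips over non-alphanumeric separators.
version = '1.9.9'.succ

# A Range of Strings steps from one element to the next with succ.
labels = ('az'..'bc').to_a

# Repeated calls carry from the digit into the letter.
ids = ['a8']
3.times { ids << ids.last.succ }
```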
Equivalent to String#succ, but modifies self in place; returns self.
String#next! is an alias for String#succ!.
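A short sketch of the in-place variant, using a hypothetical file-name counter:

```ruby
# dup gives a mutable copy of the literal before modifying it in place.
counter = 'img_099'.dup

# succ! changes the receiver itself and returns that same object.
returned = counter.succ!

same_object = returned.equal?(counter)
```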