String

Class

A String object holds an arbitrary sequence of bytes, typically representing text or binary data. A String object may be created using String::new or with a literal.

String objects differ from Symbol objects in that Symbol objects are designed to be used as identifiers, instead of text or data.

You can create a String object explicitly with:

A string literal.
A heredoc literal.

You can convert certain objects to Strings with:

Method Kernel#String.

Some String methods modify self. Typically, a method whose name ends with ! modifies self and returns self; often a similarly named method (without the !) returns a new string.

In general, if both bang and non-bang versions of a method exist, the bang version mutates self and the non-bang version does not. However, a method without a bang can also mutate, such as String#replace.
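A minimal sketch of the difference, using String#upcase and String#upcase! as a representative pair. The receivers are built with String.new only so that they are mutable even where string literals are frozen:

```ruby
s = String.new('hello')
t = s.upcase                     # non-bang: returns a new string
s                                # => "hello"
t                                # => "HELLO"
s.upcase!                        # bang: modifies s in place
s                                # => "HELLO"
r = String.new('HELLO').upcase!  # => nil (nothing to change)
s.replace('goodbye')             # no bang, but still modifies s
s                                # => "goodbye"
```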

Substitution Methods

These methods perform substitutions:

String#sub: One substitution (or none); returns a new string.
String#sub!: One substitution (or none); returns self if any changes, nil otherwise.
String#gsub: Zero or more substitutions; returns a new string.
String#gsub!: Zero or more substitutions; returns self if any changes, nil otherwise.
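The return values of the bang forms can be sketched as follows (the variable names are illustrative only):

```ruby
s = String.new('hello')
r = s.sub!(/l/, 'L')     # => "heLlo"
same = r.equal?(s)       # => true (the receiver itself is returned)
none = s.gsub!(/z/, '*') # => nil (nothing matched, s is unchanged)
s                        # => "heLlo"
```

Because the bang forms may return nil, chaining calls on their result is risky; the non-bang forms always return a string.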

Each of these methods takes:

A first argument, pattern (string or regexp), that specifies the substring(s) to be replaced.
Either of these:
- A second argument, replacement (string or hash), that determines the replacing string.
- A block that will determine the replacing string.

The examples in this section mostly use methods String#sub and String#gsub; the principles illustrated apply to all four substitution methods.

Argument pattern

Argument pattern is commonly a regular expression:

s = 'hello'
s.sub(/[aeiou]/, '*')  # => "h*llo"
s.gsub(/[aeiou]/, '*') # => "h*ll*"
s.gsub(/[aeiou]/, '')  # => "hll"
s.sub(/ell/, 'al')     # => "halo"
s.gsub(/xyzzy/, '*')   # => "hello"
'THX1138'.gsub(/\d+/, '00') # => "THX00"

When pattern is a string, all its characters are treated as ordinary characters (not as regexp special characters):

'THX1138'.gsub('\d+', '00') # => "THX1138"

String replacement

If replacement is a string, that string will determine the replacing string that is to be substituted for the matched text.

Each of the examples above uses a simple string as the replacing string.

String replacement may contain back-references to the pattern’s captures:

\n (n a non-negative integer) refers to $n.
\k<name> refers to the named capture name.

See regexp.rdoc for details.
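For illustration, numbered and named back-references in a single-quoted replacement string might look like this (the sample strings are arbitrary):

```ruby
swapped = 'John Smith'.sub(/(\w+) (\w+)/, '\2, \1')
swapped # => "Smith, John"

date = '2024-01-15'.sub(/(?<y>\d+)-(?<m>\d+)-(?<d>\d+)/, '\k<d>/\k<m>/\k<y>')
date    # => "15/01/2024"
```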

Note that within the string replacement, a character combination such as $& is treated as ordinary text, and not as a special match variable. However, you may refer to some special match variables using these combinations:

\& and \0 correspond to $&, which contains the complete matched text.
\' corresponds to $', which contains string after match.
\` corresponds to $`, which contains string before match.
\+ corresponds to $+, which contains last capture group.

See regexp.rdoc for details.
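A short sketch of the whole-match, pre-match, and post-match combinations. The last example uses a double-quoted literal because \' cannot be written directly inside a single-quoted one:

```ruby
a = 'hello'.gsub(/l/, '<\&>') # => "he<l><l>o"
b = 'hello'.sub(/ll/, '[\0]') # => "he[ll]o"
c = 'abc'.sub(/b/, '\`')      # => "aac" (text before the match is "a")
d = 'abc'.sub(/b/, "\\'")     # => "acc" (text after the match is "c")
```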

Note that \\ is interpreted as an escape, i.e., a single backslash.

Note also that a string literal consumes backslashes. See String Literals for details about string literals.

A back-reference is typically preceded by an additional backslash. For example, if you want to write a back-reference \& in replacement with a double-quoted string literal, you need to write "..\\&..".

If you want to write a non-back-reference string \& in replacement, you need first to escape the backslash to prevent this method from interpreting it as a back-reference, and then you need to escape the backslashes again to prevent a string literal from consuming them: "..\\\\&..".

You may want to use the block form to avoid a lot of backslashes.
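The escaping rules above can be sketched with double-quoted literals, alongside the block form, whose return value is used verbatim:

```ruby
ref   = 'a+b'.sub('+', "<\\&>") # => "a<+>b"   (back-reference \&)
plain = 'a+b'.sub('+', "\\\\&") # => "a\\&b"   (literal backslash, then &)
block = 'a+b'.sub('+') { '\&' } # => "a\\&b"   (no back-reference processing)
```

The results are shown in inspect form, so "a\\&b" is the four characters a, \, &, b.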

Hash replacement

If argument replacement is a hash, and pattern matches one of its keys, the replacing string is the value for that key:

h = {'foo' => 'bar', 'baz' => 'bat'}
'food'.sub('foo', h) # => "bard"

Note that a symbol key does not match:

h = {foo: 'bar', baz: 'bat'}
'food'.sub('foo', h) # => "d"
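A hash replacement is often combined with gsub and an alternation, as in this sketch (the hash contents are made up for the example); matched text with no corresponding key is replaced by the empty string:

```ruby
h = {'cat' => 'feline', 'dog' => 'canine'}
both = 'cat and dog'.gsub(/cat|dog/, h) # => "feline and canine"
miss = 'cat and cow'.gsub(/cat|cow/, h) # => "feline and "
```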

Block

In the block form, the current match string is passed to the block; the block’s return value becomes the replacing string:

s = '@'
'1234'.gsub(/\d/) {|match| s.succ! } # => "ABCD"

Special match variables such as $1, $2, $`, $&, and $' are set appropriately.
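As an illustration, a block can compute each replacement from the capture variables:

```ruby
doubled = 'price: 10, tax: 2'.gsub(/(\d+)/) { ($1.to_i * 2).to_s }
doubled # => "price: 20, tax: 4"

titled = 'hello world'.gsub(/\b(\w)(\w*)/) { "#{$1.upcase}#{$2}" }
titled  # => "Hello World"
```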

What’s Here

First, what’s elsewhere. Class String:

Inherits from class Object.
Includes module Comparable.

Here, class String provides methods that are useful for:

Creating a String
Frozen/Unfrozen Strings
Querying
Comparing
Modifying a String
Converting to New String
Converting to Non-String
Iterating

Methods for Creating a String

::new

Returns a new string.
::try_convert

Returns a new string created from a given object.

Methods for a Frozen/Unfrozen String

#+string

Returns a string that is not frozen: self, if not frozen; self.dup otherwise.
#-string

Returns a string that is frozen: self, if already frozen; self.freeze otherwise.
freeze

Freezes self, if not already frozen; returns self.
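A brief sketch of these three methods; the strings are created with String.new so that their initial frozen state does not depend on how literals are treated:

```ruby
s = String.new('abc')
s.frozen?              # => false
same = (+s).equal?(s)  # => true  (already unfrozen, so +s is s itself)
f = -s
f.frozen?              # => true
u = +f
u.frozen?              # => false
u.equal?(f)            # => false (an unfrozen duplicate)
t = String.new('xyz')
t.freeze.equal?(t)     # => true  (freeze returns self)
t.frozen?              # => true
```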

Methods for Querying

Counts

length, size

Returns the count of characters (not bytes).
empty?

Returns true if self.length is zero; false otherwise.
bytesize

Returns the count of bytes.
count

Returns the count of characters in self that are in the character set specified by the given strings.
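For example, with a string that contains multibyte characters (written with \u escapes here; the text is "résumé" in UTF-8):

```ruby
s = "r\u00E9sum\u00E9"
chars = s.length                  # => 6
bytes = s.bytesize                # => 8 (each accented character is two bytes)
s.empty?                          # => false
''.empty?                         # => true
n = 'hello world'.count('lo')     # => 5 (characters that are "l" or "o")
```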

Substrings

#=~

Returns the index of the first substring that matches a given Regexp or other object; returns nil if no match is found.
index

Returns the index of the first occurrence of a given substring; returns nil if none found.
rindex

Returns the index of the last occurrence of a given substring; returns nil if none found.
include?

Returns true if the string contains a given substring; false otherwise.
match

Returns a MatchData object if the string matches a given Regexp; nil otherwise.
match?

Returns true if the string matches a given Regexp; false otherwise.
start_with?

Returns true if the string begins with any of the given substrings.
end_with?

Returns true if the string ends with any of the given substrings.
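A quick tour of the substring queries on one sample string:

```ruby
s = 'hello world'
s =~ /o/                  # => 4
s.index('o')              # => 4
s.rindex('o')             # => 7
s.index('z')              # => nil
s.include?('lo w')        # => true
s.match(/(w\w+)/)[1]      # => "world"
s.match?(/\d/)            # => false
s.start_with?('hi', 'he') # => true
s.end_with?('ld')         # => true
```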

Encodings

encoding

Returns the Encoding object that represents the encoding of the string.
unicode_normalized?

Returns true if the string is in Unicode normalized form; false otherwise.
valid_encoding?

Returns true if the string contains only characters that are valid for its encoding.
ascii_only?

Returns true if the string has only ASCII characters; false otherwise.
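The encoding queries might be exercised like this; the sample strings use \u and \x escapes so the bytes involved are explicit:

```ruby
s = "caf\u00E9"                       # "café", precomposed e-acute
s.encoding                            # => #<Encoding:UTF-8>
s.ascii_only?                         # => false
'cafe'.ascii_only?                    # => true
s.valid_encoding?                     # => true
"\xFF".valid_encoding?                # => false (not a valid UTF-8 sequence)
s.unicode_normalized?                 # => true  (already in NFC)
"cafe\u0301".unicode_normalized?      # => false (e followed by a combining accent)
```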

Other

sum

Returns a basic checksum for the string: the sum of each byte.
hash

Returns the integer hash code.

Methods for Comparing

#==, #===

Returns true if a given other string has the same content as self.
eql?

Returns true if the content is the same as the given other string.
#<=>

Returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp

Ignoring case, returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp?

Returns true if the string is equal to a given string after Unicode case folding; false otherwise.
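A few representative comparisons, collected into an array purely for illustration:

```ruby
results = [
  'abc' <=> 'abd',       # => -1 (self is smaller)
  'abc' <=> 'abc',       # => 0
  'abd' <=> 'abc',       # => 1  (self is larger)
  'abc' <=> 1,           # => nil (not comparable)
  'abc' == 'abc',        # => true
  'abc'.eql?('abc'),     # => true
  'abc'.casecmp('ABD'),  # => -1
  'abc'.casecmp?('ABC')  # => true
]
```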

Methods for Modifying a String

Each of these methods modifies self.

Insertion

insert

Returns self with a given string inserted at a given offset.
<<

Returns self concatenated with a given string or integer.

Substitution

sub!

Replaces the first substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
gsub!

Replaces each substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
succ!, next!

Returns self modified to become its own successor.
replace

Returns self with its entire content replaced by a given string.
reverse!

Returns self with its characters in reverse order.
setbyte

Sets the byte at a given integer offset to a given value; returns the argument.
tr!

Replaces specified characters in self with specified replacement characters; returns self if any changes, nil otherwise.
tr_s!

Replaces specified characters in self with specified replacement characters, removing duplicates from the substrings that were modified; returns self if any changes, nil otherwise.

Casing

capitalize!

Upcases the initial character and downcases all others; returns self if any changes, nil otherwise.
downcase!

Downcases all characters; returns self if any changes, nil otherwise.
upcase!

Upcases all characters; returns self if any changes, nil otherwise.
swapcase!

Upcases each downcase character and downcases each upcase character; returns self if any changes, nil otherwise.

Encoding

encode!

Returns self with all characters transcoded from one given encoding into another.
unicode_normalize!

Unicode-normalizes self; returns self.
scrub!

Replaces each invalid byte with a given character; returns self.
force_encoding

Changes the encoding to a given encoding; returns self.

Deletion

clear

Removes all content, so that self is empty; returns self.
slice!, []=

Removes a substring determined by a given index, start/length, range, regexp, or substring.
squeeze!

Removes contiguous duplicate characters; returns self if any changes, nil otherwise.
delete!

Removes characters as determined by the intersection of substring arguments; returns self if any changes, nil otherwise.
lstrip!

Removes leading whitespace; returns self if any changes, nil otherwise.
rstrip!

Removes trailing whitespace; returns self if any changes, nil otherwise.
strip!

Removes leading and trailing whitespace; returns self if any changes, nil otherwise.
chomp!

Removes trailing record separator, if found; returns self if any changes, nil otherwise.
chop!

Removes the last character, or both characters of a trailing \r\n; returns self if any changes, nil otherwise.
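The deletion methods can be chained on one mutable string to see each effect in turn (a sketch; the starting value is arbitrary):

```ruby
s = String.new("  hello\n")
s.chomp!           # => "  hello"
again = s.chomp!   # => nil (no trailing record separator left)
s.lstrip!          # => "hello"
s.squeeze!         # => "helo"
s.delete!('l')     # => "heo"
s.chop!            # => "he"
first = s.slice!(0) # => "h"
rest = s.dup       # => "e"
s.clear            # => ""
```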

Methods for Converting to New String

Each of these methods returns a new String based on self, often just a modified copy of self.

Extension

*

Returns the concatenation of multiple copies of self.
+

Returns the concatenation of self and a given other string.
center

Returns a copy of self centered between pad substrings.
concat

Returns the concatenation of self with given other strings.
prepend

Returns the concatenation of a given other string with self.
ljust

Returns a copy of self of a given length, right-padded with a given other string.
rjust

Returns a copy of self of a given length, left-padded with a given other string.
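A sketch of the extension methods. Note that, unlike the others shown, concat and prepend modify the receiver and return it, which is why a mutable string from String.new is used for them:

```ruby
rep   = 'ab' * 3              # => "ababab"
joined = 'ab' + 'cd'          # => "abcd"
mid   = 'ab'.center(6, '*')   # => "**ab**"
left  = 'ab'.ljust(5, '.')    # => "ab..."
right = 'ab'.rjust(5, '.')    # => "...ab"

s = String.new('b')
s.concat('c', 'd')            # => "bcd"
s.prepend('a')                # => "abcd"
s                             # => "abcd" (s itself was changed)
```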

Encoding

b

Returns a copy of self with ASCII-8BIT encoding.
scrub

Returns a copy of self with each invalid byte replaced with a given character.
unicode_normalize

Returns a copy of self with each character Unicode-normalized.
encode

Returns a copy of self with all characters transcoded from one given encoding into another.

Substitution

dump

Returns a copy of self with all non-printing characters replaced by \xHH notation and all special characters escaped.
undump

Returns a copy of self with all \xNN notation replaced by \uNNNN notation and all escaped characters unescaped.
sub

Returns a copy of self with the first substring matching a given pattern replaced with a given replacement string.
gsub

Returns a copy of self with each substring that matches a given pattern replaced with a given replacement string.
succ, next

Returns the string that is the successor to self.
reverse

Returns a copy of self with its characters in reverse order.
tr

Returns a copy of self with specified characters replaced with specified replacement characters.
tr_s

Returns a copy of self with specified characters replaced with specified replacement characters, removing duplicates from the substrings that were modified.
%

Returns the string resulting from formatting a given object into self.

Casing

capitalize

Returns a copy of self with the first character upcased and all other characters downcased.
downcase

Returns a copy of self with all characters downcased.
upcase

Returns a copy of self with all characters upcased.
swapcase

Returns a copy of self with all upcase characters downcased and all downcase characters upcased.

Deletion

delete

Returns a copy of self with specified characters removed.
delete_prefix

Returns a copy of self with a given prefix removed.
delete_suffix

Returns a copy of self with a given suffix removed.
lstrip

Returns a copy of self with leading whitespace removed.
rstrip

Returns a copy of self with trailing whitespace removed.
strip

Returns a copy of self with leading and trailing whitespace removed.
chomp

Returns a copy of self with a trailing record separator removed, if found.
chop

Returns a copy of self with the last character removed, or with both characters of a trailing \r\n removed.
squeeze

Returns a copy of self with contiguous duplicate characters removed.
[], slice

Returns a substring determined by a given index, start/length, range, regexp, or string.
byteslice

Returns a substring determined by a given index, start/length, or range.
chr

Returns the first character.
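The non-mutating deletion and slicing methods, sketched on small sample strings; each call leaves its receiver unchanged:

```ruby
s = "  hello.rb\n"
s.strip                          # => "hello.rb"
s.chomp                          # => "  hello.rb"
'hello.rb'.delete_suffix('.rb')  # => "hello"
'hello.rb'.delete_prefix('hel')  # => "lo.rb"
'hello'.delete('l')              # => "heo"
'aaabbb'.squeeze                 # => "ab"
'hello'.chop                     # => "hell"
"hello\r\n".chop                 # => "hello"
'hello'[1, 3]                    # => "ell"
'hello'[1..-1]                   # => "ello"
'hello'[/l+/]                    # => "ll"
'hello'.chr                      # => "h"
```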

Duplication

to_s, to_str

If self is an instance of a subclass of String, returns self copied into a String; otherwise, returns self.

Methods for Converting to Non-String

Each of these methods converts the contents of self to a non-String.

Characters, Bytes, and Clusters

bytes

Returns an array of the bytes in self.
chars

Returns an array of the characters in self.
codepoints

Returns an array of the integer ordinals in self.
getbyte

Returns an integer byte as determined by a given index.
grapheme_clusters

Returns an array of the grapheme clusters in self.

Splitting

lines

Returns an array of the lines in self, as determined by a given record separator.
partition

Returns a 3-element array (text before, match, text after) determined by the first substring that matches a given substring or regexp.
rpartition

Returns a 3-element array (text before, match, text after) determined by the last substring that matches a given substring or regexp.
split

Returns an array of substrings determined by a given delimiter – regexp or string – or, if a block given, passes those substrings to the block.
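For illustration, the splitting methods applied to small inputs:

```ruby
ls    = "a\nb\n".lines                # => ["a\n", "b\n"]
head  = 'key=value=x'.partition('=')  # => ["key", "=", "value=x"]
tail  = 'key=value=x'.rpartition('=') # => ["key=value", "=", "x"]
parts = 'a, b,c'.split(/,\s*/)        # => ["a", "b", "c"]
words = 'a b  c'.split                # => ["a", "b", "c"]
```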

Matching

scan

Returns an array of substrings matching a given regexp or string, or, if a block given, passes each matching substring to the block.
unpack

Returns an array of substrings extracted from self according to a given format.
unpack1

Returns the first substring extracted from self according to a given format.
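A small sketch of the matching methods; the 'C' format directive used with unpack reads unsigned 8-bit integers:

```ruby
nums  = 'a1b22c333'.scan(/\d+/)      # => ["1", "22", "333"]
pairs = 'a1b22'.scan(/([a-z])(\d+)/) # => [["a", "1"], ["b", "22"]]
codes = 'abc'.unpack('C*')           # => [97, 98, 99]
code  = 'abc'.unpack1('C')           # => 97
```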

Numerics

hex

Returns the integer value of the leading characters, interpreted as hexadecimal digits.
oct

Returns the integer value of the leading characters, interpreted as octal digits.
ord

Returns the integer ordinal of the first character in self.
to_i

Returns the integer value of leading characters, interpreted as an integer.
to_f

Returns the floating-point value of leading characters, interpreted as a floating-point number.
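The numeric conversions read only the leading characters they can interpret, as this sketch shows:

```ruby
values = [
  'ff'.hex,        # => 255
  '0x1A'.hex,      # => 26
  '777'.oct,       # => 511
  'a'.ord,         # => 97
  '42abc'.to_i,    # => 42
  'abc'.to_i,      # => 0 (no leading digits)
  '101'.to_i(2),   # => 5 (base 2)
  '3.14xyz'.to_f   # => 3.14
]
```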

Strings and Symbols

inspect

Returns a copy of self, enclosed in double-quotes, with special characters escaped.
to_sym, intern

Returns the symbol corresponding to self.

Methods for Iterating

each_byte

Calls the given block with each successive byte in self.
each_char

Calls the given block with each successive character in self.
each_codepoint

Calls the given block with each successive integer codepoint in self.
each_grapheme_cluster

Calls the given block with each successive grapheme cluster in self.
each_line

Calls the given block with each successive line in self, as determined by a given record separator.
upto

Calls the given block with each string value returned by successive calls to succ.
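A sketch of the iteration methods, collecting what each block receives into arrays (the array names are illustrative):

```ruby
bytes = []
'hi'.each_byte { |b| bytes << b }
bytes # => [104, 105]

chars = []
'hi'.each_char { |c| chars << c }
chars # => ["h", "i"]

lines = []
"a\nb\n".each_line(chomp: true) { |l| lines << l }
lines # => ["a", "b"]

range = []
'a8'.upto('b1') { |s| range << s }
range # => ["a8", "a9", "b0", "b1"]

# Without a block, each_char returns an enumerator.
indexed = 'hi'.each_char.with_index.to_a # => [["h", 0], ["i", 1]]
```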

Class Methods

String.new(string = '') → new_string

String.new(string = '', encoding: encoding) → new_string

String.new(string = '', capacity: size) → new_string

String.try_convert(object) → object, new_string, or nil

string << object → string

string <=> other_string → -1, 0, 1, or nil

string =~ regexp → integer or nil

string =~ object → integer or nil

string == object → true or false

string === object → true or false

-string → frozen_string

string[index] → new_string or nil

string[start, length] → new_string or nil

string[range] → new_string or nil

string[regexp, capture = 0] → new_string or nil

string[substring] → new_string or nil

str[integer] = new_str

str[integer, integer] = new_str

str[range] = aString

str[regexp] = new_str

str[regexp, integer] = new_str

str[regexp, name] = new_str

str[other_str] = new_str

string * integer → new_string

string % object → new_string

string + other_string → new_string

+string → new_string or self

str.ascii_only? → true or false

str.b → str

str.bytes → an_array

bytesize → integer

byteslice(index, length = 1) → string or nil

byteslice(range) → string or nil

capitalize(*options) → string

capitalize!(*options) → self or nil

casecmp(other_string) → -1, 0, 1, or nil

casecmp?(other_string) → true, false, or nil

str.center(width, padstr=' ') → new_str

str.chars → an_array

str.chomp(separator=$/) → new_str

str.chomp!(separator=$/) → str or nil

str.chop → new_str

str.chop! → str or nil

chr → string

clear → self

str.codepoints → an_array

concat(*objects) → string

str.count([other_str]+) → integer

str.crypt(salt_str) → new_str

str.delete([other_str]+) → new_str

str.delete!([other_str]+) → str or nil

str.delete_prefix(prefix) → new_str

str.delete_prefix!(prefix) → self or nil

str.delete_suffix(suffix) → new_str

str.delete_suffix!(suffix) → self or nil

downcase(*options) → string

downcase!(*options) → self or nil

dump → string

str.each_byte {|integer| block } → str

str.each_byte → an_enumerator

str.each_char {|cstr| block } → str

str.each_char → an_enumerator

str.each_codepoint {|integer| block } → str

str.each_codepoint → an_enumerator

str.each_grapheme_cluster {|cstr| block } → str

str.each_grapheme_cluster → an_enumerator

str.each_line(separator=$/, chomp: false) {|substr| block } → str

str.each_line(separator=$/, chomp: false) → an_enumerator

empty? → true or false
