String

Class

A String object has an arbitrary sequence of bytes, typically representing text or binary data. A String object may be created using String::new or written as a literal.

String objects differ from Symbol objects in that Symbol objects are designed to be used as identifiers, instead of text or data.

You can create a String object explicitly with:

A string literal.
A heredoc literal.

You can convert certain objects to Strings with:

Method Kernel#String.
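As a brief illustration of the creation and conversion forms listed above (the variable names are made up for this sketch):

```ruby
# Explicit creation: a string literal and a "squiggly" heredoc literal.
greeting = 'hello'
poem = <<~EOT
  roses are red
  violets are blue
EOT

# Conversion with the Kernel#String method.
from_integer = String(42)    # => "42"
from_symbol  = String(:name) # => "name"
from_nil     = String(nil)   # => ""
```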

Some String methods modify self. Typically, a method whose name ends with ! modifies self and returns self; often, a similarly named method (without the !) returns a new string.

In general, if both bang and non-bang versions of a method exist, the bang method mutates and the non-bang method does not. However, a method without a bang can also mutate, such as String#replace.
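A minimal sketch of the bang/non-bang distinction, using String#upcase and String#upcase! as one representative pair (dup is used here only to obtain a mutable copy of each literal):

```ruby
s = 'hello'.dup
t = s.upcase          # => "HELLO"; s is still "hello"
r = s.upcase!         # => "HELLO"; s itself is now "HELLO"
same = r.equal?(s)    # => true (the bang method returned self)
again = s.upcase!     # => nil (nothing left to change)

# String#replace has no bang, yet it mutates the receiver.
u = 'old'.dup
u.replace('new')
u                     # => "new"
```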

Substitution Methods

These methods perform substitutions:

String#sub: One substitution (or none); returns a new string.
String#sub!: One substitution (or none); returns self if any changes, nil otherwise.
String#gsub: Zero or more substitutions; returns a new string.
String#gsub!: Zero or more substitutions; returns self if any changes, nil otherwise.

Each of these methods takes:

A first argument, pattern (String or Regexp), that specifies the substring(s) to be replaced.
Either of the following:
- A second argument, replacement (String or Hash), that determines the replacing string.
- A block that will determine the replacing string.

The examples in this section mostly use the String#sub and String#gsub methods; the principles illustrated apply to all four substitution methods.
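For instance, the return values of the four methods can be compared side by side (a small sketch; the receiver is a mutable copy so that the bang methods can modify it):

```ruby
s = 'hello'.dup
a = s.sub('l', 'L')    # => "heLlo" (new string; s is unchanged)
b = s.gsub('l', 'L')   # => "heLLo" (new string; s is unchanged)
c = s.sub!('z', 'Z')   # => nil (no match, so no change)
d = s.gsub!('l', 'L')  # => "heLLo" (self, now modified)
```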

Argument pattern

Argument pattern is commonly a regular expression:

s = 'hello'
s.sub(/[aeiou]/, '*') # => "h*llo"
s.gsub(/[aeiou]/, '*') # => "h*ll*"
s.gsub(/[aeiou]/, '')  # => "hll"
s.sub(/ell/, 'al')     # => "halo"
s.gsub(/xyzzy/, '*')   # => "hello"
'THX1138'.gsub(/\d+/, '00') # => "THX00"

When pattern is a string, all its characters are treated as ordinary characters (not as Regexp special characters):

'THX1138'.gsub('\d+', '00') # => "THX1138"
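The difference is easiest to see with a pattern character that is special in a Regexp, such as the dot. The following sketch also uses Regexp.escape, which is one way to build an equivalent regexp from a literal string:

```ruby
dotted = 'a.b.c'
with_string = dotted.gsub('.', '-') # => "a-b-c" (the dot is an ordinary character)
with_regexp = dotted.gsub(/./, '-') # => "-----" (the dot matches any character)
escaped = dotted.gsub(Regexp.new(Regexp.escape('.')), '-') # => "a-b-c"
```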

String replacement

If replacement is a string, that string determines the replacing string that is substituted for the matched text.

Each of the examples above uses a simple string as the replacing string.

String replacement may contain back-references to the pattern’s captures:

\n (n is a non-negative integer) refers to $n.
\k<name> refers to the named capture name.

See Regexp for details.
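A short sketch of both kinds of back-reference; single-quoted literals are used so that the backslashes reach the method unchanged:

```ruby
name = 'John Smith'
numbered = name.sub(/(\w+) (\w+)/, '\2, \1')
# => "Smith, John"
named = name.sub(/(?<first>\w+) (?<last>\w+)/, '\k<last>, \k<first>')
# => "Smith, John"
```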

Note that within the string replacement, a character combination such as $& is treated as ordinary text, not as a special match variable. However, you may refer to some special match variables using these combinations:

\& and \0 correspond to $&, which contains the complete matched text.
\' corresponds to $', which contains the string after the match.
\` corresponds to $`, which contains the string before the match.
\+ corresponds to $+, which contains the last capture group.

See Regexp for details.
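As an illustration of these combinations (a sketch; the post-match form uses a double-quoted literal because \' inside a single-quoted literal would be read as an escaped quote):

```ruby
s = 'hello'
whole_amp  = s.sub(/ll/, '[\&]')   # => "he[ll]o"
whole_zero = s.sub(/ll/, '[\0]')   # => "he[ll]o"
pre_match  = s.sub(/ll/, '[\`]')   # => "he[he]o"
post_match = s.sub(/ll/, "[\\']")  # => "he[o]o"
plain_text = s.sub(/ll/, '[$&]')   # => "he[$&]o" ($& is ordinary text here)
```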

Note that \\ is interpreted as an escape, i.e., a single backslash.

Note also that a string literal consumes backslashes. See String Literals for details about string literals.

A back-reference is typically preceded by an additional backslash. For example, if you want to write a back-reference \& in replacement with a double-quoted string literal, you need to write "..\\&..".

If you want to write a non-back-reference string \& in replacement, you need to first escape the backslash to prevent this method from interpreting it as a back-reference, and then you need to escape the backslashes again to prevent a string literal from consuming them: "..\\\\&..".

You may want to use the block form to avoid excessive backslashes.
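The three cases described above can be compared directly. In this sketch the comments show each result as irb would print it, so a single backslash in the result appears as \\:

```ruby
backref  = 'a+b'.sub('+', "<\\&>")    # => "a<+>b"   (replacement text is <\&>)
literal  = 'a+b'.sub('+', "<\\\\&>")  # => "a<\\&>b" (replacement text is <\\&>)
by_block = 'a+b'.sub('+') { '<\&>' }  # => "a<\\&>b" (block result is used as-is)
```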

Hash replacement

If the argument replacement is a hash, and pattern matches one of its keys, the replacing string is the value for that key:

h = {'foo' => 'bar', 'baz' => 'bat'}
'food'.sub('foo', h) # => "bard"

Note that a symbol key does not match:

h = {foo: 'bar', baz: 'bat'}
'food'.sub('foo', h) # => "d"
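With a regexp pattern and gsub, each matched substring is looked up separately. A small sketch (the hash name leet is arbitrary):

```ruby
leet = { 'e' => '3', 'o' => '0' }
all_found    = 'hello'.gsub(/[eo]/, leet)  # => "h3ll0"
some_missing = 'hello'.gsub(/[elo]/, leet) # => "h30" ('l' has no key, so it becomes "")
```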

Block

In the block form, the current match string is passed to the block; the block’s return value becomes the replacing string:

s = '@'
'1234'.gsub(/\d/) { |match| s.succ! } # => "ABCD"

Special match variables such as $1, $2, $`, $&, and $' are set appropriately.
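For example, a block can combine capture variables or compute a replacement from the matched text (a minimal sketch):

```ruby
titled  = 'hello world'.gsub(/(\w)(\w*)/) { "#{$1.upcase}#{$2}" }
# => "Hello World"
doubled = 'a1b2'.gsub(/\d/) { |digit| (digit.to_i * 2).to_s }
# => "a2b4"
```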

Whitespace in Strings

In the class String, whitespace is defined as a contiguous sequence of characters consisting of any mixture of the following:

NUL (null): "\x00", "\u0000".
HT (horizontal tab): "\x09", "\t".
LF (line feed): "\x0a", "\n".
VT (vertical tab): "\x0b", "\v".
FF (form feed): "\x0c", "\f".
CR (carriage return): "\x0d", "\r".
SP (space): "\x20", " ".

Whitespace is relevant for the following methods:

lstrip, lstrip!: Strip leading whitespace.
rstrip, rstrip!: Strip trailing whitespace.
strip, strip!: Strip leading and trailing whitespace.
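A quick sketch of the stripping methods on a string padded with several of the whitespace characters listed above:

```ruby
padded = "\t\n hello \r\n"
left  = padded.lstrip       # => "hello \r\n"
right = padded.rstrip       # => "\t\n hello"
both  = padded.strip        # => "hello"
nulls = "\x00 abc".strip    # => "abc" (null counts as whitespace)
unchanged = 'abc'.dup.strip! # => nil (nothing to strip)
```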

What’s Here

First, what’s elsewhere. Class String:

Inherits from the Object class.
Includes the Comparable module.

Here, class String provides methods that are useful for:

Creating a String

::new: Returns a new string.
::try_convert: Returns a new string created from a given object.
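A short sketch of both class methods. Note that String.try_convert hands back the argument itself when it is already a String:

```ruby
empty = String.new
empty_encoding = empty.encoding       # => #<Encoding:BINARY (ASCII-8BIT)>
copy = String.new('abc')              # => "abc"

original = 'abc'
converted = String.try_convert(original)
same_object = converted.equal?(original)  # => true
not_convertible = String.try_convert(42)  # => nil
```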

Freezing/Unfreezing

+@: Returns a string that is not frozen: self if not frozen; self.dup otherwise.
-@ (aliased as dedup): Returns a string that is frozen: self if already frozen; otherwise a frozen, possibly pre-existing, copy of self.
freeze: Freezes self if not already frozen; returns self.
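The unary operators can be sketched as follows (variable names are illustrative only):

```ruby
frozen = 'abc'.freeze
thawed = +frozen                 # an unfrozen copy
thawed_frozen = thawed.frozen?   # => false
distinct = thawed.equal?(frozen) # => false

mutable = 'abc'.dup
kept = (+mutable).equal?(mutable) # => true (already unfrozen, so self is returned)
interned = -mutable
interned_frozen = interned.frozen? # => true
```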

Querying

Counts

bytesize: Returns the count of bytes.
count: Returns the count of characters specified by given selectors.
empty?: Returns whether the length of self is zero.
length (aliased as size): Returns the count of characters (not bytes).
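The difference between characters and bytes shows up as soon as a string holds non-ASCII text. A minimal sketch, written with \u escapes so that the string is UTF-8:

```ruby
word = "r\u00e9sum\u00e9"   # "résumé"
chars = word.length          # => 6
bytes = word.bytesize        # => 8 (each "é" occupies two bytes in UTF-8)
blank = ''.empty?            # => true
l_and_o = 'hello world'.count('lo') # => 5 (three "l" plus two "o")
```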

Substrings

=~: Returns the index of the first substring that matches a given Regexp or other object; returns nil if no match is found.
byteindex: Returns the byte index of the first occurrence of a given substring.
byterindex: Returns the byte index of the last occurrence of a given substring.
index: Returns the index of the first occurrence of a given substring; returns nil if none found.
rindex: Returns the index of the last occurrence of a given substring; returns nil if none found.
include?: Returns true if the string contains a given substring; false otherwise.
match: Returns a MatchData object if the string matches a given Regexp; nil otherwise.
match?: Returns true if the string matches a given Regexp; false otherwise.
start_with?: Returns true if the string begins with any of the given substrings.
end_with?: Returns true if the string ends with any of the given substrings.
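A few of these queries in action (a sketch; the last two lines contrast character and byte offsets):

```ruby
s = 'hello world'
first_o  = s.index('o')               # => 4
last_o   = s.rindex('o')              # => 7
missing  = s.index('z')               # => nil
regexp_at = (s =~ /wor/)              # => 6
contains = s.include?('lo w')         # => true
digits   = s.match?(/\d/)             # => false
starts   = s.start_with?('hi', 'he')  # => true
ends     = s.end_with?('ld')          # => true

char_offset = "\u00e9tude".index('t')     # => 1
byte_offset = "\u00e9tude".byteindex('t') # => 2
```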

Encodings

encoding: Returns the Encoding object that represents the encoding of the string.
unicode_normalized?: Returns true if the string is in Unicode normalized form; false otherwise.
valid_encoding?: Returns true if the string contains only characters that are valid for its encoding.
ascii_only?: Returns true if the string has only ASCII characters; false otherwise.
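A brief sketch of the encoding queries; force_encoding is used so that the example does not depend on the source file's encoding:

```ruby
accented = "caf\u00e9"
enc        = accented.encoding      # => #<Encoding:UTF-8>
ascii      = 'abc'.ascii_only?      # => true
non_ascii  = accented.ascii_only?   # => false

broken = "abc\xFF".dup.force_encoding('UTF-8')
valid = broken.valid_encoding?      # => false (0xFF never occurs in valid UTF-8)

composed   = "\u00e9".unicode_normalized?  # => true
decomposed = "e\u0301".unicode_normalized? # => false
```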

Other

sum: Returns a basic checksum for the string: the sum of each byte.
hash: Returns the integer hash code.

Comparing

== (aliased as ===): Returns true if a given other string has the same content as self.
eql?: Returns true if the content is the same as the given other string.
<=>: Returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp: Ignoring case, returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp?: Ignoring case, returns whether a given other string is equal to self.
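The comparison methods can be sketched as:

```ruby
less    = ('a' <=> 'b')          # => -1
greater = ('b' <=> 'a')          # => 1
equal   = ('a' <=> 'a')          # => 0
other   = ('a' <=> 1)            # => nil (not comparable)
same    = ('abc' == 'abc')       # => true
strict  = 'abc'.eql?('abc')      # => true
folded  = 'FOO'.casecmp('foo')   # => 0
folded_eq = 'FOO'.casecmp?('foo') # => true
```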

Modifying

Each of these methods modifies self.

Insertion

insert: Returns self with a given string inserted at a specified offset.
<<: Returns self concatenated with a given string or integer.
append_as_bytes: Returns self concatenated with strings without performing any encoding validation or conversion.
prepend: Prefixes to self the concatenation of given other strings.
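A small sketch of the insertion methods applied in sequence to one mutable string:

```ruby
s = 'ace'.dup
s.insert(1, 'b')     # => "abce"
s << 'f'             # => "abcef"
s << 33              # => "abcef!" (an integer is appended as a codepoint)
s.prepend('>', ' ')  # => "> abcef!"
```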

Substitution

bytesplice: Replaces bytes of self with bytes from a given string; returns self.
sub!: Replaces the first substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
gsub!: Replaces each substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
succ! (aliased as next!): Returns self modified to become its own successor.
replace: Returns self with its entire content replaced by a given string.
reverse!: Returns self with its characters in reverse order.
setbyte: Sets the byte at a given integer offset to a given value; returns the argument.
tr!: Replaces specified characters in self with specified replacement characters; returns self if any changes, nil otherwise.
tr_s!: Replaces specified characters in self with specified replacement characters, removing duplicates from the substrings that were modified; returns self if any changes, nil otherwise.
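Several of these methods applied in sequence to one mutable string (a sketch; the values are chosen only for illustration):

```ruby
s = 'hello'.dup
changed   = s.tr!('el', 'ip')   # => "hippo"
unchanged = s.tr!('xyz', 'abc') # => nil (no such characters in s)
s.succ!                         # => "hippp"
s.reverse!                      # => "pppih"
s.replace('bye')                # => "bye"
byte = s.setbyte(0, 66)         # => 66 (the argument); s is now "Bye"
```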

Casing

capitalize!: Upcases the initial character and downcases all others; returns self if any changes, nil otherwise.
downcase!: Downcases all characters; returns self if any changes, nil otherwise.
upcase!: Upcases all characters; returns self if any changes, nil otherwise.
swapcase!: Upcases each downcase character and downcases each upcase character; returns self if any changes, nil otherwise.

Encoding

encode!: Returns self with all characters transcoded from one encoding to another.
unicode_normalize!: Unicode-normalizes self; returns self.
scrub!: Replaces each invalid byte with a given character; returns self.
force_encoding: Changes the encoding to a given encoding; returns self.
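The difference between relabelling and transcoding can be sketched as follows: force_encoding leaves the bytes alone, while encode! rewrites them.

```ruby
latin1 = "\xE9".dup.force_encoding('ISO-8859-1') # relabel only
before = latin1.bytes      # => [233]
latin1.encode!('UTF-8')    # transcode in place
after = latin1.bytes       # => [195, 169]
is_e_acute = (latin1 == "\u00e9") # => true

broken = "abc\xFF".dup.force_encoding('UTF-8')
broken.scrub!('?')         # => "abc?"
```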

Deletion

clear: Removes all content, so that self is empty; returns self.
slice!, []=: Removes a substring determined by a given index, start/length, range, regexp, or substring.
squeeze!: Removes contiguous duplicate characters; returns self if any changes, nil otherwise.
delete!: Removes characters as determined by the intersection of substring arguments.
delete_prefix!: Removes leading prefix; returns self if any changes, nil otherwise.
delete_suffix!: Removes trailing suffix; returns self if any changes, nil otherwise.
lstrip!: Removes leading whitespace; returns self if any changes, nil otherwise.
rstrip!: Removes trailing whitespace; returns self if any changes, nil otherwise.
strip!: Removes leading and trailing whitespace; returns self if any changes, nil otherwise.
chomp!: Removes the trailing record separator, if found; returns self if any changes, nil otherwise.
chop!: Removes trailing newline characters if found; otherwise removes the last character; returns self if any changes, nil otherwise.
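A sketch of several deletion methods applied in sequence to one mutable string:

```ruby
s = "hello world\r\n".dup
s.chomp!                        # => "hello world"
second = s.chomp!               # => nil (no trailing separator left)
removed = s.slice!(0, 6)        # => "hello "; s is now "world"
s.delete_prefix!('wo')          # => "rld"
no_suffix = s.delete_suffix!('x') # => nil
s.chop!                         # => "rl"
s.clear                         # => ""
```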

Converting to New String

Each of these methods returns a new String based on self, often just a modified copy of self.

Extension

*: Returns the concatenation of multiple copies of self.
+: Returns the concatenation of self and a given other string.
center: Returns a copy of self, centered by specified padding.
concat: Returns the concatenation of self with given other strings.
ljust: Returns a copy of self of a given length, right-padded with a given other string.
rjust: Returns a copy of self of a given length, left-padded with a given other string.
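For example (a minimal sketch of the repetition, concatenation, and padding methods):

```ruby
repeated = 'ab' * 3              # => "ababab"
joined   = 'foo' + 'bar'         # => "foobar"
centered = 'hi'.center(6, '*')   # => "**hi**"
left     = 'hi'.ljust(5, '.')    # => "hi..."
right    = 'hi'.rjust(5)         # => "   hi"
```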

Encoding

b: Returns a copy of self with ASCII-8BIT encoding.
scrub: Returns a copy of self with each invalid byte replaced with a given character.
unicode_normalize: Returns a copy of self with each character Unicode-normalized.
encode: Returns a copy of self with all characters transcoded from one encoding to another.
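A short sketch of the non-mutating encoding methods; each returns a new string and leaves the receiver as it was:

```ruby
text = "caf\u00e9"
binary_enc = text.b.encoding                   # => #<Encoding:BINARY (ASCII-8BIT)>
latin1_bytes = text.encode('ISO-8859-1').bytes # => [99, 97, 102, 233]
normalized = "e\u0301".unicode_normalize       # => "é" (a single composed character)
still_utf8 = text.encoding                     # => #<Encoding:UTF-8>
```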

Substitution

dump: Returns a printable version of self, enclosed in double-quotes.
undump: Inverse of dump; returns a copy of self with changes of the kinds made by dump “undone.”
sub: Returns a copy of self with the first substring matching a given pattern replaced with a given replacement string.
gsub: Returns a copy of self with each substring that matches a given pattern replaced with a given replacement string.
succ (aliased as next): Returns the string that is the successor to self.
reverse: Returns a copy of self with its characters in reverse order.
tr: Returns a copy of self with specified characters replaced with specified replacement characters.
tr_s: Returns a copy of self with specified characters replaced with specified replacement characters, removing duplicates from the substrings that were modified.
%: Returns the string resulting from formatting a given object into self.
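A few of these in action (a sketch; the dump/undump pair is shown as a round trip rather than by its exact output):

```ruby
translated = 'hello'.tr('el', 'ip')   # => "hippo"
squeezed   = 'hello'.tr_s('l', 'r')   # => "hero"
successor  = 'az'.succ                # => "ba"
reversed   = 'hello'.reverse          # => "olleh"
formatted  = '%05.1f' % 3.14159       # => "003.1"
several    = '%s-%d' % ['x', 7]       # => "x-7"

original = "tab\there"
round_trip = original.dump.undump     # same content as original
```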

Casing

capitalize: Returns a copy of self with the first character upcased and all other characters downcased.
downcase: Returns a copy of self with all characters downcased.
upcase: Returns a copy of self with all characters upcased.
swapcase: Returns a copy of self with all upcase characters downcased and all downcase characters upcased.
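For example:

```ruby
capitalized = 'hello World'.capitalize # => "Hello world"
lowered     = 'Hello'.downcase         # => "hello"
raised      = 'Hello'.upcase           # => "HELLO"
swapped     = 'Hello'.swapcase         # => "hELLO"
```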

Deletion

delete: Returns a copy of self with characters removed.
delete_prefix: Returns a copy of self with a given prefix removed.
delete_suffix: Returns a copy of self with a given suffix removed.
lstrip: Returns a copy of self with leading whitespace removed.
rstrip: Returns a copy of self with trailing whitespace removed.
strip: Returns a copy of self with leading and trailing whitespace removed.
chomp: Returns a copy of self with a trailing record separator removed, if found.
chop: Returns a copy of self with trailing newline characters or the last character removed.
squeeze: Returns a copy of self with contiguous duplicate characters removed.
[] (aliased as slice): Returns a substring determined by a given index, start/length, range, regexp, or string.
byteslice: Returns a substring determined by a given index, start/length, or range.
chr: Returns the first character.
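A sketch covering the slicing forms of [] and a few of the other methods in this group:

```ruby
s = 'hello'
by_index   = s[1]        # => "e"
by_length  = s[1, 3]     # => "ell"
by_range   = s[1..]      # => "ello"
by_regexp  = s[/l+/]     # => "ll"
by_string  = s['ell']    # => "ell"
out_of_range = s[10]     # => nil
first_bytes = s.byteslice(0, 2) # => "he"
first_char  = s.chr      # => "h"

without_l = s.delete('l')        # => "heo"
squeezed  = 'aaabbb'.squeeze     # => "ab"
chomped   = "line\n".chomp       # => "line"
chopped   = 'line'.chop          # => "lin"
body      = 'prefix_body'.delete_prefix('prefix_') # => "body"
```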

Duplication

to_s (aliased as to_str): If self is an instance of a subclass of String, returns self copied into a String; otherwise, returns self.

Converting to Non-String

Each of these methods converts the contents of self to a non-String.

Characters, Bytes, and Clusters

bytes: Returns an array of the bytes in self.
chars: Returns an array of the characters in self.
codepoints: Returns an array of the integer ordinals in self.
getbyte: Returns the integer byte at the given index in self.
grapheme_clusters: Returns an array of the grapheme clusters in self.
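The following sketch shows how the same string looks at each level; the second string is "e" followed by a combining acute accent, which forms two characters but one grapheme cluster:

```ruby
s = "h\u00e9"                 # "hé"
bytes      = s.bytes           # => [104, 195, 169]
chars      = s.chars           # => ["h", "é"]
codepoints = s.codepoints      # => [104, 233]
second_byte = s.getbyte(1)     # => 195

combined = "e\u0301"
char_count    = combined.chars.size             # => 2
cluster_count = combined.grapheme_clusters.size # => 1
```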

Splitting

lines: Returns an array of the lines in self, as determined by a given record separator.
partition: Returns a 3-element array determined by the first substring that matches a given substring or regexp.
rpartition: Returns a 3-element array determined by the last substring that matches a given substring or regexp.
split: Returns an array of substrings determined by a given delimiter – regexp or string – or, if a block is given, passes those substrings to the block.
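For example (a minimal sketch of each splitting method):

```ruby
lines   = "a\nb\n".lines               # => ["a\n", "b\n"]
chomped = "a\nb\n".lines(chomp: true)  # => ["a", "b"]
head    = 'k=v=w'.partition('=')       # => ["k", "=", "v=w"]
tail    = 'k=v=w'.rpartition('=')      # => ["k=v", "=", "w"]
fields  = 'a,b,,c'.split(',')          # => ["a", "b", "", "c"]
words   = 'a b  c'.split               # => ["a", "b", "c"]
```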

Matching

scan: Returns an array of substrings matching a given regexp or string, or, if a block is given, passes each matching substring to the block.
unpack: Returns an array of substrings extracted from self according to a given format.
unpack1: Returns the first substring extracted from self according to a given format.
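A sketch of scan, with and without capture groups, and of the unpack methods using the "C" (unsigned 8-bit integer) directive:

```ruby
numbers = 'a1b22c333'.scan(/\d+/)            # => ["1", "22", "333"]
pairs   = 'k1=v1;k2=v2'.scan(/(\w+)=(\w+)/)  # => [["k1", "v1"], ["k2", "v2"]]
all_bytes  = 'AB'.unpack('C*')               # => [65, 66]
first_byte = 'AB'.unpack1('C')               # => 65
```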

Numerics

hex: Returns the integer value of the leading characters, interpreted as hexadecimal digits.
oct: Returns the integer value of the leading characters, interpreted as octal digits.
ord: Returns the integer ordinal of the first character in self.
to_c: Returns the complex value of leading characters, interpreted as a complex number.
to_i: Returns the integer value of leading characters, interpreted as an integer.
to_f: Returns the floating-point value of leading characters, interpreted as a floating-point number.
to_r: Returns the rational value of leading characters, interpreted as a rational.
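The "leading characters" rule means that trailing non-numeric text is ignored, and that a string with no numeric prefix converts to zero. A short sketch:

```ruby
from_hex  = '1a'.hex        # => 26
prefixed  = '0x1a'.hex      # => 26
from_oct  = '777'.oct       # => 511
ordinal   = 'a'.ord         # => 97
integer   = '12abc'.to_i    # => 12
no_digits = 'abc'.to_i      # => 0
float     = '3.7xyz'.to_f   # => 3.7
rational  = '1/3'.to_r      # => (1/3)
complex   = '1+2i'.to_c     # => (1+2i)
```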

Strings and Symbols

inspect: Returns a copy of self, enclosed in double quotes, with special characters escaped.
intern (aliased as to_sym): Returns the symbol corresponding to self.
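For example:

```ruby
shown  = "a\tb".inspect   # => "\"a\\tb\"" (the tab is shown as the two characters \t)
simple = 'foo'.intern     # => :foo
spaced = 'foo bar'.to_sym # => :"foo bar"
```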

Iterating

each_byte: Calls the given block with each successive byte in self.
each_char: Calls the given block with each successive character in self.
each_codepoint: Calls the given block with each successive integer codepoint in self.
each_grapheme_cluster: Calls the given block with each successive grapheme cluster in self.
each_line: Calls the given block with each successive line in self, as determined by a given record separator.
upto: Calls the given block with each string value returned by successive calls to succ.
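A brief sketch of the iterators. When called without a block, these methods return an enumerator, which is collected with to_a here:

```ruby
collected = []
'abc'.each_char { |char| collected << char }  # collected is ["a", "b", "c"]

bytes = 'abc'.each_byte.to_a                  # => [97, 98, 99]
lines = "a\nb\n".each_line(chomp: true).to_a  # => ["a", "b"]
steps = 'az'.upto('bc').to_a                  # => ["az", "ba", "bb", "bc"]
```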

Class Methods

json_create(o)

# File tmp/rubies/ruby-4.0.0/ext/json/lib/json/add/string.rb, line 11
def self.json_create(object)
  object["raw"].pack("C*")
end

Raw Strings are JSON Objects (the raw bytes are stored in an array for the key “raw”). The Ruby String can be created by this class method.
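The conversion itself can be sketched without loading the JSON library, by applying the same expression as the method body to a hand-written hash. The hash contents here are made up for illustration, and "json_class" is assumed to be the class-name key:

```ruby
# A hand-written "raw" JSON object for the two bytes of "hi".
raw_object = { 'json_class' => 'String', 'raw' => [104, 105] }

# Same expression as the body of String.json_create shown above.
restored = raw_object['raw'].pack('C*')  # => "hi"
restored_encoding = restored.encoding    # => #<Encoding:BINARY (ASCII-8BIT)>
```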

String.new(string = ''.encode(Encoding::ASCII_8BIT) , **options) → new_string

String.try_convert(object) → object, new_string, or nil

self << object → self

self <=> other → -1, 0, 1, or nil

self =~ object → integer or nil

self == object → true or false

===(p1)

-self → frozen_string

self[index] → new_string or nil

self[start, length] → new_string or nil

self[range] → new_string or nil

self[regexp, capture = 0] → new_string or nil

self[substring] → new_string or nil

self[index] = other_string → new_string

self[start, length] = other_string → new_string

self[range] = other_string → new_string

self[regexp, capture = 0] = other_string → new_string

self[substring] = other_string → new_string
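The assignment forms above can be sketched on one mutable string; each comment shows the content of s after the assignment:

```ruby
s = 'hello world'.dup
s[0] = 'J'           # s is "Jello world"
s[0, 5] = 'Goodbye'  # s is "Goodbye world"
s[/o+/] = '0'        # s is "G0dbye world" (first match only)
s['world'] = 'all'   # s is "G0dbye all"
s[-3..] = 'ALL'      # s is "G0dbye ALL"
```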

self * n → new_string

self % object → new_string

self + other_string → new_string

+string → new_string or self

append_as_bytes(*objects) → self

ascii_only? → true or false

b → new_string

byteindex(object, offset = 0) → integer or nil

byterindex(object, offset = self.bytesize) → integer or nil

bytes → array_of_bytes

bytesize → integer

byteslice(offset, length = 1) → string or nil

byteslice(range) → string or nil

bytesplice(offset, length, str) → self

bytesplice(offset, length, str, str_offset, str_length) → self

bytesplice(range, str) → self

bytesplice(range, str, str_range) → self

capitalize(mapping = :ascii) → new_string

capitalize!(mapping = :ascii) → self or nil

casecmp(other_string) → -1, 0, 1, or nil

casecmp?(other_string) → true, false, or nil

center(size, pad_string = ' ') → new_string

chars → array_of_characters

chomp(line_sep = $/) → new_string

chomp!(line_sep = $/) → self or nil

chop → new_string

chop! → self or nil

chr → string

clear → self

codepoints → array_of_integers

concat(*objects) → string

count(*selectors) → integer

crypt(salt_str) → new_string

-self → frozen_string

delete(*selectors) → new_string

delete!(*selectors) → self or nil

delete_prefix(prefix) → new_string

delete_prefix!(prefix) → self or nil

delete_suffix(suffix) → new_string

delete_suffix!(suffix) → self or nil

downcase(mapping = :ascii) → new_string

downcase!(mapping = :ascii) → self or nil

dump → new_string

each_byte {|byte| ... } → self

each_byte → enumerator

each_char {|char| ... } → self

each_char → enumerator

each_codepoint {|codepoint| ... } → self

each_codepoint → enumerator

each_grapheme_cluster {|grapheme_cluster| ... } → self

each_grapheme_cluster → enumerator

each_line(record_separator = $/, chomp: false) {|substring| ... } → self

each_line(record_separator = $/, chomp: false) → enumerator

empty? → true or false

encode(dst_encoding = Encoding.default_internal, **enc_opts) → string

encode(dst_encoding, src_encoding, **enc_opts) → string

encode!(dst_encoding = Encoding.default_internal, **enc_opts) → self

encode!(dst_encoding, src_encoding, **enc_opts) → self

encoding → encoding

end_with?(*strings) → true or false

eql?(object) → true or false