Prism::Source

Class

This represents a source of Ruby code that has been parsed. It is used in conjunction with locations to allow them to resolve line numbers and source ranges.

Attributes

source

Read

The source code that this source object represents.

start_line

Read

The line number where this source starts.

offsets

Read

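The list of byte offsets at which each line of the source code starts. Each newline begins a new entry, and `line_start` indexes directly into this list.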
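The parser normally fills in `offsets`, but it can help to see what such a list looks like. The following is a minimal sketch, assuming each entry is the byte offset at which a line starts (consistent with how `line_start` indexes into `offsets`). The helper `line_start_offsets` is hypothetical and not part of Prism.

```ruby
# Hypothetical helper, not part of Prism: collect the byte offset at which
# each line begins. The first line always starts at byte 0.
def line_start_offsets(source)
  offsets = [0]
  source.each_byte.with_index do |byte, index|
    # A line feed (0x0A) means the next byte starts a new line.
    offsets << index + 1 if byte == 0x0A
  end
  offsets
end

offsets = line_start_offsets("foo = 1\nbar = 2")
p offsets # => [0, 8]
```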

Class Methods

for(source, start_line = 1, offsets = [])

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 12
def self.for(source, start_line = 1, offsets = [])
  if source.ascii_only?
    ASCIISource.new(source, start_line, offsets)
  elsif source.encoding == Encoding::BINARY
    source.force_encoding(Encoding::UTF_8)

    if source.valid_encoding?
      new(source, start_line, offsets)
    else
      # This is an extremely niche use case where the file is marked as
      # binary, contains multi-byte characters, and those characters are not
      # valid UTF-8. In this case we'll mark it as binary and fall back to
      # treating everything as a single-byte character. This _may_ cause
      # problems when asking for code units, but it appears to be the
      # cleanest solution at the moment.
      source.force_encoding(Encoding::BINARY)
      ASCIISource.new(source, start_line, offsets)
    end
  else
    new(source, start_line, offsets)
  end
end

Create a new source object with the given source code. Use this method instead of `new`. It returns a specialized, more performant `ASCIISource` when the source contains no multibyte characters, and a plain `Source` otherwise. A source marked as binary that is not valid UTF-8 also falls back to `ASCIISource`, with every byte treated as a single character.
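As a rough illustration of the dispatch described above, the sketch below builds two sources by hand and checks which class comes back. The `offsets` arrays are written out manually for the example; in normal use the parser supplies them.

```ruby
require "prism"

# Offsets are supplied by hand here; each source is a single line starting at byte 0.
ascii = Prism::Source.for("puts 1\n", 1, [0])
multi = Prism::Source.for("puts 'é'\n", 1, [0])

p ascii.instance_of?(Prism::ASCIISource) # ASCII-only input takes the fast path
p multi.instance_of?(Prism::Source)      # multibyte input uses the general class
```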

new(source, start_line = 1, offsets = [])

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 45
def initialize(source, start_line = 1, offsets = [])
  @source = source
  @start_line = start_line # set after parsing is done
  @offsets = offsets # set after parsing is done
end

Create a new source object with the given source code.

Instance Methods

character_column(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 97
def character_column(byte_offset)
  character_offset(byte_offset) - character_offset(line_start(byte_offset))
end

Return the column number in characters for the given byte offset.

character_offset(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 92
def character_offset(byte_offset)
  (source.byteslice(0, byte_offset) or raise).length
end

Return the character offset for the given byte offset.
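Byte offsets and character offsets diverge as soon as the source contains multibyte characters. The sketch below, using a hand-written `offsets` list, shows `character_offset` and `character_column` on a two-line UTF-8 source.

```ruby
require "prism"

# "é" and "π" each occupy 2 bytes in UTF-8, so the first line is 7 bytes long.
code = "é = 1\nπ = 2\n"
source = Prism::Source.for(code, 1, [0, 7]) # line 2 starts at byte 7

byte_offset = 10 # the "=" on the second line
char_offset = source.character_offset(byte_offset)
char_column = source.character_column(byte_offset)

p char_offset # => 8 (characters before the "=")
p char_column # => 2 (characters since the start of line 2)
```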

code_units_cache(encoding)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 125
def code_units_cache(encoding)
  CodeUnitsCache.new(source, encoding)
end

Generate a cache that targets a specific encoding for calculating code unit offsets.

code_units_column(byte_offset, encoding)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 131
def code_units_column(byte_offset, encoding)
  code_units_offset(byte_offset, encoding) - code_units_offset(line_start(byte_offset), encoding)
end

Returns the column number of the given byte offset, counted in code units of the given encoding.

code_units_offset(byte_offset, encoding)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 113
def code_units_offset(byte_offset, encoding)
  byteslice = (source.byteslice(0, byte_offset) or raise).encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    byteslice.bytesize / 2
  else
    byteslice.length
  end
end

Returns the offset from the start of the file for the given byte offset, counted in code units of the given encoding.

This method is tested with UTF-8, UTF-16, and UTF-32. If there is the concept of code units that differs from the number of characters in other encodings, it is not captured here.

We purposefully replace invalid and undefined characters with replacement characters in this conversion. This happens for two reasons. First, it’s possible that the given byte offset will not occur on a character boundary. Second, it’s possible that the source code will contain a character that has no equivalent in the given encoding.
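To make the difference between encodings concrete, here is a small sketch that asks for the same byte offset in UTF-16 and UTF-32. The source and its `offsets` list are made up for the example.

```ruby
require "prism"

# The emoji is 4 bytes in UTF-8, a surrogate pair (2 code units) in UTF-16,
# and a single code unit in UTF-32.
code = "😀 = 1\n"
source = Prism::Source.for(code, 1, [0])

byte_offset = 4 # the first byte after the emoji
utf16 = source.code_units_offset(byte_offset, Encoding::UTF_16LE)
utf32 = source.code_units_offset(byte_offset, Encoding::UTF_32LE)

p utf16 # => 2
p utf32 # => 1

# The column variant subtracts the code units before the start of the line.
p source.code_units_column(5, Encoding::UTF_16LE) # => 3
```

Editors that speak UTF-16 positions are a typical consumer of offsets like these.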

column(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 87
def column(byte_offset)
  byte_offset - line_start(byte_offset)
end

Return the column number in bytes for the given byte offset, that is, the number of bytes between the start of its line and the offset.

encoding

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 53
def encoding
  source.encoding
end

Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.

find_line(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 139
def find_line(byte_offset)
  left = 0
  right = offsets.length - 1

  while left <= right
    mid = left + (right - left) / 2
    return mid if (offset = offsets[mid]) == byte_offset

    if offset < byte_offset
      left = mid + 1
    else
      right = mid - 1
    end
  end

  left - 1
end

Binary search through the offsets to find the zero-based index of the line containing the given byte offset. `line` adds `start_line` to this index to produce the actual line number.
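The search can be cross-checked against Ruby's built-in `Array#bsearch_index`. The helper below, `line_index`, is a hypothetical stand-alone function (not part of Prism) that should return the same index as the loop above for a sorted list of offsets.

```ruby
# Hypothetical helper, not part of Prism: the index of the last entry in
# `offsets` that is less than or equal to `byte_offset`.
def line_index(offsets, byte_offset)
  # Find the first line start strictly after byte_offset, if there is one.
  next_line = offsets.bsearch_index { |offset| offset > byte_offset }
  (next_line || offsets.length) - 1
end

offsets = [0, 7, 14]
p line_index(offsets, 0)   # => 0
p line_index(offsets, 6)   # => 0
p line_index(offsets, 7)   # => 1
p line_index(offsets, 100) # => 2
```

As with the original loop, an empty `offsets` list yields -1, so a source needs at least one entry before line lookups are meaningful.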

line(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 70
def line(byte_offset)
  start_line + find_line(byte_offset)
end

Return the line number for the given byte offset. This is `start_line` plus the zero-based line index found by a binary search through the offsets.
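The effect of `start_line` is easiest to see with a snippet that pretends to begin partway through a larger file. This is a minimal sketch with hand-written offsets.

```ruby
require "prism"

code = "a = 1\nb = 2\n"
# Pretend the snippet begins on line 10; its second line starts at byte 6.
source = Prism::Source.for(code, 10, [0, 6])

p source.line(0) # => 10
p source.line(6) # => 11
```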

line_end(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 82
def line_end(byte_offset)
  offsets[find_line(byte_offset) + 1] || source.bytesize
end

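Returns the byte offset of the end of the line corresponding to the given byte offset. This is the offset at which the next line starts or, on the last line, the byte size of the source.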

line_start(byte_offset)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 76
def line_start(byte_offset)
  offsets[find_line(byte_offset)]
end

Return the byte offset of the start of the line corresponding to the given byte offset.

lines

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 58
def lines
  source.lines
end

Returns the lines of the source code as an array of strings.

slice(byte_offset, length)

lib/prism/parse_result.rb

# File tmp/rubies/ruby-3.4.1/lib/prism/parse_result.rb, line 64
def slice(byte_offset, length)
  source.byteslice(byte_offset, length) or raise
end

Perform a byteslice on the source code using the given byte offset and byte length.
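Combined with `line_start` and `line_end`, `slice` can pull out the full text of the line containing a given byte offset. The following sketch assumes a two-line ASCII source with hand-written offsets.

```ruby
require "prism"

code = "foo = 1\nbar = 2\n"
source = Prism::Source.for(code, 1, [0, 8]) # line 2 starts at byte 8

byte_offset = 10 # somewhere inside "bar = 2"
start = source.line_start(byte_offset)
finish = source.line_end(byte_offset) # no later line start, so the byte size is used

line_text = source.slice(start, finish - start)
p line_text # => "bar = 2\n"
```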