CSV
In a Hurry?
If you are familiar with CSV data and have a particular task in mind, you may want to go directly to the:
Otherwise, read on here, about the API: classes, methods, and constants.
CSV Data
CSV (comma-separated values) data is a text representation of a table:
-
A row separator delimits table rows. A common row separator is the newline character "\n".
-
A column separator delimits fields in a row. A common column separator is the comma character ",".
This CSV String, with row separator "\n" and column separator ",", has three rows and two columns:
"foo,0\nbar,1\nbaz,2\n"
Despite the name CSV, a CSV representation can use different separators.
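As a small illustration of that point, the sketch below parses the same three-row table written with a tab and with a semicolon as the column separator. The variable names are only illustrative; col_sep is the option described later in this document.

```ruby
require 'csv'

# The same 3x2 table, rendered with two different column separators.
tab_separated = "foo\t0\nbar\t1\nbaz\t2\n"
semicolon_separated = "foo;0\nbar;1\nbaz;2\n"

from_tabs = CSV.parse(tab_separated, col_sep: "\t")
from_semicolons = CSV.parse(semicolon_separated, col_sep: ";")

p from_tabs                    # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
p from_tabs == from_semicolons # => true
```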
For more about tables, see the Wikipedia article “Table (information)”, especially its section “Simple table”.
Class CSV
Class
CSV provides methods for:
-
Parsing CSV data from a String object, a File (via its file path), or an IO object.
-
Generating CSV data to a String object.
To make CSV available:
require 'csv'
All examples here assume that this has been done.
Keeping It Simple
A CSV object has dozens of instance methods that offer fine-grained control of parsing and generating CSV data. For many needs, though, simpler approaches will do.
This section summarizes the singleton methods in CSV that allow you to parse and generate without explicitly creating CSV objects. For details, follow the links.
Simple Parsing
Parsing methods commonly return one of:
-
An Array of Arrays of Strings: the outer Array is the entire “table”, each inner Array is a row, and each String is a field.
-
A CSV::Table object. For details, see CSV with Headers.
Parsing a String
The input to be parsed can be a string:
string = "foo,0\nbar,1\nbaz,2\n"
Method CSV.parse
returns the entire CSV data:
CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Method CSV.parse_line
returns only the first row:
CSV.parse_line(string) # => ["foo", "0"]
CSV extends class String with instance method String#parse_csv, which also returns only the first row:
string.parse_csv # => ["foo", "0"]
Parsing Via a File Path
The input to be parsed can be in a file:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
Method CSV.read
returns the entire CSV data:
CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Method CSV.foreach
iterates, passing each row to the given block:
CSV.foreach(path) do |row|
  p row
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
Method CSV.table
returns the entire CSV data as a CSV::Table
object:
CSV.table(path) # => #<CSV::Table mode:col_or_row row_count:3>
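Note that CSV.table treats the first row as headers, which is why the table above reports the three-line file the way it does. A minimal sketch with an explicit header row follows; the temporary directory and the file name with_headers.csv are assumptions made for the example. By default CSV.table uses headers: true, converters: :numeric, and header_converters: :symbol.

```ruby
require 'csv'
require 'tmpdir'

table = Dir.mktmpdir do |dir|
  # Hypothetical file name, used only for this example.
  path = File.join(dir, 'with_headers.csv')
  File.write(path, "Name,Count\nfoo,0\nbar,1\n")
  CSV.table(path)
end

p table.headers  # => [:name, :count]
p table[:count]  # => [0, 1]
p table.size     # => 2
```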
Parsing from an Open IO Stream
The input to be parsed can be in an open IO stream:
Method CSV.read
returns the entire CSV data:
File.open(path) do |file|
  CSV.read(file)
end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
As does method CSV.parse:
File.open(path) do |file|
  CSV.parse(file)
end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Method CSV.parse_line returns only the first row:
File.open(path) do |file|
  CSV.parse_line(file)
end # => ["foo", "0"]
Method CSV.foreach iterates, passing each row to the given block:
File.open(path) do |file|
  CSV.foreach(file) do |row|
    p row
  end
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
Method CSV.table returns the entire CSV data as a CSV::Table object:
File.open(path) do |file|
  CSV.table(file)
end # => #<CSV::Table mode:col_or_row row_count:3>
Simple Generating
Method CSV.generate
returns a String; this example uses method CSV#<<
to append the rows that are to be generated:
output_string = CSV.generate do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
  csv << ['baz', 2]
end
output_string # => "foo,0\nbar,1\nbaz,2\n"
Method CSV.generate_line
returns a String containing the single row constructed from an Array:
CSV.generate_line(['foo', '0']) # => "foo,0\n"
CSV extends class Array with instance method Array#to_csv
, which forms an Array into a String:
['foo', '0'].to_csv # => "foo,0\n"
“Filtering” CSV
Method CSV.filter
provides a Unix-style filter for CSV data. The input data is processed to form the output data:
in_string = "foo,0\nbar,1\nbaz,2\n"
out_string = ''
CSV.filter(in_string, out_string) do |row|
  row[0] = row[0].upcase
  row[1] *= 4
end
out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
CSV Objects
There are three ways to create a CSV object:
-
Method CSV.new returns a new CSV object.
-
Method CSV.instance returns a new or cached CSV object.
-
Method CSV() also returns a new or cached CSV object.
Instance Methods
CSV has three groups of instance methods:
-
Its own internally defined instance methods.
-
Methods included by module Enumerable.
-
Methods delegated to class IO. See below.
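A short sketch of the first two groups, assuming a CSV object created over a String: CSV#shift is one of the class's own methods, and map comes from Enumerable. Both read from the same position in the data, so map sees only the rows that shift has not consumed.

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\nbaz,2\n")

# Own instance method: read a single row.
first_row = csv.shift

# Enumerable method: iterate over the rows that remain.
remaining_names = csv.map { |row| row[0] }

p first_row        # => ["foo", "0"]
p remaining_names  # => ["bar", "baz"]
```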
Delegated Methods
For convenience, a CSV object delegates many methods to class IO. (A few have wrapper “guard code” in CSV.) You may call:
-
IO#string
-
IO#truncate
Options
The default values for options are:
DEFAULT_OPTIONS = {
  # For both parsing and generating.
  col_sep: ",",
  row_sep: :auto,
  quote_char: '"',
  # For parsing.
  field_size_limit: nil,
  converters: nil,
  unconverted_fields: nil,
  headers: false,
  return_headers: false,
  header_converters: nil,
  skip_blanks: false,
  skip_lines: nil,
  liberal_parsing: false,
  nil_value: nil,
  empty_value: "",
  strip: false,
  # For generating.
  write_headers: nil,
  quote_empty: true,
  force_quotes: false,
  write_converters: nil,
  write_nil_value: nil,
  write_empty_value: "",
}
Options for Parsing
Options for parsing, described in detail below, include:
-
row_sep: Specifies the row separator; used to delimit rows.
-
col_sep: Specifies the column separator; used to delimit fields.
-
quote_char: Specifies the quote character; used to quote fields.
-
field_size_limit: Specifies the maximum field size + 1 allowed. Deprecated since 3.2.3. Use max_field_size instead.
-
max_field_size: Specifies the maximum field size allowed.
-
converters: Specifies the field converters to be used.
-
unconverted_fields: Specifies whether unconverted fields are to be available.
-
headers: Specifies whether data contains headers, or specifies the headers themselves.
-
return_headers: Specifies whether headers are to be returned.
-
header_converters: Specifies the header converters to be used.
-
skip_blanks: Specifies whether blank lines are to be ignored.
-
skip_lines: Specifies how comment lines are to be recognized.
-
strip: Specifies whether leading and trailing whitespace are to be stripped from fields. This must be compatible with col_sep; if it is not, then an ArgumentError exception will be raised.
-
liberal_parsing: Specifies whether CSV should attempt to parse non-compliant data.
-
nil_value: Specifies the object that is to be substituted for each null (no-text) field.
-
empty_value: Specifies the object that is to be substituted for each empty field.
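Several of these options are often combined in one call. The following is a minimal sketch, with made-up sample data, that uses headers, converters, skip_blanks, and skip_lines together; it is meant as an illustration of how the options interact, not as a recommended configuration.

```ruby
require 'csv'

# Sample data (invented for this example) with a blank line and a comment line.
data = <<~CSV_TEXT
  Name,Count
  foo,0

  # restocked
  bar,1
CSV_TEXT

table = CSV.parse(
  data,
  headers: true,         # first row supplies the headers
  converters: :integer,  # "0" and "1" become Integers
  skip_blanks: true,     # ignore the blank line
  skip_lines: /\A#/      # ignore lines that start with '#'
)
rows = table.map(&:to_h)

p rows # => [{"Name"=>"foo", "Count"=>0}, {"Name"=>"bar", "Count"=>1}]
```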
Option row_sep
Specifies the row separator, a String or the Symbol :auto
(see below), to be used for both parsing and generating.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto
When row_sep
is a String, that String becomes the row separator. The String
will be transcoded into the data’s Encoding
before use.
Using "\n":
row_sep = "\n"
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using | (pipe):
row_sep = '|'
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0|bar,1|baz,2|"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using -- (two hyphens):
row_sep = '--'
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0--bar,1--baz,2--"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
row_sep = ''
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0bar,1baz,2"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0bar", "1baz", "2"]]
When row_sep is the Symbol :auto (the default), generating uses "\n" as the row separator:
str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
Parsing, on the other hand, invokes auto-discovery of the row separator.
Auto-discovery reads ahead in the data looking for the next \r\n, \n, or \r sequence. The sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.
Example:
str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
The default $INPUT_RECORD_SEPARATOR ($/) is used if any of the following is true:
-
None of those sequences is found.
-
Data is ARGF, STDIN, STDOUT, or STDERR.
-
The stream is only available for output.
Discovery takes a little time. Set row_sep manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used, because the line-ending translation can cause problems with resetting the document position to where it was before the read-ahead.
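To make the auto-discovery behavior concrete, here is a small sketch that parses the same two rows written with "\r\n" and with "\r" line endings, once relying on :auto and once with row_sep given explicitly. The variable names are illustrative only.

```ruby
require 'csv'

crlf_data = "foo,0\r\nbar,1\r\n"
cr_data = "foo,0\rbar,1\r"

# Default row_sep (:auto): the separator is discovered from the data.
auto_crlf = CSV.parse(crlf_data)
auto_cr = CSV.parse(cr_data)

# Explicit row_sep: no discovery is needed.
explicit_crlf = CSV.parse(crlf_data, row_sep: "\r\n")

p auto_crlf      # => [["foo", "0"], ["bar", "1"]]
p auto_cr        # => [["foo", "0"], ["bar", "1"]]
p explicit_crlf  # => [["foo", "0"], ["bar", "1"]]
```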
Raises an exception if the given value is not String-convertible:
row_sep = BasicObject.new
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.generate(ary, row_sep: row_sep)
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.parse(str, row_sep: row_sep)
Option col_sep
Specifies the String field separator to be used for both parsing and generating. The String will be transcoded into the data’s Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma)
Using the default (comma):
str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using : (colon):
col_sep = ':'
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo:0\nbar:1\nbaz:2\n"
ary = CSV.parse(str, col_sep: col_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using :: (two colons):
col_sep = '::'
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo::0\nbar::1\nbaz::2\n"
ary = CSV.parse(str, col_sep: col_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
col_sep = ''
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str # => "foo0\nbar1\nbaz2\n"
Raises an exception if parsing with the empty String:
col_sep = ''
# Raises ArgumentError (:col_sep must be 1 or more characters: "")
CSV.parse("foo0\nbar1\nbaz2\n", col_sep: col_sep)
Raises an exception if the given value is not String-convertible:
col_sep = BasicObject.new
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.generate(line, col_sep: col_sep)
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.parse(str, col_sep: col_sep)
Option quote_char
Specifies the character (String of length 1) used to quote fields in both parsing and generating. This String will be transcoded into the data’s Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote)
This is useful for an application that incorrectly uses '
(single-quote) to quote fields, instead of the correct "
(double-quote).
Using the default (double quote):
str = CSV.generate do |csv|
  csv << ['foo', 0]
  csv << ["'bar'", 1]
  csv << ['"baz"', 2]
end
str # => "foo,0\n'bar',1\n\"\"\"baz\"\"\",2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]]
Using ' (single-quote):
quote_char = "'"
str = CSV.generate(quote_char: quote_char) do |csv|
  csv << ['foo', 0]
  csv << ["'bar'", 1]
  csv << ['"baz"', 2]
end
str # => "foo,0\n'''bar''',1\n\"baz\",2\n"
ary = CSV.parse(str, quote_char: quote_char)
ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]]
Raises an exception if the String length is greater than 1:
# Raises ArgumentError (:quote_char has to be nil or a single character String)
CSV.new('', quote_char: 'xx')
Raises an exception if the value is not a String:
# Raises ArgumentError (:quote_char has to be nil or a single character String)
CSV.new('', quote_char: :foo)
Option field_size_limit
Specifies the Integer field size limit.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:field_size_limit) # => nil
This is a maximum size CSV
will read ahead looking for the closing quote for a field. (In truth, it reads to the first line ending beyond this size.) If a quote cannot be found within the limit CSV
will raise a MalformedCSVError
, assuming the data is faulty. You can use this limit to prevent what are effectively DoS attacks on the parser. However, this limit can cause a legitimate parse to fail; therefore the default value is nil
(no limit).
For the examples in this section:
str = <<~EOT
"a","b"
"
2345
",""
EOT
str # => "\"a\",\"b\"\n\"\n2345\n\",\"\"\n"
Using the default nil:
ary = CSV.parse(str)
ary # => [["a", "b"], ["\n2345\n", ""]]
Using 50:
field_size_limit = 50
ary = CSV.parse(str, field_size_limit: field_size_limit)
ary # => [["a", "b"], ["\n2345\n", ""]]
Raises an exception if a field is too long:
big_str = "123456789\n" * 1024
# Raises CSV::MalformedCSVError (Field size exceeded in line 1.)
CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048)
Option converters
Specifies converters to be used in parsing fields. See Field Converters.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil
The value may be a field converter name (see Stored Converters):
str = '1,2,3'
# Without a converter
array = CSV.parse_line(str)
array # => ["1", "2", "3"]
# With built-in converter :integer
array = CSV.parse_line(str, converters: :integer)
array # => [1, 2, 3]
The value may be a converter list (see Converter Lists):
str = '1,3.14159'
# Without converters
array = CSV.parse_line(str)
array # => ["1", "3.14159"]
# With built-in converters
array = CSV.parse_line(str, converters: [:integer, :float])
array # => [1, 3.14159]
The value may be a Proc custom converter (see Custom Field Converters):
str = ' foo , bar , baz '
# Without a converter
array = CSV.parse_line(str)
array # => [" foo ", " bar ", " baz "]
# With a custom converter
array = CSV.parse_line(str, converters: proc {|field| field.strip })
array # => ["foo", "bar", "baz"]
See also Custom Field Converters.
Raises an exception if the converter is not a converter name or a Proc:
str = 'foo,0'
# Raises NoMethodError (undefined method `arity' for nil:NilClass)
CSV.parse(str, converters: :foo)
Option unconverted_fields
Specifies the boolean that determines whether unconverted field values are to be available.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:unconverted_fields) # => nil
The unconverted field values are those found in the source data, prior to any conversions performed via option converters
.
When option unconverted_fields
is true
, each returned row (Array or CSV::Row) has an added method, unconverted_fields
, that returns the unconverted field values:
str = <<-EOT
foo,0
bar,1
baz,2
EOT
# Without unconverted_fields
csv = CSV.parse(str, converters: :integer)
csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
csv.first.respond_to?(:unconverted_fields) # => false
# With unconverted_fields
csv = CSV.parse(str, converters: :integer, unconverted_fields: true)
csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
csv.first.respond_to?(:unconverted_fields) # => true
csv.first.unconverted_fields # => ["foo", "0"]
Option headers
Specifies a boolean, Symbol, Array, or String to be used to define column headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:headers) # => false
Without headers:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str)
csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
csv.headers # => nil
csv.shift # => ["Name", "Count"]
If set to true or the Symbol :first_row, the first row of the data is treated as a row of headers:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: true)
csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:2 col_sep:"," row_sep:"\n" quote_char:"\"" headers:["Name", "Count"]>
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"bar" "Count":"1">
If set to an Array, the Array elements are treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: ['Name', 'Count'])
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"bar" "Count":"1">
If set to a String str, method CSV::parse_line(str, options) is called with the current options, and the returned Array is treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: 'Name,Count')
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"bar" "Count":"1">
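The Array and String forms can also be compared through CSV.parse, which returns the whole table at once. The sketch below uses the same header names as above on a shortened, two-row sample; it shows that both forms yield the same headers and that every line of the data is kept as a data row.

```ruby
require 'csv'

data = "foo,0\nbar,1\n"

from_array = CSV.parse(data, headers: ['Name', 'Count'])
from_string = CSV.parse(data, headers: 'Name,Count')

array_rows = from_array.map(&:to_h)
string_rows = from_string.map(&:to_h)

p from_array.headers  # => ["Name", "Count"]
p array_rows          # => [{"Name"=>"foo", "Count"=>"0"}, {"Name"=>"bar", "Count"=>"1"}]
p array_rows == string_rows # => true
```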
Option return_headers
Specifies the boolean that determines whether method shift
returns or ignores the header row.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:return_headers) # => false
Examples:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
# Without return_headers first row is str.
csv = CSV.new(str, headers: true)
csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
# With return_headers first row is headers.
csv = CSV.new(str, headers: true, return_headers: true)
csv.shift # => #<CSV::Row "Name":"Name" "Count":"Count">
Option header_converters
Specifies converters to be used in parsing headers. See Header Converters.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil
Identical in functionality to option converters except that:
-
The converters apply only to the header row.
-
The built-in header converters are :downcase and :symbol.
This section assumes prior execution of:
str = <<-EOT
Name,Value
foo,0
bar,1
baz,2
EOT
# With no header converter
table = CSV.parse(str, headers: true)
table.headers # => ["Name", "Value"]
The value may be a header converter name (see Stored Converters):
table = CSV.parse(str, headers: true, header_converters: :downcase)
table.headers # => ["name", "value"]
The value may be a converter list (see Converter Lists):
header_converters = [:downcase, :symbol]
table = CSV.parse(str, headers: true, header_converters: header_converters)
table.headers # => [:name, :value]
The value may be a Proc custom converter (see Custom Header Converters):
upcase_converter = proc {|field| field.upcase }
table = CSV.parse(str, headers: true, header_converters: upcase_converter)
table.headers # => ["NAME", "VALUE"]
See also Custom Header Converters.
Option skip_blanks
Specifies a boolean that determines whether blank lines in the input will be ignored; a line that contains a column separator is not considered to be blank.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_blanks) # => false
See also option skip_lines.
For examples in this section:
str = <<-EOT
foo,0

bar,1
baz,2

,
EOT
Using the default, false:
ary = CSV.parse(str)
ary # => [["foo", "0"], [], ["bar", "1"], ["baz", "2"], [], [nil, nil]]
Using true:
ary = CSV.parse(str, skip_blanks: true)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Using a truthy value:
ary = CSV.parse(str, skip_blanks: :foo)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Option skip_lines
Specifies an object to use in identifying comment lines in the input that are to be ignored:
-
If a Regexp, ignores lines that match it.
-
If a String, converts it to a Regexp, ignores lines that match it.
-
If
nil
, no lines are considered to be comments.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_lines) # => nil
For examples in this section:
str = <<-EOT
# Comment
foo,0
bar,1
baz,2
# Another comment
EOT
str # => "# Comment\nfoo,0\nbar,1\nbaz,2\n# Another comment\n"
Using the default, nil:
ary = CSV.parse(str)
ary # => [["# Comment"], ["foo", "0"], ["bar", "1"], ["baz", "2"], ["# Another comment"]]
Using a Regexp:
ary = CSV.parse(str, skip_lines: /^#/)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using a String:
ary = CSV.parse(str, skip_lines: '#')
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Raises an exception if given an object that is not a Regexp, a String, or nil:
# Raises ArgumentError (:skip_lines has to respond to #match: 0)
CSV.parse(str, skip_lines: 0)
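One practical difference between the String and Regexp forms is worth a sketch: a String is matched anywhere in the line, so a data line that merely contains the String is skipped too, whereas an anchored Regexp skips only lines that begin with the marker. The sample data below is invented for the illustration.

```ruby
require 'csv'

# The second line is data, but it contains '#'.
data = "# a comment\nitem#1,0\nitem2,1\n"

# String form: skips every line that contains '#'.
with_string = CSV.parse(data, skip_lines: '#')

# Anchored Regexp form: skips only lines that start with '#'.
with_regexp = CSV.parse(data, skip_lines: /\A#/)

p with_string  # => [["item2", "1"]]
p with_regexp  # => [["item#1", "0"], ["item2", "1"]]
```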
Option strip
Specifies the boolean value that determines whether whitespace is stripped from each input field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:strip) # => false
With default value false:
ary = CSV.parse_line(' a , b ')
ary # => [" a ", " b "]
With value true:
ary = CSV.parse_line(' a , b ', strip: true)
ary # => ["a", "b"]
Option liberal_parsing
Specifies the boolean value that determines whether CSV
will attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing) # => false
For examples in this section:
str = 'is,this "three, or four",fields'
Without liberal_parsing:
# Raises CSV::MalformedCSVError (Illegal quoting in line 1.)
CSV.parse_line(str)
With liberal_parsing:
ary = CSV.parse_line(str, liberal_parsing: true)
ary # => ["is", "this \"three", " or four\"", "fields"]
Option nil_value
Specifies the object that is to be substituted for each null (no-text) field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:nil_value) # => nil
With the default, nil
:
CSV.parse_line('a,,b,,c') # => ["a", nil, "b", nil, "c"]
With a different object:
CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
Option empty_value
Specifies the object that is to be substituted for each field that has an empty String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:empty_value) # => "" (empty string)
With the default, ""
:
CSV.parse_line('a,"",b,"",c') # => ["a", "", "b", "", "c"]
With a different object:
CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
Options for Generating
Options for generating, described in detail below, include:
-
row_sep: Specifies the row separator; used to delimit rows.
-
col_sep: Specifies the column separator; used to delimit fields.
-
quote_char: Specifies the quote character; used to quote fields.
-
write_headers: Specifies whether headers are to be written.
-
force_quotes: Specifies whether each output field is to be quoted.
-
quote_empty: Specifies whether each empty output field is to be quoted.
-
write_converters: Specifies the field converters to be used in writing.
-
write_nil_value: Specifies the object that is to be substituted for each nil-valued field.
-
write_empty_value: Specifies the object that is to be substituted for each empty field.
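As with parsing, generating options can be combined in a single call. The following sketch, with illustrative header names, writes tab-separated output with a header row and then parses it back using the matching col_sep; it is an example of the options working together rather than a prescribed pattern.

```ruby
require 'csv'

output = CSV.generate(col_sep: "\t", headers: ['Name', 'Value'], write_headers: true) do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
end

# Round trip: parse the generated text with the same column separator.
round_trip = CSV.parse(output, col_sep: "\t", headers: true).map(&:to_h)

p output      # => "Name\tValue\nfoo\t0\nbar\t1\n"
p round_trip  # => [{"Name"=>"foo", "Value"=>"0"}, {"Name"=>"bar", "Value"=>"1"}]
```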
Options row_sep, col_sep, and quote_char apply to both parsing and generating; they are described in full under Options for Parsing.
Option write_headers
Specifies the boolean that determines whether a header row is included in the output; ignored if there are no headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_headers) # => nil
Without write_headers:
file_path = 't.csv'
CSV.open(file_path, 'w', :headers => ['Name', 'Value']) do |csv|
  csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
  csv.shift
end # => ["foo", "0"]
With write_headers:
CSV.open(file_path, 'w', :write_headers => true, :headers => ['Name', 'Value']) do |csv|
  csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
  csv.shift
end # => ["Name", "Value"]
Option force_quotes
Specifies the boolean that determines whether each output field is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:force_quotes) # => false
For examples in this section:
ary = ['foo', 0, nil]
Using the default, false:
str = CSV.generate_line(ary)
str # => "foo,0,\n"
Using true:
str = CSV.generate_line(ary, force_quotes: true)
str # => "\"foo\",\"0\",\"\"\n"
Option quote_empty
Specifies the boolean that determines whether an empty value is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_empty) # => true
With the default true
:
CSV.generate_line(['"', ""]) # => "\"\"\"\",\"\"\n"
With false
:
CSV.generate_line(['"', ""], quote_empty: false) # => "\"\"\"\",\n"
Option write_converters
Specifies converters to be used in generating fields. See Write Converters.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil
With no write converter:
str = CSV.generate_line(["\na\n", "\tb\t", " c "])
str # => "\"\na\n\",\tb\t, c \n"
With a write converter:
strip_converter = proc {|field| field.strip }
str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter)
str # => "a,b,c\n"
With two write converters (called in order):
upcase_converter = proc {|field| field.upcase }
downcase_converter = proc {|field| field.downcase }
write_converters = [upcase_converter, downcase_converter]
str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters)
str # => "a,b,c\n"
See also Write Converters.
Raises an exception if the converter returns a value that is neither nil nor String-convertible:
bad_converter = proc {|field| BasicObject.new }
# Raises NoMethodError (undefined method `is_a?' for #<BasicObject:>)
CSV.generate_line(['a', 'b', 'c'], write_converters: bad_converter)
Option write_nil_value
Specifies the object that is to be substituted for each nil
-valued field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_nil_value) # => nil
Without the option:
str = CSV.generate_line(['a', nil, 'c', nil])
str # => "a,,c,\n"
With the option:
str = CSV.generate_line(['a', nil, 'c', nil], write_nil_value: "x")
str # => "a,x,c,x\n"
Option write_empty_value
Specifies the object that is to be substituted for each field that has an empty String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_empty_value) # => ""
Without the option:
str = CSV.generate_line(['a', '', 'c', ''])
str # => "a,\"\",c,\"\"\n"
With the option:
str = CSV.generate_line(['a', '', 'c', ''], write_empty_value: "x")
str # => "a,x,c,x\n"
CSV with Headers
CSV lets you specify the column names of a CSV file, whether they are in the data or provided separately. If headers are specified, reading methods return an instance of CSV::Table, consisting of CSV::Row objects.
# Headers are part of data
data = CSV.parse(<<~ROWS, headers: true)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
  John,Management,5000
ROWS
data.class #=> CSV::Table
data.first #=> #<CSV::Row "Name":"Bob" "Department":"Engineering" "Salary":"1000">
data.first.to_h #=> {"Name"=>"Bob", "Department"=>"Engineering", "Salary"=>"1000"}
# Headers provided by developer
data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
data.first #=> #<CSV::Row name:"Bob" department:"Engineering" salary:"1000">
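Once data is in a CSV::Table, columns can be read by header name and rows can be searched with Enumerable methods. A brief sketch using the same sample data follows; the added converters: :integer option is a choice made for this example so that the Salary column can be summed.

```ruby
require 'csv'

data = CSV.parse(<<~ROWS, headers: true, converters: :integer)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
  John,Management,5000
ROWS

names = data['Name']               # a whole column, by header
total_salary = data['Salary'].sum  # Integers, thanks to the converter
sales_row = data.find { |row| row['Department'] == 'Sales' }

p names              # => ["Bob", "Jane", "John"]
p total_salary       # => 8000
p sales_row['Name']  # => "Jane"
```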
Converters
By default, each value (field or header) parsed by CSV is formed into a String. You can use a field converter or header converter to intercept and modify the parsed values:
-
See Field Converters.
-
See Header Converters.
Also by default, each value to be written during generation is written ‘as-is’. You can use a write converter to modify values before writing.
-
See Write Converters.
Specifying Converters
You can specify converters for parsing or generating in the options
argument to various CSV methods:
-
Option
converters
for converting parsed field values. -
Option
header_converters
for converting parsed header values. -
Option
write_converters
for converting values to be written (generated).
There are three forms for specifying converters:
-
A converter proc: executable code to be used for conversion.
-
A converter name: the name of a stored converter.
-
A converter list: an array of converter procs, converter names, and converter lists.
Converter Procs
This converter proc, strip_converter
, accepts a value field
and returns field.strip
:
strip_converter = proc {|field| field.strip }
In this call to CSV.parse, the keyword argument converters: strip_converter specifies that:
-
Proc strip_converter is to be called for each parsed field.
-
The converter’s return value is to replace the field value.
Example:
string = " foo , 0 \n bar , 1 \n baz , 2 \n" array = CSV.parse(string, converters: strip_converter) array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
A converter proc can receive a second argument, field_info
, that contains details about the field. This modified strip_converter
displays its arguments:
strip_converter = proc do |field, field_info|
  p [field, field_info]
  field.strip
end
string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: strip_converter)
array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Output:
[" foo ", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
[" 0 ", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
[" bar ", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
[" 1 ", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
[" baz ", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
[" 2 ", #<struct CSV::FieldInfo index=1, line=3, header=nil>]
Each CSV::FieldInfo
object shows:
-
The 0-based field index.
-
The 1-based line index.
-
The field header, if any.
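The header carried by field_info makes it possible to convert only selected columns. The following sketch assumes data with a 'Salary' column (a made-up header for illustration) and converts just that column to Integer, leaving the others as Strings.

```ruby
require 'csv'

# Convert a field only when it belongs to the 'Salary' column.
salary_converter = proc do |field, field_info|
  field_info.header == 'Salary' ? Integer(field) : field
end

string = "Name,Salary\nBob,1000\nJane,2000\n"
table = CSV.parse(string, headers: true, converters: salary_converter)

salaries = table['Salary'] # => [1000, 2000]
names = table['Name']      # => ["Bob", "Jane"]
```

Without headers, field_info.header is nil, so a converter like this one would leave every field unchanged; field_info.index could be used instead in that case.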
Stored Converters
A converter may be given a name and stored in a structure where the parsing methods can find it by name.
The storage structure for field converters is the Hash CSV::Converters
. It has several built-in converter procs:
-
:integer
: converts each String-embedded integer into a true Integer. -
:float
: converts each String-embedded float into a true Float. -
:date
: converts each String-embedded date into a true Date. -
:date_time
: converts each String-embedded date-time into a true DateTime
. This example creates a converter proc, then stores it:
strip_converter = proc {|field| field.strip } CSV::Converters[:strip] = strip_converter
Then the parsing method call can refer to the converter by its name, :strip
:
string = " foo , 0 \n bar , 1 \n baz , 2 \n" array = CSV.parse(string, converters: :strip) array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
The storage structure for header converters is the Hash CSV::HeaderConverters
, which works in the same way. It also has built-in converter procs:
-
:downcase
: Downcases each header. -
:symbol
: Converts each header to a Symbol.
There is no such storage structure for write converters.
In order for the parsing methods to access stored converters in non-main-Ractors, the storage structure must be made shareable first. Therefore, Ractor.make_shareable(CSV::Converters)
and Ractor.make_shareable(CSV::HeaderConverters)
must be called before the creation of Ractors that use the converters stored in these structures. (Since making the storage structures shareable involves freezing them, any custom converters that are to be used must be added first.)
Converter Lists
A converter list is an Array that may include any assortment of:
-
Converter procs.
-
Names of stored converters.
-
Nested converter lists.
Examples:
numeric_converters = [:integer, :float] date_converters = [:date, :date_time] [numeric_converters, strip_converter] [strip_converter, date_converters, :float]
Like a converter proc, a converter list may be named and stored in either CSV::Converters or CSV::HeaderConverters
:
CSV::Converters[:custom] = [strip_converter, date_converters, :float] CSV::HeaderConverters[:custom] = [:downcase, :symbol]
There are two built-in converter lists:
CSV::Converters[:numeric] # => [:integer, :float] CSV::Converters[:all] # => [:date_time, :numeric]
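A converter list can also be passed directly to a parsing method. This sketch uses a flat list that mixes a converter proc with the name of a built-in converter; the converters are tried in list order, and a field stops being converted once it is no longer a String.

```ruby
require 'csv'

strip_converter = proc {|field| field.strip }

# First strip whitespace, then try to convert to Integer.
string = " foo , 0 \n bar , 1 \n"
rows = CSV.parse(string, converters: [strip_converter, :integer])
rows # => [["foo", 0], ["bar", 1]]
```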
Field Converters
With no conversion, all parsed fields in all rows become Strings:
string = "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(string)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
When you specify a field converter, each parsed field is passed to the converter; its return value becomes the stored value for the field. A converter might, for example, convert an integer embedded in a String into a true Integer. (In fact, that’s what built-in field converter :integer
does.)
There are three ways to use field converters.
-
Using option converters with a parsing method:
ary = CSV.parse(string, converters: :integer)
ary # => [["foo", 0], ["bar", 1], ["baz", 2]]
-
Using option converters with a new CSV instance:
csv = CSV.new(string, converters: :integer) # Field converters in effect: csv.converters # => [:integer] csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
-
Using method
convert
to add a field converter to a CSV instance:csv = CSV.new(string) # Add a converter. csv.convert(:integer) csv.converters # => [:integer] csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
Installing a field converter does not affect already-read rows:
csv = CSV.new(string) csv.shift # => ["foo", "0"] # Add a converter. csv.convert(:integer) csv.converters # => [:integer] csv.read # => [["bar", 1], ["baz", 2]]
There are additional built-in converters, and custom converters are also supported.
Built-In Field Converters
The built-in field converters are in Hash CSV::Converters
:
-
Each key is a field converter name.
-
Each value is one of:
-
A Proc field converter.
-
An Array of field converter names.
-
Display:
CSV::Converters.each_pair do |name, value|
  if value.kind_of?(Proc)
    p [name, value.class]
  else
    p [name, value]
  end
end
Output:
[:integer, Proc]
[:float, Proc]
[:numeric, [:integer, :float]]
[:date, Proc]
[:date_time, Proc]
[:all, [:date_time, :numeric]]
Each of these converters transcodes values to UTF-8 before attempting conversion. If a value cannot be transcoded to UTF-8 the conversion will fail and the value will remain unconverted.
Converter :integer
converts each field that Integer() accepts:
data = '0,1,2,x' # Without the converter csv = CSV.parse_line(data) csv # => ["0", "1", "2", "x"] # With the converter csv = CSV.parse_line(data, converters: :integer) csv # => [0, 1, 2, "x"]
Converter :float
converts each field that Float() accepts:
data = '1.0,3.14159,x' # Without the converter csv = CSV.parse_line(data) csv # => ["1.0", "3.14159", "x"] # With the converter csv = CSV.parse_line(data, converters: :float) csv # => [1.0, 3.14159, "x"]
Converter :numeric converts with both :integer and :float.
Converter :date
converts each field that Date::parse
accepts:
data = '2001-02-03,x' # Without the converter csv = CSV.parse_line(data) csv # => ["2001-02-03", "x"] # With the converter csv = CSV.parse_line(data, converters: :date) csv # => [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, "x"]
Converter :date_time
converts each field that DateTime::parse
accepts:
data = '2020-05-07T14:59:00-05:00,x' # Without the converter csv = CSV.parse_line(data) csv # => ["2020-05-07T14:59:00-05:00", "x"] # With the converter csv = CSV.parse_line(data, converters: :date_time) csv # => [#<DateTime: 2020-05-07T14:59:00-05:00 ((2458977j,71940s,0n),-18000s,2299161j)>, "x"]
Converter :all converts with both :date_time and :numeric.
As seen above, method convert
adds converters to a CSV instance, and method converters
returns an Array of the converters in effect:
csv = CSV.new('0,1,2') csv.converters # => [] csv.convert(:integer) csv.converters # => [:integer] csv.convert(:date) csv.converters # => [:integer, :date]
Custom Field Converters
You can define a custom field converter:
strip_converter = proc {|field| field.strip } string = " foo , 0 \n bar , 1 \n baz , 2 \n" array = CSV.parse(string, converters: strip_converter) array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
You can register the converter in the CSV::Converters Hash, which allows you to refer to it by name:
CSV::Converters[:strip] = strip_converter string = " foo , 0 \n bar , 1 \n baz , 2 \n" array = CSV.parse(string, converters: :strip) array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
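A custom field converter does not have to return a String. As a further sketch (the name boolean_converter is hypothetical, not part of the library), this converter turns the Strings "true" and "false" into real booleans and passes every other field through unchanged:

```ruby
require 'csv'

boolean_converter = proc do |field|
  case field
  when 'true' then true
  when 'false' then false
  else field # leave anything else as it was parsed
  end
end

row = CSV.parse_line('true,false,maybe', converters: boolean_converter)
row # => [true, false, "maybe"]
```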
Header Converters
Header converters operate only on headers (and not on other rows).
There are three ways to use header converters; these examples use built-in header converter :downcase
, which downcases each parsed header.
-
Option header_converters with a singleton parsing method:
string = "Name,Count\nFoo,0\nBar,1\nBaz,2"
tbl = CSV.parse(string, headers: true, header_converters: :downcase)
tbl.class # => CSV::Table
tbl.headers # => ["name", "count"]
-
Option header_converters with a new CSV instance:
csv = CSV.new(string, headers: true, header_converters: :downcase)
# Header converters in effect:
csv.header_converters # => [:downcase]
tbl = csv.read
tbl.headers # => ["name", "count"]
-
Method header_convert adds a header converter to a CSV instance:
csv = CSV.new(string, headers: true)
# Add a header converter.
csv.header_convert(:downcase)
csv.header_converters # => [:downcase]
tbl = csv.read
tbl.headers # => ["name", "count"]
Built-In Header Converters
The built-in header converters are in Hash CSV::HeaderConverters
. The keys there are the names of the converters:
CSV::HeaderConverters.keys # => [:downcase, :symbol]
Converter :downcase
converts each header by downcasing it:
string = "Name,Count\nFoo,0\nBar,1\nBaz,2"
tbl = CSV.parse(string, headers: true, header_converters: :downcase)
tbl.class # => CSV::Table
tbl.headers # => ["name", "count"]
Converter :symbol
converts each header by making it into a Symbol:
string = "Name,Count\nFoo,0\nBar,1\nBaz,2"
tbl = CSV.parse(string, headers: true, header_converters: :symbol)
tbl.headers # => [:name, :count]
Details:
-
Strips leading and trailing whitespace.
-
Downcases the header.
-
Replaces embedded spaces with underscores.
-
Removes non-word characters.
-
Makes the string into a Symbol.
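The effect of these steps is easier to see with headers that contain spaces and punctuation. The header names in this sketch are invented for illustration:

```ruby
require 'csv'

string = "First Name,E-mail,Qty.\nAda,ada@example.com,10\n"
table = CSV.parse(string, headers: true, header_converters: :symbol)

headers = table.headers # => [:first_name, :email, :qty]

# Fields are then looked up by Symbol rather than by String.
first_name = table[0][:first_name] # => "Ada"
```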
Custom Header Converters
You can define a custom header converter:
upcase_converter = proc {|header| header.upcase } string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" table = CSV.parse(string, headers: true, header_converters: upcase_converter) table # => #<CSV::Table mode:col_or_row row_count:4> table.headers # => ["NAME", "VALUE"]
You can register the converter in the CSV::HeaderConverters Hash, which allows you to refer to it by name:
CSV::HeaderConverters[:upcase] = upcase_converter table = CSV.parse(string, headers: true, header_converters: :upcase) table # => #<CSV::Table mode:col_or_row row_count:4> table.headers # => ["NAME", "VALUE"]
Write Converters
When you specify a write converter for generating CSV, each field to be written is passed to the converter; its return value becomes the new value for the field. A converter might, for example, strip whitespace from a field.
Using no write converter (all fields unmodified):
output_string = CSV.generate do |csv| csv << [' foo ', 0] csv << [' bar ', 1] csv << [' baz ', 2] end output_string # => " foo ,0\n bar ,1\n baz ,2\n"
Using option write_converters
with two custom write converters:
strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field } upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field } write_converters = [strip_converter, upcase_converter] output_string = CSV.generate(write_converters: write_converters) do |csv| csv << [' foo ', 0] csv << [' bar ', 1] csv << [' baz ', 2] end output_string # => "FOO,0\nBAR,1\nBAZ,2\n"
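Because fields to be written need not be Strings, a write converter can also be used for formatting. In this sketch the hypothetical money_converter writes every Float with two decimal places and leaves other fields alone:

```ruby
require 'csv'

# Format Floats with two decimals; pass everything else through.
money_converter = proc do |field|
  field.is_a?(Float) ? format('%.2f', field) : field
end

line = CSV.generate_line(['widget', 1.5, 2], write_converters: money_converter)
line # => "widget,1.50,2\n"
```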
Character Encodings (M17n or Multilingualization)
This new CSV
parser is m17n savvy. The parser works in the Encoding
of the IO
or String
object being read from or written to. Your data is never transcoded (unless you ask Ruby to transcode it for you) and will literally be parsed in the Encoding
it is in. Thus CSV
will return Arrays or Rows of Strings in the Encoding
of your data. This is accomplished by transcoding the parser itself into your Encoding
.
Some transcoding must take place, of course, to accomplish this multiencoding support. For example, :col_sep
, :row_sep
, and :quote_char
must be transcoded to match your data. Hopefully this makes the entire process feel transparent, since CSV’s defaults should just magically work for your data. However, you can set these values manually in the target Encoding
to avoid the translation.
It’s also important to note that while all of CSV’s core parser is now Encoding
agnostic, some features are not. For example, the built-in converters will try to transcode data to UTF-8 before making conversions. Again, you can provide custom converters that are aware of your Encodings to avoid this translation. It’s just too hard for me to support native conversions in all of Ruby’s Encodings.
Anyway, the practical side of this is simple: make sure IO
and String
objects passed into CSV
have the proper Encoding
set and everything should just work. CSV
methods that allow you to open IO
objects (CSV::foreach()
, CSV::open()
, CSV::read()
, and CSV::readlines()
) do allow you to specify the Encoding
.
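As an illustration of the :encoding option, the sketch below writes a small ISO-8859-1 file and reads it back with CSV.read, asking for the data to be transcoded to UTF-8 as it is read. The file name and contents are made up for the example, and a temporary directory is used so that nothing is left behind.

```ruby
require 'csv'
require 'tmpdir'

rows = Dir.mktmpdir do |dir|
  path = File.join(dir, 'latin1.csv')
  # "\xE9" is the ISO-8859-1 byte for "é"; write the raw bytes.
  File.binwrite(path, "caf\xE9,1\n".b)
  # External encoding ISO-8859-1, transcoded to UTF-8 while reading.
  CSV.read(path, encoding: 'ISO-8859-1:UTF-8')
end

rows                # => [["café", "1"]]
rows[0][0].encoding # => #<Encoding:UTF-8>
```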
One minor exception comes when generating CSV
into a String
with an Encoding
that is not ASCII compatible. There’s no existing data for CSV
to use to prepare itself and thus you will probably need to manually specify the desired Encoding
for most of those cases. It will try to guess using the fields in a row of output though, when using CSV::generate_line()
or Array#to_csv().
I try to point out any other Encoding
issues in the documentation of methods as they come up.
This has been tested to the best of my ability with all non-“dummy” Encodings Ruby ships with. However, it is brave new code and may have some bugs. Please feel free to report any issues you find with it.
A FieldInfo
Struct
contains details about a field’s position in the data source it was read from. CSV
will pass this Struct
to some blocks that make decisions based on field structure. See CSV.convert_fields()
for an example.
index
-
The zero-based index of the field in its row.
line
-
The line of the data source this row is from.
header
-
The header for the column, when available.
quoted?
-
True or false, whether the original value is quoted or not.
ConverterEncoding
The encoding used by all converters.
Converters
A Hash containing the names and Procs for the built-in field converters. See Built-In Field Converters.
This Hash is intentionally left unfrozen, and may be extended with custom field converters. See Custom Field Converters.
HeaderConverters
A Hash containing the names and Procs for the built-in header converters. See Built-In Header Converters.
This Hash is intentionally left unfrozen, and may be extended with custom header converters. See Custom Header Converters.
DEFAULT_OPTIONS
Default values for method options.
VERSION
The version of the installed library.
csv.encoding -> encoding
Returns the encoding used for parsing and generating; see Character Encodings (M17n or Multilingualization):
CSV.new('').encoding # => #<Encoding:UTF-8>
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1201
def filter(input=nil, output=nil, **options)
# parse options for input, output, or both
in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
options.each do |key, value|
case key.to_s
when /\Ain(?:put)?_(.+)\Z/
in_options[$1.to_sym] = value
when /\Aout(?:put)?_(.+)\Z/
out_options[$1.to_sym] = value
else
in_options[key] = value
out_options[key] = value
end
end
# build input and output wrappers
input = new(input || ARGF, **in_options)
output = new(output || $stdout, **out_options)
# process headers
need_manual_header_output =
(in_options[:headers] and
out_options[:headers] == true and
out_options[:write_headers])
if need_manual_header_output
first_row = input.shift
if first_row
if first_row.is_a?(Row)
headers = first_row.headers
yield headers
output << headers
end
yield first_row
output << first_row
end
end
# read, yield, write
input.each do |row|
yield row
output << row
end
end
-
Parses CSV from a source (String, IO stream, or
ARGF
). -
Calls the given block with each parsed row:
-
Without headers, each row is an Array.
-
With headers, each row is a
CSV::Row
.
-
-
Generates CSV to an output (String, IO stream, or STDOUT).
-
Returns the parsed source:
-
Without headers, an Array of Arrays.
-
With headers, a
CSV::Table
.
-
When in_string_or_io
is given, but not out_string_or_io
, parses from the given in_string_or_io
and generates to STDOUT.
String input without headers:
in_string = "foo,0\nbar,1\nbaz,2" CSV.filter(in_string) do |row| row[0].upcase! row[1] = - row[1].to_i end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
Output (to STDOUT):
FOO,0 BAR,-1 BAZ,-2
String input with headers:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2" CSV.filter(in_string, headers: true) do |row| row[0].upcase! row[1] = - row[1].to_i end # => #<CSV::Table mode:col_or_row row_count:4>
Output (to STDOUT):
Name,Value FOO,0 BAR,-1 BAZ,-2
IO stream input without headers:
File.write('t.csv', "foo,0\nbar,1\nbaz,2") File.open('t.csv') do |in_io| CSV.filter(in_io) do |row| row[0].upcase! row[1] = - row[1].to_i end end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
Output (to STDOUT):
FOO,0 BAR,-1 BAZ,-2
IO stream input with headers:
File.write('t.csv', "Name,Value\nfoo,0\nbar,1\nbaz,2") File.open('t.csv') do |in_io| CSV.filter(in_io, headers: true) do |row| row[0].upcase! row[1] = - row[1].to_i end end # => #<CSV::Table mode:col_or_row row_count:4>
Output (to STDOUT):
Name,Value FOO,0 BAR,-1 BAZ,-2
When both in_string_or_io
and out_string_or_io
are given, parses from in_string_or_io
and generates to out_string_or_io
.
String output without headers:
in_string = "foo,0\nbar,1\nbaz,2" out_string = '' CSV.filter(in_string, out_string) do |row| row[0].upcase! row[1] = - row[1].to_i end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]] out_string # => "FOO,0\nBAR,-1\nBAZ,-2\n"
String output with headers:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2" out_string = '' CSV.filter(in_string, out_string, headers: true) do |row| row[0].upcase! row[1] = - row[1].to_i end # => #<CSV::Table mode:col_or_row row_count:4> out_string # => "Name,Value\nFOO,0\nBAR,-1\nBAZ,-2\n"
IO stream output without headers:
in_string = "foo,0\nbar,1\nbaz,2" File.open('t.csv', 'w') do |out_io| CSV.filter(in_string, out_io) do |row| row[0].upcase! row[1] = - row[1].to_i end end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]] File.read('t.csv') # => "FOO,0\nBAR,-1\nBAZ,-2\n"
IO stream output with headers:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2" File.open('t.csv', 'w') do |out_io| CSV.filter(in_string, out_io, headers: true) do |row| row[0].upcase! row[1] = - row[1].to_i end end # => #<CSV::Table mode:col_or_row row_count:4> File.read('t.csv') # => "Name,Value\nFOO,0\nBAR,-1\nBAZ,-2\n"
When neither in_string_or_io nor out_string_or_io is given, parses from ARGF and generates to STDOUT.
Without headers:
# Put Ruby code into a file.
ruby = <<-EOT
  require 'csv'
  CSV.filter do |row|
    row[0].upcase!
    row[1] = - row[1].to_i
  end
EOT
File.write('t.rb', ruby)
# Put some CSV into a file.
File.write('t.csv', "foo,0\nbar,1\nbaz,2")
# Run the Ruby code with CSV filename as argument.
system(Gem.ruby, "t.rb", "t.csv")
Output (to STDOUT):
FOO,0 BAR,-1 BAZ,-2
With headers:
# Put Ruby code into a file.
ruby = <<-EOT
  require 'csv'
  CSV.filter(headers: true) do |row|
    row[0].upcase!
    row[1] = - row[1].to_i
  end
EOT
File.write('t.rb', ruby)
# Put some CSV into a file.
File.write('t.csv', "Name,Value\nfoo,0\nbar,1\nbaz,2")
# Run the Ruby code with CSV filename as argument.
system(Gem.ruby, "t.rb", "t.csv")
Output (to STDOUT):
Name,Value FOO,0 BAR,-1 BAZ,-2
Arguments:
-
Argument
in_string_or_io
must be a String or an IO stream. -
Argument
out_string_or_io
must be a String or an IO stream. -
Arguments
**options
must be keyword options. See Options for Parsing.
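As the source above shows, an option whose name begins with in_ (or input_) is applied only to the input, and one beginning with out_ (or output_) only to the output; any other option is applied to both. A minimal sketch, assuming semicolon-separated input that should be rewritten as tab-separated output:

```ruby
require 'csv'

in_string = "foo;0\nbar;1\n"
out_string = +'' # unfrozen String that receives the generated CSV

# in_col_sep applies to parsing only; out_col_sep to generating only.
CSV.filter(in_string, out_string, in_col_sep: ';', out_col_sep: "\t") do |row|
  row[0] = row[0].upcase
end

out_string # => "FOO\t0\nBAR\t1\n"
```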
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1331
def foreach(path, mode="r", **options, &block)
return to_enum(__method__, path, mode, **options) unless block_given?
open(path, mode, **options) do |csv|
csv.each(&block)
end
end
Calls the block with each row read from source path_or_io
.
Path input without headers:
string = "foo,0\nbar,1\nbaz,2\n" in_path = 't.csv' File.write(in_path, string) CSV.foreach(in_path) {|row| p row }
Output:
["foo", "0"] ["bar", "1"] ["baz", "2"]
Path input with headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" in_path = 't.csv' File.write(in_path, string) CSV.foreach(in_path, headers: true) {|row| p row }
Output:
<CSV::Row "Name":"foo" "Value":"0"> <CSV::Row "Name":"bar" "Value":"1"> <CSV::Row "Name":"baz" "Value":"2">
IO stream input without headers:
string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) File.open('t.csv') do |in_io| CSV.foreach(in_io) {|row| p row } end
Output:
["foo", "0"] ["bar", "1"] ["baz", "2"]
IO stream input with headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) File.open('t.csv') do |in_io| CSV.foreach(in_io, headers: true) {|row| p row } end
Output:
<CSV::Row "Name":"foo" "Value":"0"> <CSV::Row "Name":"bar" "Value":"1"> <CSV::Row "Name":"baz" "Value":"2">
With no block given, returns an Enumerator:
string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.foreach(path) # => #<Enumerator: CSV:foreach("t.csv", "r")>
Arguments:
-
Argument path_or_io must be a file path or an IO stream.
-
Argument mode, if given, must be a File mode. See Open Mode.
-
Arguments **options must be keyword options. See Options for Parsing.
-
This method optionally accepts an additional :encoding option that you can use to specify the Encoding of the data read from path or io. You must provide this unless your data is in the encoding given by Encoding::default_external. Parsing will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: 'UTF-32BE:UTF-8' would read UTF-32BE data from the file but transcode it to UTF-8 before parsing.
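The Enumerator returned when no block is given can be chained with the usual Enumerable methods, which avoids collecting rows by hand. The sketch below (file name and data invented for the example) selects the rows whose Value is positive and returns their names:

```ruby
require 'csv'
require 'tmpdir'

names = Dir.mktmpdir do |dir|
  path = File.join(dir, 't.csv')
  File.write(path, "Name,Value\nfoo,0\nbar,1\nbaz,2\n")

  # No block given: foreach returns an Enumerator over the rows.
  CSV.foreach(path, headers: true, converters: :integer)
     .select {|row| row['Value'] > 0 }
     .map {|row| row['Name'] }
end

names # => ["bar", "baz"]
```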
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1397
def generate(str=nil, **options)
encoding = options[:encoding]
# add a default empty String, if none was given
if str
str = StringIO.new(str)
str.seek(0, IO::SEEK_END)
str.set_encoding(encoding) if encoding
else
str = +""
str.force_encoding(encoding) if encoding
end
csv = new(str, **options) # wrap
yield csv # yield for appending
csv.string # return final String
end
-
Argument
csv_string
, if given, must be a String object; defaults to a new empty String. -
Arguments
options
, if given, should be generating options. See Options for Generating.
Creates a new CSV object via CSV.new(csv_string, **options)
; calls the block with the CSV object, which the block may modify; returns the String generated from the CSV object.
Note that a passed String is modified by this method. Pass csv_string
.dup if the String must be preserved.
This method has one additional option: :encoding, which sets the base Encoding for the output if no csv_string is specified. CSV needs this hint if you plan to output non-ASCII compatible data.
Add lines:
input_string = "foo,0\nbar,1\nbaz,2\n" output_string = CSV.generate(input_string) do |csv| csv << ['bat', 3] csv << ['bam', 4] end output_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n" input_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n" output_string.equal?(input_string) # => true # Same string, modified
Add lines into new string, preserving old string:
input_string = "foo,0\nbar,1\nbaz,2\n" output_string = CSV.generate(input_string.dup) do |csv| csv << ['bat', 3] csv << ['bam', 4] end output_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n" input_string # => "foo,0\nbar,1\nbaz,2\n" output_string.equal?(input_string) # => false # Different strings
Create lines from nothing:
output_string = CSV.generate do |csv| csv << ['foo', 0] csv << ['bar', 1] csv << ['baz', 2] end output_string # => "foo,0\nbar,1\nbaz,2\n"
Raises an exception if csv_string
is not a String object:
# Raises TypeError (no implicit conversion of Integer into String) CSV.generate(0)
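Generating options are passed through to the new CSV object, so a header row and a different column separator can be requested in the same call. A small sketch, assuming the headers 'Name' and 'Value':

```ruby
require 'csv'

output = CSV.generate(headers: ['Name', 'Value'], write_headers: true, col_sep: ';') do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
end

output # => "Name;Value\nfoo;0\nbar;1\n"
```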
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1445
def generate_line(row, **options)
options = {row_sep: InputRecordSeparator.value}.merge(options)
str = +""
if options[:encoding]
str.force_encoding(options[:encoding])
else
fallback_encoding = nil
output_encoding = nil
row.each do |field|
next unless field.is_a?(String)
fallback_encoding ||= field.encoding
next if field.ascii_only?
output_encoding = field.encoding
break
end
output_encoding ||= fallback_encoding
if output_encoding
str.force_encoding(output_encoding)
end
end
(new(str, **options) << row).string
end
Returns the String created by generating CSV from ary
using the specified options
.
Argument ary
must be an Array.
Special options:
-
Option :row_sep defaults to "\n" on Ruby 3.0 or later and $INPUT_RECORD_SEPARATOR ($/) otherwise:
$INPUT_RECORD_SEPARATOR # => "\n"
-
This method accepts an additional option, :encoding, which sets the base Encoding for the output. This method will try to guess your Encoding from the first non-nil field in row, if possible, but you may need to use this parameter as a backup plan.
For other options
, see Options for Generating.
Returns the String generated from an Array:
CSV.generate_line(['foo', '0']) # => "foo,0\n"
Raises an exception if ary
is not an Array:
# Raises NoMethodError (undefined method `find' for :foo:Symbol) CSV.generate_line(:foo)
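Fields that contain the column separator or the quote character are quoted automatically, with embedded quote characters doubled. This sketch uses a semicolon column separator and an explicit "\r\n" row separator to make that visible:

```ruby
require 'csv'

row = ['foo', 'a;b', 'say "hi"']
line = CSV.generate_line(row, col_sep: ';', row_sep: "\r\n")

# 'a;b' is quoted because it contains the column separator;
# 'say "hi"' is quoted and its embedded quotes are doubled.
line # => "foo;\"a;b\";\"say \"\"hi\"\"\"\r\n"
```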
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1500
def generate_lines(rows, **options)
self.generate(**options) do |csv|
rows.each do |row|
csv << row
end
end
end
Returns the String created by generating CSV from rows using the specified options.
Argument rows must be an Array of rows. Each row must be an Array of Strings or a CSV::Row.
Special options:
-
Option :row_sep defaults to "\n" on Ruby 3.0 or later and $INPUT_RECORD_SEPARATOR ($/) otherwise:
$INPUT_RECORD_SEPARATOR # => "\n"
-
This method accepts an additional option, :encoding, which sets the base Encoding for the output. This method will try to guess your Encoding from the first non-nil field in row, if possible, but you may need to use this parameter as a backup plan.
For other options
, see Options for Generating.
Returns the String generated from an Array of rows:
CSV.generate_lines([['foo', '0'], ['bar', '1'], ['baz', '2']]) # => "foo,0\nbar,1\nbaz,2\n"
Raises an exception if rows is not an Array:
# Raises NoMethodError (undefined method `find' for :foo:Symbol)
CSV.generate_lines(:foo)
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1005
def instance(data = $stdout, **options)
# create a _signature_ for this method call, data object and options
sig = [data.object_id] +
options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })
# fetch or create the instance for this signature
@@instances ||= Hash.new
instance = (@@instances[sig] ||= new(data, **options))
if block_given?
yield instance # run block, if given, returning result
else
instance # or return the instance
end
end
Creates or retrieves cached CSV objects. For arguments and options, see CSV.new
.
This API is not Ractor-safe.
With no block given, returns a CSV object.
The first call to instance
creates and caches a CSV object:
s0 = 's0' csv0 = CSV.instance(s0) csv0.class # => CSV
Subsequent calls to instance
with that same string
or io
retrieve that same cached object:
csv1 = CSV.instance(s0) csv1.class # => CSV csv1.equal?(csv0) # => true # Same CSV object
A subsequent call to instance
with a different string
or io
creates and caches a different CSV object.
s1 = 's1' csv2 = CSV.instance(s1) csv2.equal?(csv0) # => false # Different CSV object
All the cached objects remain available:
csv3 = CSV.instance(s0)
csv3.equal?(csv0) # => true # Same CSV object
csv4 = CSV.instance(s1)
csv4.equal?(csv2) # => true # Same CSV object
When a block is given, calls the block with the created or retrieved CSV object; returns the block’s return value:
CSV.instance(s0) {|csv| :foo } # => :foo
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1904
def initialize(data,
col_sep: ",",
row_sep: :auto,
quote_char: '"',
field_size_limit: nil,
max_field_size: nil,
converters: nil,
unconverted_fields: nil,
headers: false,
return_headers: false,
write_headers: nil,
header_converters: nil,
skip_blanks: false,
force_quotes: false,
skip_lines: nil,
liberal_parsing: false,
internal_encoding: nil,
external_encoding: nil,
encoding: nil,
nil_value: nil,
empty_value: "",
strip: false,
quote_empty: true,
write_converters: nil,
write_nil_value: nil,
write_empty_value: "")
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
if data.is_a?(String)
if encoding
if encoding.is_a?(String)
data_external_encoding, data_internal_encoding = encoding.split(":", 2)
if data_internal_encoding
data = data.encode(data_internal_encoding, data_external_encoding)
else
data = data.dup.force_encoding(data_external_encoding)
end
else
data = data.dup.force_encoding(encoding)
end
end
@io = StringIO.new(data)
else
@io = data
end
@encoding = determine_encoding(encoding, internal_encoding)
@base_fields_converter_options = {
nil_value: nil_value,
empty_value: empty_value,
}
@write_fields_converter_options = {
nil_value: write_nil_value,
empty_value: write_empty_value,
}
@initial_converters = converters
@initial_header_converters = header_converters
@initial_write_converters = write_converters
if max_field_size.nil? and field_size_limit
max_field_size = field_size_limit - 1
end
@parser_options = {
column_separator: col_sep,
row_separator: row_sep,
quote_character: quote_char,
max_field_size: max_field_size,
unconverted_fields: unconverted_fields,
headers: headers,
return_headers: return_headers,
skip_blanks: skip_blanks,
skip_lines: skip_lines,
liberal_parsing: liberal_parsing,
encoding: @encoding,
nil_value: nil_value,
empty_value: empty_value,
strip: strip,
}
@parser = nil
@parser_enumerator = nil
@eof_error = nil
@writer_options = {
encoding: @encoding,
force_encoding: (not encoding.nil?),
force_quotes: force_quotes,
headers: headers,
write_headers: write_headers,
column_separator: col_sep,
row_separator: row_sep,
quote_character: quote_char,
quote_empty: quote_empty,
}
@writer = nil
writer if @writer_options[:write_headers]
end
Returns the new CSV object created using string
or io
and the specified options
.
-
Argument string should be a String object; it will be put into a new StringIO object positioned at the beginning.
-
Argument io should be an IO object that is:
-
Open for reading; on return, the IO object will be closed.
-
Positioned at the beginning. To position at the end, for appending, use method CSV.generate. For any other positioning, pass a preset StringIO object instead.
-
Argument options: See Options for Parsing and Options for Generating. For performance reasons, the options cannot be overridden in a CSV object, so those specified here will endure.
In addition to the CSV instance methods, several IO methods are delegated. See Delegated Methods.
Create a CSV object from a String object:
csv = CSV.new('foo,0') csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Create a CSV object from a File object:
File.write('t.csv', 'foo,0') csv = CSV.new(File.open('t.csv')) csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Raises an exception if the argument is nil
:
# Raises ArgumentError (Cannot parse nil as CSV): CSV.new(nil)
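The note above about passing a preset StringIO object can be sketched as follows. The data here is hypothetical: a one-line preamble that is not CSV, followed by the CSV rows. The preamble is consumed from the StringIO before the CSV object is created, on the assumption that parsing then starts at the current position.

```ruby
require 'csv'
require 'stringio'

io = StringIO.new("generated by some tool\nfoo,0\nbar,1\n")
io.gets # consume the preamble line, leaving io positioned at the CSV data

csv = CSV.new(io)
rows = csv.read
rows # => [["foo", "0"], ["bar", "1"]]
```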
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1580
def open(filename, mode="r", **options)
# wrap a File opened with the remaining +args+ with no newline
# decorator
file_opts = options.dup
unless file_opts.key?(:newline)
file_opts[:universal_newline] ||= false
end
options.delete(:invalid)
options.delete(:undef)
options.delete(:replace)
options.delete_if {|k, _| /newline\z/.match?(k)}
begin
f = File.open(filename, mode, **file_opts)
rescue ArgumentError => e
raise unless /needs binmode/.match?(e.message) and mode == "r"
mode = "rb"
file_opts = {encoding: Encoding.default_external}.merge(file_opts)
retry
end
begin
csv = new(f, **options)
rescue Exception
f.close
raise
end
# handle blocks like Ruby's open(), not like the CSV library
if block_given?
begin
yield csv
ensure
csv.close
end
else
csv
end
end
Possible options elements, in keyword form:
:invalid => nil # raise error on invalid byte sequence (default)
:invalid => :replace # replace invalid byte sequence
:undef => :replace # replace undefined conversion
:replace => string # replacement string ("?" or "\uFFFD" if not specified)
- Argument path, if given, must be the path to a file.
- Argument io should be an IO object that is:
  - Open for reading; on return, the IO object will be closed.
  - Positioned at the beginning. To position at the end, for appending, use method CSV.generate. For any other positioning, pass a preset StringIO object instead.
- Argument mode, if given, must be a File mode. See Open Mode.
- Arguments **options must be keyword options. See Options for Generating.
- This method optionally accepts an additional :encoding option that you can use to specify the Encoding of the data read from path or io. You must provide this unless your data is in the encoding given by Encoding::default_external. Parsing will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: 'UTF-32BE:UTF-8' would read UTF-32BE data from the file but transcode it to UTF-8 before parsing.
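The transcoding form of the :encoding option can be sketched as follows. This is an illustration under stated assumptions: the helper name read_latin1_as_utf8 and the scratch file latin1.csv are hypothetical, and ISO-8859-1 is used here instead of UTF-32BE only to keep the sample bytes short.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper (not part of CSV): write ISO-8859-1 bytes to a scratch
# file, then read them through CSV.open, transcoding to UTF-8 while reading.
def read_latin1_as_utf8
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'latin1.csv')
    File.binwrite(path, "caf\u00E9,1\n".encode('ISO-8859-1'))
    CSV.open(path, encoding: 'ISO-8859-1:UTF-8') {|csv| csv.read }
  end
end

rows = read_latin1_as_utf8
rows[0][0] == "caf\u00E9" # => true
rows[0][0].encoding       # => #<Encoding:UTF-8>
```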
These examples assume prior execution of:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
With no block given, returns a new CSV object.
Create a CSV object using a file path:
csv = CSV.open(path)
csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Create a CSV object using an open File:
csv = CSV.open(File.open(path))
csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
With a block given, calls the block with the created CSV object; returns the block’s return value:
Using a file path:
csv = CSV.open(path) {|csv| p csv}
csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Output:
#<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Using an open File:
csv = CSV.open(File.open(path)) {|csv| p csv}
csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Output:
#<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
Raises an exception if the argument is not a String object or IO object:
# Raises TypeError (no implicit conversion of Symbol into String)
CSV.open(:foo)
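The mode argument also allows writing. The sketch below assumes the standard File mode 'w' and uses a hypothetical helper write_rows and a scratch file out.csv; it relies on the block form to close the CSV object when the block exits.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper (not part of CSV): write rows with mode 'w',
# then return the text that ended up in the file.
def write_rows(rows)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'out.csv')
    # The block form closes the CSV object (and its file) when the block exits.
    CSV.open(path, 'w') do |csv|
      rows.each {|row| csv << row }
    end
    File.read(path)
  end
end

write_rows([['foo', 0], ['bar', 1]]) # => "foo,0\nbar,1\n"
```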
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1731
def parse(str, **options, &block)
csv = new(str, **options)
return csv.each(&block) if block_given?
# slurp contents, if no block is given
begin
csv.read
ensure
csv.close
end
end
Parses string or io using the specified options.
- Argument string should be a String object; it will be put into a new StringIO object positioned at the beginning.
- Argument io should be an IO object that is:
  - Open for reading; on return, the IO object will be closed.
  - Positioned at the beginning. To position at the end, for appending, use method CSV.generate. For any other positioning, pass a preset StringIO object instead.
- Argument options: see Options for Parsing.
Without Option headers
The examples in this section do not use option headers.
These examples assume prior execution of:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
With no block given, returns an Array of Arrays formed from the source.
Parse a String:
a_of_a = CSV.parse(string)
a_of_a # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Parse an open File:
a_of_a = File.open(path) do |file|
  CSV.parse(file)
end
a_of_a # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
With a block given, calls the block with each parsed row:
Parse a String:
CSV.parse(string) {|row| p row }
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
Parse an open File:
File.open(path) do |file|
  CSV.parse(file) {|row| p row }
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
With Option headers
The examples in this section use option headers.
These examples assume prior execution of:
string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
With no block given, returns a CSV::Table object formed from the source. Because the headers are given here as an Array, the first line of the source is parsed as a data row.
Parse a String:
csv_table = CSV.parse(string, headers: ['Name', 'Count'])
csv_table # => #<CSV::Table mode:col_or_row row_count:5>
Parse an open File:
csv_table = File.open(path) do |file|
  CSV.parse(file, headers: ['Name', 'Count'])
end
csv_table # => #<CSV::Table mode:col_or_row row_count:5>
With a block given, calls the block with each parsed row, which has been formed into a CSV::Row object:
Parse a String:
CSV.parse(string, headers: ['Name', 'Count']) {|row| p row }
Output:
#<CSV::Row "Name":"Name" "Count":"Count">
#<CSV::Row "Name":"foo" "Count":"0">
#<CSV::Row "Name":"bar" "Count":"1">
#<CSV::Row "Name":"baz" "Count":"2">
Parse an open File:
File.open(path) do |file|
  CSV.parse(file, headers: ['Name', 'Count']) {|row| p row }
end
Output:
#<CSV::Row "Name":"Name" "Count":"Count">
#<CSV::Row "Name":"foo" "Count":"0">
#<CSV::Row "Name":"bar" "Count":"1">
#<CSV::Row "Name":"baz" "Count":"2">
Raises an exception if the argument is not a String object or IO object:
# Raises NoMethodError (undefined method `close' for :foo:Symbol)
CSV.parse(:foo)
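The two common forms of option headers behave differently with respect to the first line of the data. A small sketch, using a shortened sample string chosen for this illustration:

```ruby
require 'csv'

string = "Name,Count\nfoo,0\nbar,1\n"

# headers: true takes the header names from the first line of the data.
from_data = CSV.parse(string, headers: true)
from_data.headers      # => ["Name", "Count"]
from_data.size         # => 2
from_data['Name']      # => ["foo", "bar"]

# An Array supplies the header names, so the first line stays a data row.
supplied = CSV.parse(string, headers: ['Name', 'Count'])
supplied.size          # => 3
supplied.first.fields  # => ["Name", "Count"]
```

If the data already begins with a header line, headers: true is usually the form that avoids a stray first row.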
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1804
def parse_line(line, **options)
new(line, **options).each.first
end
Returns the data created by parsing the first line of string or io using the specified options.
- Argument string should be a String object; it will be put into a new StringIO object positioned at the beginning.
- Argument io should be an IO object that is:
  - Open for reading; on return, the IO object will be closed.
  - Positioned at the beginning. To position at the end, for appending, use method CSV.generate. For any other positioning, pass a preset StringIO object instead.
- Argument options: see Options for Parsing.
Without Option headers
Without option headers, returns the first row as a new Array.
These examples assume prior execution of:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
Parse the first line from a String object:
CSV.parse_line(string) # => ["foo", "0"]
Parse the first line from a File object:
File.open(path) do |file|
  CSV.parse_line(file) # => ["foo", "0"]
end # => ["foo", "0"]
Returns nil if the argument is an empty String:
CSV.parse_line('') # => nil
With Option headers
With option headers, returns the first row as a CSV::Row object.
These examples assume prior execution of:
string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
Parse the first line from a String object:
CSV.parse_line(string, headers: true) # => #<CSV::Row "Name":"foo" "Count":"0">
Parse the first line from a File object:
File.open(path) do |file|
  CSV.parse_line(file, headers: true)
end # => #<CSV::Row "Name":"foo" "Count":"0">
Raises an exception if the argument is nil:
# Raises ArgumentError (Cannot parse nil as CSV):
CSV.parse_line(nil)
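Two further cases may be worth keeping in mind; the sample strings below are made up for this sketch. Parsing options apply as they do for CSV.parse, and the "first line" is really the first row, which can span more than one physical line when a quoted field contains a row separator.

```ruby
require 'csv'

# Options apply as they do for CSV.parse; here a different column separator.
CSV.parse_line("foo;0\nbar;1\n", col_sep: ';') # => ["foo", "0"]

# A quoted field may contain the row separator, so the first row
# can cover more than one physical line of input.
CSV.parse_line("\"a\nb\",1\nc,2\n") # => ["a\nb", "1"]
```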
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1828
def read(path, **options)
open(path, **options) { |csv| csv.read }
end
Opens the given source with the given options (see CSV.open), reads the source (see CSV#read), and returns the result, which will be either an Array of Arrays or a CSV::Table.
Without headers:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
With headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path, headers: true) # => #<CSV::Table mode:col_or_row row_count:4>
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1836
def readlines(path, **options)
read(path, **options)
end
Alias for CSV.read.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 1855
def table(path, **options)
default_options = {
headers: true,
converters: :numeric,
header_converters: :symbol,
}
options = default_options.merge(options)
read(path, **options)
end
Calls CSV.read with source, options, and certain default options:
- headers: true
- converters: :numeric
- header_converters: :symbol
Returns a CSV::Table object.
Example:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.table(path) # => #<CSV::Table mode:col_or_row row_count:4>
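The effect of the three default options shows up in the returned table's contents. A minimal sketch, using a hypothetical helper load_table that writes the text to a scratch file before calling CSV.table:

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper (not part of CSV): write text to a scratch file
# and load it with CSV.table.
def load_table(text)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 't.csv')
    File.write(path, text)
    CSV.table(path)
  end
end

table = load_table("Name,Value\nfoo,0\nbar,1\nbaz,2\n")
table.headers # => [:name, :value]         (header_converters: :symbol)
table[:value] # => [0, 1, 2]               (converters: :numeric)
table[:name]  # => ["foo", "bar", "baz"]
```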
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2371
def <<(row)
writer << row
self
end
Appends a row to self.
- Argument row must be an Array object or a CSV::Row object.
- The output stream must be open for writing.
Append Arrays:
CSV.generate do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
  csv << ['baz', 2]
end # => "foo,0\nbar,1\nbaz,2\n"
Append CSV::Rows:
headers = []
CSV.generate do |csv|
  csv << CSV::Row.new(headers, ['foo', 0])
  csv << CSV::Row.new(headers, ['bar', 1])
  csv << CSV::Row.new(headers, ['baz', 2])
end # => "foo,0\nbar,1\nbaz,2\n"
Headers in CSV::Row objects are not appended:
headers = ['Name', 'Count']
CSV.generate do |csv|
  csv << CSV::Row.new(headers, ['foo', 0])
  csv << CSV::Row.new(headers, ['bar', 1])
  csv << CSV::Row.new(headers, ['baz', 2])
end # => "foo,0\nbar,1\nbaz,2\n"
Raises an exception if row is not an Array or CSV::Row:
CSV.generate do |csv|
  # Raises NoMethodError (undefined method `collect' for :foo:Symbol)
  csv << :foo
end
Raises an exception if the output stream is not opened for writing:
path = 't.csv'
File.write(path, '')
File.open(path) do |file|
  CSV.open(file) do |csv|
    # Raises IOError (not opened for writing)
    csv << ['foo', 0]
  end
end
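Since headers carried by CSV::Row objects are not written, one way to get a header line into the output is to pass the generating options headers and write_headers. A minimal sketch:

```ruby
require 'csv'

out = CSV.generate(headers: ['Name', 'Count'], write_headers: true) do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
end
out # => "Name,Count\nfoo,0\nbar,1\n"
```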
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2260
def binmode?
if @io.respond_to?(:binmode?)
@io.binmode?
else
false
end
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2815
def build_fields_converter(initial_converters, options)
fields_converter = FieldsConverter.new(options)
normalize_converters(initial_converters).each do |name, converter|
fields_converter.add_converter(name, &converter)
end
fields_converter
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2797
def build_header_fields_converter
specific_options = {
builtin_converters_name: :HeaderConverters,
accept_nil: true,
}
options = @base_fields_converter_options.merge(specific_options)
build_fields_converter(@initial_header_converters, options)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2785
def build_parser_fields_converter
specific_options = {
builtin_converters_name: :Converters,
}
options = @base_fields_converter_options.merge(specific_options)
build_fields_converter(@initial_converters, options)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2810
def build_writer_fields_converter
build_fields_converter(@initial_write_converters,
@write_fields_converter_options)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2008
def col_sep
parser.column_separator
end
Returns the encoded column separator; used for parsing and writing; see Option col_sep:
CSV.new('').col_sep # => ","
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2442
def convert(name = nil, &converter)
parser_fields_converter.add_converter(name, &converter)
end
- With no block, installs a field converter (a Proc).
- With a block, defines and installs a custom field converter.
- Returns the Array of installed field converters.
- Argument converter_name, if given, should be the name of an existing field converter.
See Field Converters.
With no block, installs a field converter:
csv = CSV.new('')
csv.convert(:integer)
csv.convert(:float)
csv.convert(:date)
csv.converters # => [:integer, :float, :date]
The block, if given, is called for each field:
- Argument field is the field value.
- Argument field_info is a CSV::FieldInfo object containing details about the field.
The examples here assume the prior execution of:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
Example giving a block:
csv = CSV.open(path)
csv.convert {|field, field_info| p [field, field_info]; field.upcase }
csv.read # => [["FOO", "0"], ["BAR", "1"], ["BAZ", "2"]]
Output:
["foo", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
["0", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
["bar", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
["1", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
["baz", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
["2", #<struct CSV::FieldInfo index=1, line=3, header=nil>]
The block need not return a String object:
csv = CSV.open(path)
csv.convert {|field, field_info| field.to_sym }
csv.read # => [[:foo, :"0"], [:bar, :"1"], [:baz, :"2"]]
If converter_name is given, the block is not called:
csv = CSV.open(path)
csv.convert(:integer) {|field, field_info| fail 'Cannot happen' }
csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
Raises a parse-time exception if converter_name is not the name of a built-in field converter:
csv = CSV.open(path)
csv.convert(:nosuch) # => [nil]
# Raises NoMethodError (undefined method `arity' for nil:NilClass)
csv.read
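The field_info argument makes it possible to convert selectively. The sketch below, with sample data made up for the illustration, converts only the second column (index 1) and leaves the other fields as they are:

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\n")
# Convert only the field at index 1; return every other field unchanged.
csv.convert do |field, field_info|
  field_info.index == 1 ? Integer(field) : field
end
csv.read # => [["foo", 0], ["bar", 1]]
```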
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2760
def convert_fields(fields, headers = false)
if headers
header_fields_converter.convert(fields, nil, 0)
else
parser_fields_converter.convert(fields, @headers, lineno)
end
end
Processes fields with @converters, or @header_converters if headers is passed as true, returning the converted field set. Any converter that changes the field into something other than a String halts the pipeline of conversion for that field. This is primarily an efficiency shortcut.
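The halting behavior can be observed through the public API by installing two converters with CSV#convert. In this sketch the second converter would fail on an Integer if it were ever called with one, so a successful parse suggests that it was skipped for the converted field:

```ruby
require 'csv'

csv = CSV.new("foo,0\n")
csv.convert(:integer)               # "0" becomes the Integer 0; "foo" stays a String
csv.convert {|field| field.upcase } # reached only for fields that are still Strings
csv.read # => [["FOO", 0]]
```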
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2081
def converters
parser_fields_converter.map do |converter|
name = Converters.rassoc(converter)
name ? name.first : converter
end
end
Returns an Array containing field converters; see Field Converters:
csv = CSV.new('')
csv.converters # => []
csv.convert(:integer)
csv.converters # => [:integer]
csv.convert(proc {|x| x.to_s })
csv.converters
Note that you need to call Ractor.make_shareable(CSV::Converters) on the main Ractor to use this method.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2723
def determine_encoding(encoding, internal_encoding)
# honor the IO encoding if we can, otherwise default to ASCII-8BIT
io_encoding = raw_encoding
return io_encoding if io_encoding
return Encoding.find(internal_encoding) if internal_encoding
if encoding
encoding, = encoding.split(":", 2) if encoding.is_a?(String)
return Encoding.find(encoding)
end
Encoding.default_internal || Encoding.default_external
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2553
def each(&block)
parser_enumerator.each(&block)
end
Calls the block with each successive row. The data source must be opened for reading.
Without headers:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.each do |row|
  p row
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
With headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string, headers: true)
csv.each do |row|
  p row
end
Output:
#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">
Raises an exception if the source is not opened for reading:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.close
# Raises IOError (not opened for reading)
csv.each do |row|
  p row
end
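Because CSV includes Enumerable, methods such as select and map are available and are driven by CSV#each. A brief sketch:

```ruby
require 'csv'

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string, headers: true)
# Keep the rows whose Value is positive, then collect their names.
names = csv.select {|row| row['Value'].to_i > 0 }.map {|row| row['Name'] }
names # => ["bar", "baz"]
```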
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2296
def eof?
return false if @eof_error
begin
parser_enumerator.peek
false
rescue MalformedCSVError => error
@eof_error = error
false
rescue StopIteration
true
end
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2040
def field_size_limit
parser.field_size_limit
end
Returns the limit for field size; used for parsing; see Option field_size_limit:
CSV.new('').field_size_limit # => nil
Deprecated since 3.2.3. Use max_field_size instead.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2268
def flock(*args)
raise NotImplementedError unless @io.respond_to?(:flock)
@io.flock(*args)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2171
def force_quotes?
@writer_options[:force_quotes]
end
Returns the value that determines whether all output fields are to be quoted; used for generating; see Option force_quotes:
CSV.new('').force_quotes? # => false
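For comparison, a short sketch of the same row generated with and without option force_quotes:

```ruby
require 'csv'

plain  = CSV.generate {|csv| csv << ['foo', 0] }
quoted = CSV.generate(force_quotes: true) {|csv| csv << ['foo', 0] }
plain  # => "foo,0\n"
quoted # => "\"foo\",\"0\"\n"
```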
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2508
def header_convert(name = nil, &converter)
header_fields_converter.add_converter(name, &converter)
end
The block need not return a String object:
csv = CSV.open(path, headers: true)
csv.header_convert {|header, field_info| header.to_sym }
table = csv.read
table.headers # => [:Name, :Value]
If converter_name is given, the block is not called:
csv = CSV.open(path, headers: true)
csv.header_convert(:downcase) {|header, field_info| fail 'Cannot happen' }
table = csv.read
table.headers # => ["name", "value"]
Raises a parse-time exception if converter_name is not the name of a built-in header converter:
csv = CSV.open(path, headers: true)
csv.header_convert(:nosuch)
# Raises NoMethodError (undefined method `arity' for nil:NilClass)
csv.read
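A self-contained sketch of a custom header converter, using sample data invented for the illustration; the block downcases each header and replaces spaces with underscores:

```ruby
require 'csv'

csv = CSV.new("First Name,Value\nfoo,0\n", headers: true)
# Custom header converter: downcase, then replace spaces with underscores.
csv.header_convert {|header| header.downcase.tr(' ', '_') }
csv.read.headers # => ["first_name", "value"]
```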
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2147
def header_converters
header_fields_converter.map do |converter|
name = HeaderConverters.rassoc(converter)
name ? name.first : converter
end
end
Returns an Array containing header converters; used for parsing; see Header Converters:
CSV.new('').header_converters # => []
Note that you need to call Ractor.make_shareable(CSV::HeaderConverters) on the main Ractor to use this method.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2793
def header_fields_converter
@header_fields_converter ||= build_header_fields_converter
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2624
def header_row?
parser.header_row?
end
Returns true if the next row to be read is a header row; false otherwise.
Without headers:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.header_row? # => false
With headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string, headers: true)
csv.header_row? # => true
csv.shift # => #<CSV::Row "Name":"foo" "Value":"0">
csv.header_row? # => false
Raises an exception if the source is not opened for reading:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.close
# Raises IOError (not opened for reading)
csv.header_row?
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2105
def headers
if @writer
@writer.headers
else
parsed_headers = parser.headers
return parsed_headers if parsed_headers
raw_headers = @parser_options[:headers]
raw_headers = nil if raw_headers == false
raw_headers
end
end
Returns the value that determines whether headers are used; used for parsing; see Option headers:
CSV.new('').headers # => nil
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2683
def inspect
str = ["#<", self.class.to_s, " io_type:"]
# show type of wrapped IO
if @io == $stdout then str << "$stdout"
elsif @io == $stdin then str << "$stdin"
elsif @io == $stderr then str << "$stderr"
else str << @io.class.to_s
end
# show IO.path(), if available
if @io.respond_to?(:path) and (p = @io.path)
str << " io_path:" << p.inspect
end
# show encoding
str << " encoding:" << @encoding.name
# show other attributes
["lineno", "col_sep", "row_sep", "quote_char"].each do |attr_name|
if a = __send__(attr_name)
str << " " << attr_name << ":" << a.inspect
end
end
["skip_blanks", "liberal_parsing"].each do |attr_name|
if a = __send__("#{attr_name}?")
str << " " << attr_name << ":" << a.inspect
end
end
_headers = headers
str << " headers:" << _headers.inspect if _headers
str << ">"
begin
str.join('')
rescue # any encoding error
str.map do |s|
e = Encoding::Converter.asciicompat_encoding(s.encoding)
e ? s.encode(e) : s.force_encoding("ASCII-8BIT")
end.join('')
end
end
Returns a String showing certain properties of self:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string, headers: true)
s = csv.inspect
s # => "#<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:\",\" row_sep:\"\\n\" quote_char:\"\\\"\" headers:true>"
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2273
def ioctl(*args)
raise NotImplementedError unless @io.respond_to?(:ioctl)
@io.ioctl(*args)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2181
def liberal_parsing?
parser.liberal_parsing?
end
Returns the value that determines whether illegal input is to be handled; used for parsing; see Option liberal_parsing:
CSV.new('').liberal_parsing? # => false
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2246
def line
parser.line
end
Returns the line most recently read:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.open(path) do |csv|
  csv.each do |row|
    p [csv.lineno, csv.line]
  end
end
Output:
[1, "foo,0\n"]
[2, "bar,1\n"]
[3, "baz,2\n"]
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2222
def lineno
if @writer
@writer.lineno
else
parser.lineno
end
end
Returns the count of the rows parsed or generated.
Parsing:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.open(path) do |csv|
  csv.each do |row|
    p [csv.lineno, row]
  end
end
Output:
[1, ["foo", "0"]]
[2, ["bar", "1"]]
[3, ["baz", "2"]]
Generating:
CSV.generate do |csv|
  p csv.lineno; csv << ['foo', 0]
  p csv.lineno; csv << ['bar', 1]
  p csv.lineno; csv << ['baz', 2]
end
Output:
0
1
2
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2052
def max_field_size
parser.max_field_size
end
Returns the limit for field size; used for parsing; see Option max_field_size:
CSV.new('').max_field_size # => nil
Since 3.2.3.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2738
def normalize_converters(converters)
converters ||= []
unless converters.is_a?(Array)
converters = [converters]
end
converters.collect do |converter|
case converter
when Proc # custom code block
[nil, converter]
else # by name
[converter, nil]
end
end
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2823
def parser
@parser ||= Parser.new(@io, parser_options)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2832
def parser_enumerator
@parser_enumerator ||= parser.parse
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2781
def parser_fields_converter
@parser_fields_converter ||= build_parser_fields_converter
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2827
def parser_options
@parser_options.merge(header_fields_converter: header_fields_converter,
fields_converter: parser_fields_converter)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2278
def path
@io.path if @io.respond_to?(:path)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2028
def quote_char
parser.quote_character
end
Returns the encoded quote character; used for parsing and writing; see Option quote_char:
CSV.new('').quote_char # => "\""
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2771
def raw_encoding
if @io.respond_to? :internal_encoding
@io.internal_encoding || @io.external_encoding
elsif @io.respond_to? :encoding
@io.encoding
else
nil
end
end
Returns the encoding of the internal IO object.
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2588
def read
rows = to_a
if parser.use_headers?
Table.new(rows, headers: parser.headers)
else
rows
end
end
Forms the remaining rows from self into:
- A CSV::Table object, if headers are in use.
- An Array of Arrays, otherwise.
The data source must be opened for reading.
Without headers:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
csv = CSV.open(path)
csv.read # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
With headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
csv = CSV.open(path, headers: true)
csv.read # => #<CSV::Table mode:col_or_row row_count:4>
Raises an exception if the source is not opened for reading:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.close
# Raises IOError (not opened for reading)
csv.read
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2123
def return_headers?
parser.return_headers?
end
Returns the value that determines whether headers are to be returned; used for parsing; see Option return_headers:
CSV.new('').return_headers? # => false
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2311
def rewind
@parser = nil
@parser_enumerator = nil
@eof_error = nil
@writer.rewind if @writer
@io.rewind
end
Rewinds the underlying IO object and resets the counter returned by CSV#lineno.
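A minimal sketch of reading the same source twice with rewind, using a String source so that the underlying object is a StringIO:

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\n")
first_pass = csv.read   # => [["foo", "0"], ["bar", "1"]]
csv.lineno              # => 2
csv.rewind
csv.lineno              # => 0
second_pass = csv.read  # => [["foo", "0"], ["bar", "1"]]
first_pass == second_pass # => true
```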
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2018
def row_sep
parser.row_separator
end
Returns the encoded row separator; used for parsing and writing; see Option row_sep:
CSV.new('').row_sep # => "\n"
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2661
def shift
if @eof_error
eof_error, @eof_error = @eof_error, nil
raise eof_error
end
begin
parser_enumerator.next
rescue StopIteration
nil
end
end
Returns the next row of data as:
- An Array if no headers are used.
- A CSV::Row object if headers are used.
The data source must be opened for reading.
Without headers:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.shift # => ["foo", "0"]
csv.shift # => ["bar", "1"]
csv.shift # => ["baz", "2"]
csv.shift # => nil
With headers:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string, headers: true)
csv.shift # => #<CSV::Row "Name":"foo" "Value":"0">
csv.shift # => #<CSV::Row "Name":"bar" "Value":"1">
csv.shift # => #<CSV::Row "Name":"baz" "Value":"2">
csv.shift # => nil
Raises an exception if the source is not opened for reading:
string = "foo,0\nbar,1\nbaz,2\n"
csv = CSV.new(string)
csv.close
# Raises IOError (not opened for reading)
csv.shift
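Because shift returns nil once the rows are exhausted, it can drive a simple loop. A short sketch that sums the second column:

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\nbaz,2\n")
total = 0
# shift returns nil after the last row, which ends the loop.
while (row = csv.shift)
  total += row[1].to_i
end
total # => 3
```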
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2160
def skip_blanks?
parser.skip_blanks?
end
Returns the value that determines whether blank lines are to be ignored; used for parsing; see Option skip_blanks:
CSV.new('').skip_blanks? # => false
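A brief sketch of the option's effect on parsing, with a sample string that contains one blank line:

```ruby
require 'csv'

string = "foo,0\n\nbar,1\n"
CSV.parse(string)                    # => [["foo", "0"], [], ["bar", "1"]]
CSV.parse(string, skip_blanks: true) # => [["foo", "0"], ["bar", "1"]]
```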
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2062
def skip_lines
parser.skip_lines
end
Returns the Regexp used to identify comment lines; used for parsing; see Option skip_lines:
CSV.new('').skip_lines # => nil
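As an illustration, the sketch below treats lines beginning with # as comments; the choice of # as a comment marker is an assumption made for this example, not a CSV default.

```ruby
require 'csv'

string = "# a comment\nfoo,0\nbar,1\n"
# Lines matching the Regexp are skipped before parsing.
CSV.parse(string, skip_lines: /\A#/) # => [["foo", "0"], ["bar", "1"]]
```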
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2282
def stat(*args)
raise NotImplementedError unless @io.respond_to?(:stat)
@io.stat(*args)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2287
def to_i
raise NotImplementedError unless @io.respond_to?(:to_i)
@io.to_i
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2292
def to_io
@io.respond_to?(:to_io) ? @io.to_io : @io
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2095
def unconverted_fields?
parser.unconverted_fields?
end
Returns the value that determines whether unconverted fields are to be available; used for parsing; see Option unconverted_fields:
CSV.new('').unconverted_fields? # => nil
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2133
def write_headers?
@writer_options[:write_headers]
end
Returns the value that determines whether headers are to be written; used for generating; see Option write_headers:
CSV.new('').write_headers? # => nil
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2836
def writer
@writer ||= Writer.new(@io, writer_options)
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2806
def writer_fields_converter
@writer_fields_converter ||= build_writer_fields_converter
end
# File tmp/rubies/ruby-3.1.3/lib/csv.rb, line 2840
def writer_options
@writer_options.merge(header_fields_converter: header_fields_converter,
fields_converter: writer_fields_converter)
end