
str_getcsv

(PHP 5 >= 5.3.0, PHP 7, PHP 8)

str_getcsv - Parse a CSV string into an array

Description

str_getcsv(
    string $string,
    string $separator = ",",
    string $enclosure = "\"",
    string $escape = "\\"
): array

Parses a string as fields in CSV format and returns an array containing the fields read.

Note: This function takes the locale settings into account. For example, data in certain single-byte encodings may be parsed incorrectly if LC_CTYPE is en_US.UTF-8.

Parameters

string

The string to parse.

separator

The separator parameter sets the field separator. It must be a single single-byte character.

enclosure

The enclosure parameter sets the field enclosure character. It must be a single single-byte character.

escape

The escape parameter sets the escape character. It must be a single single-byte character or the empty string. The empty string ("") disables the proprietary escape mechanism.

Note: Usually an enclosure character is escaped inside a field by doubling it; however, the escape character can be used as an alternative. So for the default parameter values, "" and \" have the same meaning. Other than allowing the enclosure character to be escaped, the escape character has no special meaning; it does not even escape itself.
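The note above can be illustrated with a minimal sketch. The input strings are made up for this example; the point is that doubling works with or without the escape mechanism, and that the escape character is kept in the parsed value.

```php
<?php

// A doubled enclosure inside a quoted field yields one literal quote,
// whether or not the proprietary escape mechanism is enabled.
$doubled = 'a,"b ""quoted"" c",d';
$withoutEscape = str_getcsv($doubled, ',', '"', '');
$withEscape    = str_getcsv($doubled, ',', '"', '\\');

// With escape set to a backslash, \" does not end the field,
// but the backslash itself stays in the parsed value.
$backslashed = str_getcsv('"abc\"abc"', ',', '"', '\\');

var_dump($withoutEscape, $withEscape, $backslashed);
```

Because the backslash is not removed, code that relies on the escape character usually needs its own post-processing step.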

Warning

As of PHP 8.4.0, relying on the default value of escape is deprecated. The value needs to be provided explicitly, either positionally or as a named argument.

Warning

When escape is set to anything other than the empty string (""), the resulting CSV may not comply with RFC 4180 or may not survive a round trip through the PHP CSV functions. The default value of escape is "\\", so it is recommended to set it to the empty string explicitly. The default value will change in a future version of PHP, no earlier than PHP 9.0.
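A minimal sketch of passing escape explicitly, both positionally and as a named argument (named arguments assume PHP 8.0 or later). The sample line, the semicolon separator, and the single-quote enclosure are illustrative choices only.

```php
<?php

$line = "id;'Smith; John';42";

// Positional: string, separator, enclosure, escape.
$positional = str_getcsv($line, ';', "'", '');

// Named arguments: the order no longer matters.
$named = str_getcsv($line, separator: ';', enclosure: "'", escape: '');

var_dump($positional === $named); // both calls parse the line the same way
print_r($named);
```

The named form is handy when only escape differs from the defaults, for example str_getcsv($line, escape: '').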

Return Values

Returns an indexed array containing the fields read.

Errors/Exceptions

Throws a ValueError if separator or enclosure is not exactly one byte long.

Throws a ValueError if escape is neither exactly one byte long nor the empty string.
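A hedged sketch of handling these errors. The helper name try_parse_csv() is hypothetical, and the ValueError branch is only expected to trigger on PHP 8.4.0 and later; according to the changelog, earlier versions did not throw for invalid control characters.

```php
<?php

/**
 * Hypothetical helper: returns the parsed fields, or null when the
 * control characters are rejected with a ValueError (PHP 8.4.0 and later).
 */
function try_parse_csv(
    string $line,
    string $separator = ',',
    string $enclosure = '"',
    string $escape = ''
): ?array {
    try {
        return str_getcsv($line, $separator, $enclosure, $escape);
    } catch (ValueError $e) {
        // Invalid separator, enclosure, or escape.
        return null;
    }
}

var_dump(try_parse_csv('a,b,c'));       // valid control characters
var_dump(try_parse_csv('a,b,c', ',,')); // two-byte separator
```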

Changelog

Version Description
8.4.0 Relying on the default value of escape is now deprecated.
8.4.0 Now throws a ValueError for an invalid separator, enclosure, or escape, mimicking the behavior of fgetcsv() and fputcsv().
8.3.0 An empty string is now returned instead of a string with a single null byte if the last field contains only an unterminated enclosure.
7.4.0 An empty escape is now interpreted as a request to disable the proprietary escape mechanism. Previously, an empty string was treated like the parameter's default value.

Examples

Example #1 Parsing a CSV string into an array with str_getcsv()

<?php

$string = 'PHP,Java,Python,Kotlin,Swift';
$data = str_getcsv($string);

var_dump($data);

?>

The above example will output:

array(5) {
  [0]=>
  string(3) "PHP"
  [1]=>
  string(4) "Java"
  [2]=>
  string(6) "Python"
  [3]=>
  string(6) "Kotlin"
  [4]=>
  string(5) "Swift"
}

Example #2 str_getcsv() with an empty string

Caution

For an empty string, this function returns [null] rather than an empty array.

<?php

$string = '';
$data = str_getcsv($string);

var_dump($data);

?>

The above example will output:

array(1) {
  [0]=>
  NULL
}
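If an empty array is easier to work with, the [null] result can be normalized in a small wrapper. This is a sketch only; the name csv_fields() is hypothetical and the comma, double-quote, and empty-escape settings are assumptions.

```php
<?php

/**
 * Hypothetical wrapper: like str_getcsv(), but an empty input line
 * yields an empty array instead of [null].
 */
function csv_fields(string $line): array
{
    $fields = str_getcsv($line, ',', '"', '');

    return $fields === [null] ? [] : $fields;
}

var_dump(csv_fields(''));
var_dump(csv_fields('PHP,Java'));
```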

See Also

  • fputcsv() - Formats a line as CSV and writes it to a file pointer
  • fgetcsv() - Gets a line from a file pointer and parses it for CSV fields
  • SplFileObject::fgetcsv() - Gets a line from a file pointer and parses it for CSV fields
  • SplFileObject::fputcsv() - Writes an array of fields as a CSV line
  • SplFileObject::setCsvControl() - Sets the separator, enclosure, and escape characters for CSV fields
  • SplFileObject::getCsvControl() - Gets the separator, enclosure, and escape characters for CSV fields

User Contributed Notes

up
500
james at moss dot io
10 years ago
[Editor's Note (cmb): that does not produce the desired results, if fields contain linebreaks.]

Handy one liner to parse a CSV file into an array

<?php

$csv = array_map('str_getcsv', file('data.csv'));

?>
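As the editor's note says, splitting the file into lines first breaks quoted fields that contain line breaks. One possible workaround, sketched here under the assumption that the whole document fits in memory, is to write the string to a temporary stream and let fgetcsv() find the record boundaries. The helper name csv_string_to_rows() is hypothetical.

```php
<?php

/**
 * Hypothetical helper: parses a whole CSV document held in a string,
 * including quoted fields that contain line breaks.
 */
function csv_string_to_rows(string $csv, string $separator = ','): array
{
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $csv);
    rewind($stream);

    $rows = [];
    // Length 0 means "no line length limit"; escape '' disables the
    // proprietary escape mechanism.
    while (($row = fgetcsv($stream, 0, $separator, '"', '')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $row;
    }
    fclose($stream);

    return $rows;
}

$rows = csv_string_to_rows("name,comment\nAnn,\"first line\nsecond line\"\n");
print_r($rows);
```

For a file on disk, the same loop can read from fopen('data.csv', 'r') directly instead of the temporary stream.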
up
185
starrychloe at oliveyou dot net
9 years ago
Based on James' line, this will create an array of associative arrays with the first row column headers as the keys.

<?php
$csv = array_map('str_getcsv', file($file));
array_walk($csv, function (&$a) use ($csv) {
    $a = array_combine($csv[0], $a);
});
array_shift($csv); # remove column header
?>

This will yield something like
[2] => Array
(
[Campaign ID] => 295095038
[Ad group ID] => 22460178158
[Keyword ID] => 3993587178
up
142
durik at 3ilab dot net
14 years ago
Because str_getcsv(), unlike fgetcsv(), does not split a CSV string into rows, I found the following easy workaround:

<?php
$Data = str_getcsv($CsvString, "\n"); //parse the rows
foreach ($Data as &$Row) $Row = str_getcsv($Row, ";"); //parse the items in rows
?>

Why not use explode() instead of str_getcsv() to parse rows? Because explode() would not treat enclosed parts of the string or escaped characters correctly.
up
13
sven at e7o dot de
9 years ago
PHP fails when parsing UTF-8 with a byte order mark (BOM). Strip it from the string with this snippet before passing it to the CSV parser:

<?php
$bom = pack('CCC', 0xEF, 0xBB, 0xBF);
$body = $yourString; // without a BOM the input is used unchanged
if (strncmp($yourString, $bom, 3) === 0) {
    $body = substr($yourString, 3);
}
?>
up
43
normadize -a- gmail -d- com
11 years ago
Like some other users here noted, str_getcsv() cannot be used if you want to comply with either the RFC or with most spreadsheet tools like Excel or Google Docs.

These tools do not escape commas or new lines, but instead place double-quotes (") around the field. If there are any double-quotes in the field, these are escaped with another double-quote (" becomes ""). All this may look odd, but it is what the RFC and most tools do ...

For instance, try exporting as .csv a Google Docs spreadsheet (File > Download as > .csv) which has new lines and commas as part of the field values and see how the .csv content looks, then try to parse it using str_getcsv() ... it will fail spectacularly regardless of the arguments you pass to it.

Here is a function that can handle everything correctly, and more:

- doesn't use any for or while loops,
- it allows for any separator (any string of any length),
- option to skip empty lines,
- option to trim fields,
- can handle UTF8 data too (although .csv files are likely non-unicode).

Here is the more human readable version of the function:

<?php

// returns a two-dimensional array of rows and fields

function parse_csv($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
    $enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string);
    $enc = preg_replace_callback(
        '/"(.*?)"/s',
        function ($field) {
            return urlencode(utf8_encode($field[1]));
        },
        $enc
    );
    $lines = preg_split($skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s', $enc);
    return array_map(
        function ($line) use ($delimiter, $trim_fields) {
            $fields = $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line);
            return array_map(
                function ($field) {
                    return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
                },
                $fields
            );
        },
        $lines
    );
}

?>

Since this is not using any loops, you can actually write it as a one-line statement (one-liner).

Here's the function using just one line of code for the function body, formatted nicely though:

<?php

// returns the same two-dimensional array as above, but with a one-liner code

function parse_csv($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
    return array_map(
        function ($line) use ($delimiter, $trim_fields) {
            return array_map(
                function ($field) {
                    return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
                },
                $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line)
            );
        },
        preg_split(
            $skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s',
            preg_replace_callback(
                '/"(.*?)"/s',
                function ($field) {
                    return urlencode(utf8_encode($field[1]));
                },
                $enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string)
            )
        )
    );
}

?>

Replace !!Q!! with another placeholder if you wish.

Have fun.
up
85
Jay Williams
14 years ago
Here is a quick and easy way to convert a CSV file to an associative array:

<?php
/**
 * @link http://gist.github.com/385876
 */
function csv_to_array($filename = '', $delimiter = ',')
{
    if (!file_exists($filename) || !is_readable($filename))
        return FALSE;

    $header = NULL;
    $data = array();
    if (($handle = fopen($filename, 'r')) !== FALSE)
    {
        while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
        {
            if (!$header)
                $header = $row;
            else
                $data[] = array_combine($header, $row);
        }
        fclose($handle);
    }
    return $data;
}

?>
up
17
dejiakala at gmail dot com
10 years ago
I wanted the best of the 2 solutions by james at moss dot io and Jay Williams (csv_to_array()) - create associative array from a CSV file with a header row.

<?php

$array = array_map('str_getcsv', file('data.csv'));

$header = array_shift($array);

array_walk($array, '_combine_array', $header);

function _combine_array(&$row, $key, $header) {
    $row = array_combine($header, $row);
}

?>

Then I thought why not try some benchmarking? I grabbed a sample CSV file with 50,000 rows (10 columns each) and Vulcan Logic Disassembler (VLD) which hooks into the Zend Engine and dumps all the opcodes (execution units) of a script - see http://pecl.php.net/package/vld and example here: http://fabien.potencier.org/article/8/print-vs-echo-which-one-is-faster

Result:

array_walk() and array_map() - 39 opcodes
csv_to_array() - 69 opcodes
up
3
daniel dot oconnor at gmail dot com
15 years ago
Don't have this? Ask fgetcsv() to do it for you.

5.1.0+

<?php
if (!function_exists('str_getcsv')) {
    function str_getcsv($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
        $fiveMBs = 5 * 1024 * 1024;
        $fp = fopen("php://temp/maxmemory:$fiveMBs", 'r+');
        fputs($fp, $input);
        rewind($fp);

        $data = fgetcsv($fp, 1000, $delimiter, $enclosure); // $escape only got added in 5.3.0

        fclose($fp);
        return $data;
    }
}
?>
up
25
Ryan Rubley
11 years ago
@normadize - that is a nice start, but it fails on situations where a field is empty but quoted (returning a string with one double quote instead of an empty string) and cases like """""foo""""" that should result in ""foo"" but instead return "foo". I also get a row with 1 empty field at the end because of the final CRLF in the CSV. Plus, I don't really like the !!Q!! magic or urlencoding to get around things. Also, \R doesn't work in pcre on any of my php installations.

Here is my take on this, without anonymous functions (so it works on PHP < 5.3), and without your options (because I believe the only correct way to parse according to the RFC would be $skip_empty_lines = false and $trim_fields = false).

//parse a CSV file into a two-dimensional array
//this seems as simple as splitting a string by lines and commas, but this only works if tricks are performed
//to ensure that you do NOT split on lines and commas that are inside of double quotes.
function parse_csv($str)
{
//match all the non-quoted text and one series of quoted text (or the end of the string)
//each group of matches will be parsed with the callback, with $matches[1] containing all the non-quoted text,
//and $matches[3] containing everything inside the quotes
$str = preg_replace_callback('/([^"]*)("((""|[^"])*)"|$)/s', 'parse_csv_quotes', $str);

//remove the very last newline to prevent a 0-field array for the last line
$str = preg_replace('/\n$/', '', $str);

//split on LF and parse each line with a callback
return array_map('parse_csv_line', explode("\n", $str));
}

//replace all the csv-special characters inside double quotes with markers using an escape sequence
function parse_csv_quotes($matches)
{
//anything inside the quotes that might be used to split the string into lines and fields later,
//needs to be quoted. The only character we can guarantee as safe to use, because it will never appear in the unquoted text, is a CR
//So we're going to use CR as a marker to make escape sequences for CR, LF, Quotes, and Commas.
$str = str_replace("\r", "\rR", $matches[3]);
$str = str_replace("\n", "\rN", $str);
$str = str_replace('""', "\rQ", $str);
$str = str_replace(',', "\rC", $str);

//The unquoted text is where commas and newlines are allowed, and where the splits will happen
//We're going to remove all CRs from the unquoted text, by normalizing all line endings to just LF
//This ensures us that the only place CR is used, is as the escape sequences for quoted text
return preg_replace('/\r\n?/', "\n", $matches[1]) . $str;
}

//split on comma and parse each field with a callback
function parse_csv_line($line)
{
return array_map('parse_csv_field', explode(',', $line));
}

//restore any csv-special characters that are part of the data
function parse_csv_field($field) {
$field = str_replace("\rC", ',', $field);
$field = str_replace("\rQ", '"', $field);
$field = str_replace("\rN", "\n", $field);
$field = str_replace("\rR", "\r", $field);
return $field;
}
up
3
V.Krishn
11 years ago
Note: The function trims all values unlike str_getcsv (v5.3).

<?php
/**
 * @link https://github.com/insteps/phputils (for updated code)
 * Parse a CSV string into an array for php 4+.
 * @param string $input String
 * @param string $delimiter String
 * @param string $enclosure String
 * @return array
 */
function str_getcsv4($input, $delimiter = ',', $enclosure = '"') {

    if (!preg_match("/[$enclosure]/", $input)) {
        return (array) preg_replace(array("/^\\s*/", "/\\s*$/"), '', explode($delimiter, $input));
    }

    $token = "##"; $token2 = "::";
    //alternate tokens "\034\034", "\035\035", "%%";
    $t1 = preg_replace(array("/\\\[$enclosure]/", "/$enclosure{2}/",
        "/[$enclosure]\\s*[$delimiter]\\s*[$enclosure]\\s*/", "/\\s*[$enclosure]\\s*/"),
        array($token2, $token2, $token, $token), trim(trim(trim($input), $enclosure)));

    $a = explode($token, $t1);
    foreach ($a as $k => $v) {
        if (preg_match("/^{$delimiter}/", $v) || preg_match("/{$delimiter}$/", $v)) {
            $a[$k] = trim($v, $delimiter);
            $a[$k] = preg_replace("/$delimiter/", "$token", $a[$k]);
        }
    }
    $a = explode($token, implode($token, $a));
    return (array) preg_replace(array("/^\\s/", "/\\s$/", "/$token2/"), array('', '', $enclosure), $a);

}

if (!function_exists('str_getcsv')) {
    function str_getcsv($input, $delimiter = ',', $enclosure = '"') {
        return str_getcsv4($input, $delimiter, $enclosure);
    }
}
?>
up
7
Jeremy
15 years ago
After using several methods in the past to create CSV strings without using files (disk IO sucks), I finally decided it's time to write a function to handle it all. This function could use some cleanup, and the variable type test might be overkill for what is needed, I haven't thought about it too much.

Also, I took the liberty of replacing fields with certain data types with strings which I find much easier to work with. Some of you may not agree with those. Also, please note that the type "double" or float has been coded specifically for two digit precision because if I am using a float, it's most likely for currency.

I am sure some of you out there would appreciate this function.

<?php
function str_putcsv($array, $delimiter = ',', $enclosure = '"', $terminator = "\n") {
    # First convert associative array to numeric indexed array
    $workArray = array(); # Initialize so an empty input array is handled too
    foreach ($array as $key => $value) $workArray[] = $value;

    $returnString = ''; # Initialize return string
    $arraySize = count($workArray); # Get size of array

    for ($i=0; $i<$arraySize; $i++) {
        # Nested array, process nest item
        if (is_array($workArray[$i])) {
            $returnString .= str_putcsv($workArray[$i], $delimiter, $enclosure, $terminator);
        } else {
            switch (gettype($workArray[$i])) {
                # Manually set some strings
                case "NULL": $_spFormat = ''; break;
                case "boolean": $_spFormat = ($workArray[$i] == true) ? 'true': 'false'; break;
                # Make sure sprintf has a good datatype to work with
                # (%d, because %i is not a valid sprintf() specifier in PHP)
                case "integer": $_spFormat = '%d'; break;
                case "double": $_spFormat = '%0.2f'; break;
                case "string": $_spFormat = '%s'; break;
                # Unknown or invalid items for a csv - note: the datatype of array is already handled above, assuming the data is nested
                case "object":
                case "resource":
                default:
                    $_spFormat = ''; break;
            }
            $returnString .= sprintf('%2$s'.$_spFormat.'%2$s', $workArray[$i], $enclosure);
            $returnString .= ($i < ($arraySize-1)) ? $delimiter : $terminator;
        }
    }
    # Done the workload, return the output information
    return $returnString;
}

?>
up
3
keananda at gmail dot com
16 years ago
For those who need this function but do not have it installed in their environment yet, you can use my function below.

You can parse your csv file into an associative array (by default) for each line, or into an object.
<?php
function parse_csv($file, $options = null) {
    $delimiter = empty($options['delimiter']) ? "," : $options['delimiter'];
    $to_object = empty($options['to_object']) ? false : true;
    $str = file_get_contents($file);
    $lines = explode("\n", $str);
    pr($lines); // debug helper that is not defined in this snippet; remove this call if pr() is unavailable
    $field_names = explode($delimiter, array_shift($lines));
    foreach ($lines as $line) {
        // Skip the empty line
        if (empty($line)) continue;
        $fields = explode($delimiter, $line);
        $_res = $to_object ? new stdClass : array();
        foreach ($field_names as $key => $f) {
            if ($to_object) {
                $_res->{$f} = $fields[$key];
            } else {
                $_res[$f] = $fields[$key];
            }
        }
        $res[] = $_res;
    }
    return $res;
}
?>

NOTE:
Line number 1 of the csv file will be considered as header (field names).

TODO:
- Enclosure handling
- Escape character handling
- Other features/enhancements as you need

EXAMPLE USE:
Content of /path/to/file.csv:
CODE,COUNTRY
AD,Andorra
AE,United Arab Emirates
AF,Afghanistan
AG,Antigua and Barbuda

<?php
$arr_csv = parse_csv("/path/to/file.csv");
print_r($arr_csv);
?>
// Output:
Array
(
[0] => Array
(
[CODE] => AD
[COUNTRY] => Andorra
)
[1] => Array
(
[CODE] => AE
[COUNTRY] => United Arab Emirates
)
[2] => Array
(
[CODE] => AF
[COUNTRY] => Afghanistan
)
[3] => Array
(
[CODE] => AG
[COUNTRY] => Antigua and Barbuda
)
)

<?php
$obj_csv = parse_csv("/path/to/file.csv", array("to_object" => true));
print_r($obj_csv);
?>
// Output:
Array
(
[0] => stdClass Object
(
[CODE] => AD
[COUNTRY] => Andorra
)
[1] => stdClass Object
(
[CODE] => AE
[COUNTRY] => United Arab Emirates
)
[2] => stdClass Object
(
[CODE] => AF
[COUNTRY] => Afghanistan
)
[3] => stdClass Object
(
[CODE] => AG
[COUNTRY] => Antigua and Barbuda
)
[4] => stdClass Object
(
[CODE] =>
[COUNTRY] =>
)
)

// If you use character | (pipe) as delimiter in your csv file, use:
<?php
$arr_csv = parse_csv("/path/to/file.csv", array("delimiter"=>"|"));
?>

==NSD==
up
7
hpartidas at deuz dot net
14 years ago
I found myself wanting to parse a CSV and didn't have access to str_getcsv, so I wrote a substitute for PHP < 5.3. I hope it helps someone out there stuck in the same situation.

<?php
if (!function_exists('str_getcsv')) {
    function str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = '\\', $eol = '\n') {
        if (is_string($input) && !empty($input)) {
            $output = array();
            $tmp = preg_split("/".$eol."/",$input);
            if (is_array($tmp) && !empty($tmp)) {
                while (list($line_num, $line) = each($tmp)) {
                    if (preg_match("/".$escape.$enclosure."/",$line)) {
                        while ($strlen = strlen($line)) {
                            $pos_delimiter = strpos($line,$delimiter);
                            $pos_enclosure_start = strpos($line,$enclosure);
                            if (
                                is_int($pos_delimiter) && is_int($pos_enclosure_start)
                                && ($pos_enclosure_start < $pos_delimiter)
                            ) {
                                $enclosed_str = substr($line,1);
                                $pos_enclosure_end = strpos($enclosed_str,$enclosure);
                                $enclosed_str = substr($enclosed_str,0,$pos_enclosure_end);
                                $output[$line_num][] = $enclosed_str;
                                $offset = $pos_enclosure_end+3;
                            } else {
                                if (empty($pos_delimiter) && empty($pos_enclosure_start)) {
                                    $output[$line_num][] = substr($line,0);
                                    $offset = strlen($line);
                                } else {
                                    $output[$line_num][] = substr($line,0,$pos_delimiter);
                                    $offset = (
                                        !empty($pos_enclosure_start)
                                        && ($pos_enclosure_start < $pos_delimiter)
                                    )
                                        ? $pos_enclosure_start
                                        : $pos_delimiter+1;
                                }
                            }
                            $line = substr($line,$offset);
                        }
                    } else {
                        $line = preg_split("/".$delimiter."/",$line);

                        /*
                         * Validating against pesky extra line breaks creating false rows.
                         */
                        if (is_array($line) && !empty($line[0])) {
                            $output[$line_num] = $line;
                        }
                    }
                }
                return $output;
            } else {
                return false;
            }
        } else {
            return false;
        }
    }
}
?>
up
2
Xkang
8 years ago
How to solve the UTF-8 BOM problem in a UTF-8 encoded CSV file:
$bom = chr(0xEF) . chr(0xBB) . chr(0xBF); // define the BOM
$f = file_get_contents('a.csv'); // read the CSV file
#$csv = str_getcsv($f); // this way the BOM stays in the data
$csv = str_getcsv(str_replace($bom, '', $f)); // strip the BOM
var_dump($csv); // dump the result
up
1
Anonymous
15 years ago
For some reason o'connor's code only reads one line of a csv for me... I had to replace the line

$data = fgetcsv($fp, 1000, $delimiter, $enclosure); // $escape only got added in 5.3.0

with this:

$data = array();
while (!feof($fp))
{
$data[] = fgetcsv($fp, 0, $delimiter, $enclosure); // $escape only got added in 5.3.0
}

...to get all of the data out of my string (some post data pasted into a textbox and processed only with stripslashes).
up
1
khelibert at gmail dot com
12 years ago
I've written this to handle :
- fields with or without enclosure;
- escape and enclosure characters using the same character (ie <<">> in Excel)

<?php
/**
 * Converts a csv file into an array of lines and columns.
 * khelibert@gmail.com
 * @param $fileContent String
 * @param string $escape String
 * @param string $enclosure String
 * @param string $delimiter String
 * @return array
 */
function csvToArray($fileContent, $escape = '\\', $enclosure = '"', $delimiter = ';')
{
    $lines = array();
    $fields = array();

    if ($escape == $enclosure)
    {
        $escape = '\\';
        $fileContent = str_replace(array('\\',$enclosure.$enclosure,"\r\n","\r"),
            array('\\\\',$escape.$enclosure,"\\n","\\n"),$fileContent);
    }
    else
        $fileContent = str_replace(array("\r\n","\r"),array("\\n","\\n"),$fileContent);

    $nb = strlen($fileContent);
    $field = '';
    $inEnclosure = false;
    $previous = '';

    for ($i = 0; $i < $nb; $i++)
    {
        $c = $fileContent[$i];
        if ($c === $enclosure)
        {
            if ($previous !== $escape)
                $inEnclosure ^= true;
            else
                $field .= $enclosure;
        }
        else if ($c === $escape)
        {
            // guard against reading past the end of the string
            $next = ($i + 1 < $nb) ? $fileContent[$i+1] : '';
            if ($next != $enclosure && $next != $escape)
                $field .= $escape;
        }
        else if ($c === $delimiter)
        {
            if ($inEnclosure)
                $field .= $delimiter;
            else
            {
                //end of the field
                $fields[] = $field;
                $field = '';
            }
        }
        else if ($c === "\n")
        {
            $fields[] = $field;
            $field = '';
            $lines[] = $fields;
            $fields = array();
        }
        else
            $field .= $c;
        $previous = $c;
    }
    //we add the last element
    if (true || $field !== '')
    {
        $fields[] = $field;
        $lines[] = $fields;
    }
    return $lines;
}
?>
up
1
peter dot mlich at volny dot cz
9 years ago
> 49 durik at 3ilab dot net / 4 years ago
$rows = str_getcsv($csv_data, "\n");
- bug, data in csv can have "\n"
'aaa','bb
b','ccc'
up
0
nicolasbonnici at gmail dot com
1 month ago
Better way to get an associative key value array output.

$output = [];
$lines = array_values(explode(PHP_EOL, $your_csv_string));
$headers = str_getcsv(array_shift($lines));
foreach ($lines as $line) {
$parsedLine = str_getcsv($line);
if (count($headers) !== count($parsedLine)) {
continue;
}
$output[] = array_combine($headers, $parsedLine);
}
var_dump($output);
up
2
hans at loltek dot net
1 year ago
old MacOS (up to ~2001) and old Office For MacOS (up to 2007? I think) use carriage-return for newlines,
Microsoft Windows use carriage-return+line-feed for newlines,
Unix (Linux and modern MacOS) use line-feeds,
Some systems use a BOM (byte order mark) just to say they use UTF-8; I've even encountered one BOM per CSV row!

For a csv-file parser handling all the above cases, I wrote:

<?php
function parse_csv(string $csv, string $separator = ","): array
{
    $csv = strtr(
        $csv,
        [
            "\xEF\xBB\xBF" => "", // remove UTF-8 byte order marks, if present
            "\r\n" => "\n", // Windows CrLf=> Unix Lf
            "\r" => "\n" // old-MacOS Cr => Unix Lf
            // (both modern MacOS and Linux use Lf .. Windows is the only outlier)
        ]
    );
    $lines = explode("\n", $csv);
    $keys = str_getcsv(array_shift($lines), $separator);
    $ret = array();
    foreach ($lines as $lineno => $line) {
        if (strlen($line) < 1) {
            // ... probably malformed csv, but we'll allow it
            continue;
        }
        $parsed = str_getcsv($line, $separator);
        if (count($parsed) !== count($keys)) {
            throw new \RuntimeException("error on csv line #{$lineno}: count mismatch:" . count($parsed) . ' !== ' . count($keys) . ": " . var_export([
                'error' => 'count mismatch',
                'keys' => $keys,
                'parsed' => $parsed,
                'line' => $line
            ], true));
        }
        $ret[] = array_combine($keys, $parsed);
    }
    return $ret;
}
?>
up
1
manngo
1 year ago
I can’t see this mentioned in the description, but it appears that the fields will be trimmed slightly of trailing line breaks.

In the following example:

<?php
$string = "\nPHP\r\n,Java\nScript\r\n\r\n,Fortran\n,Cobol\n\n,\nSwift\r\n\r\n\r\n";
$data = str_getcsv($string);
foreach ($data as $d) print "[$d]";

/* Result:
================================================
[
PHP][Java
Script
][Fortran][Cobol
][
Swift
]
================================================ */
?>

You’ll see:

- a leading line break is retained; a line break in the rest of the field is also retained
- one trailing line break is removed; any more are retained
- a line break at the end of the string is also removed; this means that two trailing line breaks at the end are removed
- a line break can be a unix/macos line break (\n) or a windows line break (\r\n)

Tested on my Macintosh, so I’m not sure how universal this is.

Among other things, it means you can read the file with the file() function without having to include the FILE_IGNORE_NEW_LINES flag.
up
0
php at richardneill dot org
1 year ago
For maximum compatibility with standard (RFC 4180) CSV files, remember that the proprietary escape mechanism should be disabled, i.e. set the optional escape parameter (the 4th parameter of str_getcsv(), the 5th of fgetcsv()) to "" (the empty string).
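A small round-trip sketch of that advice, assuming PHP 7.4 or later (where an empty escape is accepted). The helper name csv_line() and the sample fields are made up for illustration.

```php
<?php

/**
 * Hypothetical helper: formats one row as a CSV line with the
 * proprietary escape mechanism disabled (escape = "").
 */
function csv_line(array $fields): string
{
    $stream = fopen('php://memory', 'r+');
    fputcsv($stream, $fields, ',', '"', '');
    rewind($stream);
    $line = stream_get_contents($stream);
    fclose($stream);

    // fputcsv() appends a line ending; drop it for single-line use.
    return rtrim($line, "\r\n");
}

$fields = ['plain', 'has "quotes"', 'has,comma', 'back\\slash'];
$line = csv_line($fields);

echo $line, PHP_EOL;
// Parsing with the same settings gives back the original fields.
var_dump(str_getcsv($line, ',', '"', '') === $fields);
```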
up
0
Wade Rossmann
2 years ago
For completeness, here is a userspace str_putcsv() that is fully compatible with fgetcsv() and fputcsv()'s arguments. Namely $escape and $eol, which all others seem to be omitting.

<?php

function str_putcsv(
    array $fields,
    string $separator = ",",
    string $enclosure = "\"",
    string $escape = "\\",
    string $eol = "\n"
) {
    return implode($separator,
        array_map(
            function ($a) use ($enclosure, $escape) {
                $type = gettype($a);
                switch ($type) {
                    case 'integer': return sprintf('%d', $a);
                    case 'double': return rtrim(sprintf('%0.'.ini_get('precision').'f', $a), '0');
                    case 'boolean': return ( $a ? 'true' : 'false' );
                    case 'NULL': return '';
                    case 'string':
                        return sprintf('"%s"', str_replace(
                            [$escape, $enclosure],
                            [$escape.$escape, $escape.$enclosure],
                            $a
                        ));
                    default: throw new TypeError("Cannot stringify type: $type");
                }
            },
            $fields
        )
    ) . $eol;
}
up
2
ivijan dot stefan at gmail dot com
3 years ago
Imagine a situation where you need a function that works with both URL and comma delimited text.

The function below does exactly that using str_getcsv(). Simply pass in a CSV URL or comma-separated text and it works nicely.

<?php
function parse_csv( $filename_or_text, $delimiter=',', $enclosure='"', $linebreak="\n" )
{
    $return = array();

    if (false !== ($csv = (filter_var($filename_or_text, FILTER_VALIDATE_URL) ? file_get_contents($filename_or_text) : $filename_or_text)))
    {
        $csv = trim($csv);
        $csv = mb_convert_encoding($csv, 'UTF-16LE');

        foreach (str_getcsv($csv, $linebreak, $enclosure) as $row) {
            $col = str_getcsv($row, $delimiter, $enclosure);
            $col = array_map('trim', $col);
            $return[] = $col;
        }
    }
    else
    {
        throw new \Exception('Can not open the file.');
        $return = false;
    }

    return $return;
}
?>
up
0
txxllm at hotmail dot com
3 years ago
Sometimes the enclosure parameter of the str_getcsv function doesn't work, so I wrote an equivalent function:

<?php
/**
 * @param string $input
 * @param string $delimiter
 * @param string $enclosure
 * @param string $escape
 * @return array
 * @author TXX
 * @date 2021/1/25 15:03
 */
function my_str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = '\\') {
    $output = array();

    if (empty($input) || !is_string($input)) {
        return $output;
    }

    if (preg_match("/". $escape . $enclosure ."/", $input)) {
        while ($strlen = strlen($input)) {
            $pos_delimiter = strpos($input, $delimiter); // position of the delimiter
            $pos_enclosure_start = strpos($input, $enclosure); // position of the opening enclosure

            // an enclosure exists and comes before the delimiter
            if (is_int($pos_delimiter) && is_int($pos_enclosure_start) && $pos_enclosure_start < $pos_delimiter) {
                $pos_enclosure_start += 1;
                $enclosed_str = substr($input, $pos_enclosure_start); // start of the enclosed string
                $pos_enclosure_end = strpos($enclosed_str, $enclosure); // position of the closing enclosure within the enclosed string
                $pos_enclosure_end += $pos_enclosure_start; // position of the closing enclosure in the original input

                if ($pos_enclosure_end < $pos_delimiter) {
                    // the closing enclosure is before the delimiter, no enclosure handling needed
                    $output[] = substr($input, 0, $pos_delimiter);
                    $offset = $pos_delimiter + 1;
                } else {
                    // the closing enclosure is after the delimiter, enclosure handling needed
                    $pos_enclosure_end += 1;
                    $before_enclosed_str = substr($input, 0, $pos_enclosure_end);
                    $enclosed_str = substr($input, $pos_enclosure_end); // the string after the enclosed string

                    $enclosed_arr = my_str_getcsv($enclosed_str, $delimiter, $enclosure); // recurse on the string after the enclosure
                    $enclosed_arr[0] = $before_enclosed_str . $enclosed_arr[0];

                    $output = array_merge($output, $enclosed_arr);
                    $offset = strlen($input); // move the cursor to the end
                }
            } else {
                // no enclosure
                if (!is_int($pos_delimiter)) {
                    // no delimiter; append the string to the output array directly
                    $output[] = $input;
                    // move the cursor to the end
                    $offset = strlen($input);
                } else if ($input == $delimiter) {
                    // if only the delimiter remains, store '', ''
                    $output = array_merge($output, ['','']);
                    $offset = $pos_delimiter+1; // move the cursor past the delimiter
                } else {
                    $output[] = substr($input, 0, $pos_delimiter); // the data before the delimiter
                    $offset = $pos_delimiter+1; // move the cursor past the delimiter
                }
            }
            // advance the string to the cursor position
            $input = substr($input,$offset);
        }
    } else {
        // no enclosure in the string; split directly by the delimiter
        $input = preg_split("/". $escape . $delimiter ."/", $input);

        if (is_array($input)) {
            $output = $input;
        }
    }

    return $output;
}

?>
up
0
Anonymous
4 years ago
Note that the function does NOT remove the escaping characters. If you do

<?php
str_getcsv('"abc\"abc"')
?>

you'll get an array with a string(8) "abc\"abc", the \ will stay.
up
0
pasmanik at gmail dot com
9 years ago
I prepared some better function for parsing CSV string.

function csv_to_array($string='', $row_delimiter=PHP_EOL, $delimiter = "," , $enclosure = '"' , $escape = "\\" )
{
$rows = array_filter(explode($row_delimiter, $string));
$header = NULL;
$data = array();

foreach($rows as $row)
{
$row = str_getcsv ($row, $delimiter, $enclosure , $escape);

if(!$header)
$header = $row;
else
$data[] = array_combine($header, $row);
}

return $data;
}
up
0
xoneca at gmail dot com
13 years ago
Note that this function can also be used to parse other types of constructions. For example, I have used it to parse .htaccess AddDescription lines:

AddDescription "My description to the file." filename.jpg

Those lines can be parsed like this:

<?php

$line = 'AddDescription "My description to the file." filename.jpg';

$parsed = str_getcsv(
    $line, # Input line
    ' ', # Delimiter
    '"', # Enclosure
    '\\' # Escape char
);

var_dump( $parsed );

?>

The output:

array(3) {
[0]=>
string(14) "AddDescription"
[1]=>
string(27) "My description to the file."
[2]=>
string(12) "filename.jpg"
}
up
0
dave_walter at NOSPAM dot yahoo dot com
15 years ago
Drawing inspiration from daniel dot oconnor at gmail dot com, here's an alternative str_putcsv() that leverages existing PHP core functionality (5.1.0+) to avoid re-inventing the wheel.

<?php
if (!function_exists('str_putcsv')) {
    function str_putcsv($input, $delimiter = ',', $enclosure = '"') {
        // Open a memory "file" for read/write...
        $fp = fopen('php://temp', 'r+');
        // ... write the $input array to the "file" using fputcsv()...
        fputcsv($fp, $input, $delimiter, $enclosure);
        // ... rewind the "file" so we can read what we just wrote...
        rewind($fp);
        // ... read the entire line into a variable...
        $data = fgets($fp);
        // ... close the "file"...
        fclose($fp);
        // ... and return the $data to the caller, with the trailing newline from fgets() removed.
        return rtrim( $data, "\n" );
    }
}
?>
up
0
william dot j dot weir at gmail dot com
16 years ago
If you're happy enough having just a multi-dimensional array, this should work fine. I had wanted to use the one provided by keananda but it was choking on pr($lines).

<?php
function f_parse_csv($file, $longest, $delimiter) {
    $mdarray = array();
    $file = fopen($file, "r");
    while ($line = fgetcsv($file, $longest, $delimiter)) {
        array_push($mdarray, $line);
    }
    fclose($file);
    return $mdarray;
}
?>

$longest is a number that represents the longest line in the csv file as required by fgetcsv(). The page for fgetcsv() said that the longest line could be set to 0 or left out, but I couldn't get it to work without. I just made it extra large when I had to use it.