str_word_count

(PHP 4 >= 4.3.0, PHP 5, PHP 7, PHP 8)

str_word_count - Returns information about the words used in a string

Description

function str_word_count(string $string, int $format = 0, ?string $characters = null): array|int

Counts the words in string. If the optional format argument is not specified, the return value is an integer giving the number of words found. If it is specified, the return value is an array whose content depends on the given format. The possible values of format and their results are described below.

For the purposes of this function, a 'word' is a string of locale-dependent alphabetic characters that may also contain, but not start with, the "'" and "-" characters. Note: multibyte locales are not supported.
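
As a rough illustration of the word definition above, the following sketch assumes the default "C" locale and plain ASCII input. The sample sentence is made up for this example. Hyphens and apostrophes inside a word are kept, while a digit splits a word in two.

```php
<?php
// Sample input invented for illustration; ASCII only, default "C" locale assumed.
$sample = "A well-known rock'n'roll b4nd";

// Format 1 returns the words themselves.
$words = str_word_count($sample, 1);
print_r($words);

// "well-known" and "rock'n'roll" stay whole; "b4nd" splits into "b" and "nd"
// because "4" is not a word character unless listed in the third argument.
echo count($words), PHP_EOL; // 5
```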

Parameters

string

The string whose words are to be examined.

format

Specifies what this function returns. The supported values are:

0 - returns the number of words found.
1 - returns an array containing all the words found inside string.
2 - returns an associative array whose keys are the positions of the words inside string and whose values are the words themselves.

characters

A list of additional characters that will be considered part of a word.

Return Values

Returns an integer or an array, depending on the format chosen.
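
The keys returned with format 2 are byte offsets into the original string, so they can be passed back to substr(). A minimal sketch, using a made-up input string:

```php
<?php
// Sample input invented for illustration.
$text = "one two  three";

// Format 2: keys are the byte offsets at which each word starts.
$positions = str_word_count($text, 2);

foreach ($positions as $offset => $word) {
    // Each key points back at the word inside the original string.
    $slice = substr($text, $offset, strlen($word));
    echo $offset, ' => ', $word, ($slice === $word ? ' (matches)' : ' (differs)'), PHP_EOL;
}
```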

Changelog

Version	Description
8.0.0	`characters` can now be `null`.

Examples

Example 1 - str_word_count() example

<?php

$str = "Hello fri3nd, you're
       looking          good today!";

print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));

echo str_word_count($str);

?>

The above example will output:

Array
(
    [0] => Hello
    [1] => fri
    [2] => nd
    [3] => you're
    [4] => looking
    [5] => good
    [6] => today
)

Array
(
    [0] => Hello
    [6] => fri
    [10] => nd
    [14] => you're
    [29] => looking
    [46] => good
    [51] => today
)

Array
(
    [0] => Hello
    [1] => fri3nd
    [2] => you're
    [3] => looking
    [4] => good
    [5] => today
)

7

See Also

explode() - Splits a string by a separator and returns the parts as an array
preg_split() - Splits a string by a regular expression
count_chars() - Returns information about the characters used in a string
substr_count() - Counts the number of occurrences of a substring in a string

User Contributed Notes 11 notes

cito at wikatu dot com

14 years ago

<?php

/***
 * This simple utf-8 word count function (it only counts)
 * is a bit faster than the one with preg_match_all and
 * about 10x slower than the built-in str_word_count
 * 
 * If you need the hyphen or other code points as word-characters
 * just put them into the [brackets] like [^\p{L}\p{N}\'\-]
 * If the pattern contains utf-8, utf8_encode() the pattern,
 * as it is expected to be valid utf-8 (using the u modifier).
 **/

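// Jonny 5's simple word splitter
// Note: separators at the start or end of $str produce empty strings,
// which are counted as well.
function str_word_count_utf8($str) {
  return count(preg_split('~[^\p{L}\p{N}\']+~u',$str));
}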
?>

splogamurugan at gmail dot com

17 years ago

We can also specify a range of values for charlist.

<?php
$str = "Hello fri3nd, you're
       looking          good today! 
       look1234ing";
print_r(str_word_count($str, 1, '0..3'));
?>

will give the result as 

Array ( [0] => Hello [1] => fri3nd [2] => you're [3] => looking [4] => good [5] => today [6] => look123 [7] => ing )

Adeel Khan

18 years ago

<?php

/**
 * Returns the number of words in a string.
 * As far as I have tested, it is very accurate.
 * The string can have HTML in it,
 * but you should do something like this first:
 *
 *    $search = array(
 *      '@<script[^>]*?>.*?</script>@si',
 *      '@<style[^>]*?>.*?</style>@siU',
 *      '@<![\s\S]*?--[ \t\n\r]*>@'
 *    );
 *    $html = preg_replace($search, '', $html);
 *
 */

function word_count($html) {

  # strip all html tags
  $wc = strip_tags($html);

  # remove 'words' that don't consist of alphanumerical characters or punctuation
  $pattern = "#[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]+#";
  $wc = trim(preg_replace($pattern, " ", $wc));

  # remove one-letter 'words' that consist only of punctuation
  $wc = trim(preg_replace("#\s*[(\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]\s*#", " ", $wc));

  # remove superfluous whitespace
  $wc = preg_replace("/\s\s+/", " ", $wc);

  # split string into an array of words
  $wc = explode(" ", $wc);

  # remove empty elements
  $wc = array_filter($wc);

  # return the number of words
  return count($wc);

}

?>

manrash at gmail dot com

17 years ago

For Spanish speakers, a valid character map may be:

<?php
$characterMap = 'áéíóúüñ';

// $text holds the string whose words are to be counted
$count = str_word_count($text, 0, $characterMap);
?>

uri at speedy dot net

13 years ago

Here is a count words function which supports UTF-8 and Hebrew. I tried other functions but they don't work. Notice that in Hebrew, '"' and '\'' can be used in words, so they are not separators. This function is not perfect, I would prefer a function we are using in JavaScript which considers all characters except [a-zA-Zא-ת0-9_\'\"] as separators, but I don't know how to do it in PHP.

I removed some of the separators which don't work well with Hebrew ("\x20", "\xA0", "\x0A", "\x0D", "\x09", "\x0B", "\x2E"). I also removed the underline.

This is a fix to my previous post on this page - I found out that my function returned an incorrect result for an empty string. I corrected it and I'm also attaching another function - my_strlen.

<?php 

function count_words($string) {
    // Return the number of words in a string.
    $string= str_replace("&#039;", "'", $string);
    $t= array(' ', "\t", '=', '+', '-', '*', '/', '\\', ',', '.', ';', ':', '[', ']', '{', '}', '(', ')', '<', '>', '&', '%', '$', '@', '#', '^', '!', '?', '~'); // separators
    $string= str_replace($t, " ", $string);
    $string= trim(preg_replace("/\s+/", " ", $string));
    $num= 0;
    if (my_strlen($string)>0) {
        $word_array= explode(" ", $string);
        $num= count($word_array);
    }
    return $num;
}

function my_strlen($s) {
    // Return mb_strlen with encoding UTF-8.
    return mb_strlen($s, "UTF-8");
}

?>
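
The whitelist approach described in this note (treat everything outside a fixed set of characters as a separator) can be sketched with preg_split() and the u modifier. The function name count_words_whitelist is hypothetical, the character set is the one quoted above, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper, not part of PHP: every character outside the
// whitelist (Latin letters, Hebrew alef..tav, digits, _, ' and ") is a separator.
function count_words_whitelist(string $text): int
{
    // \x{05D0}-\x{05EA} is the Hebrew letter range alef..tav.
    $pattern = '/[^a-zA-Z\x{05D0}-\x{05EA}0-9_\'"]+/u';
    $words = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure, e.g. for invalid UTF-8 input.
    return $words === false ? 0 : count($words);
}

echo count_words_whitelist("Hello, world! it's fine"), PHP_EOL; // 4
echo count_words_whitelist(''), PHP_EOL;                         // 0
```

PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or trailing separators would otherwise produce, so an empty string counts as zero words.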

brettNOSPAM at olwm dot NO_SPAM dot com

23 years ago

This example may not be pretty, but it proves accurate:

<?php
//count words
$words_to_count = strip_tags($body);
$pattern = "/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$words_to_count = preg_replace ($pattern, " ", $words_to_count);
$words_to_count = trim($words_to_count);
$total_words = count(explode(" ",$words_to_count));
?>

Hope I didn't miss any punctuation. ;-)

php dot net at salagir dot com

8 years ago

This function doesn't handle accents, even in a locale with accents.
<?php
echo str_word_count("Is working"); // =2

setlocale(LC_ALL, 'fr_FR.utf8');
echo str_word_count("Not wôrking"); // expects 2, got 3.
?>

Cito's solution treats punctuation as words and thus isn't a good workaround.
<?php
function str_word_count_utf8($str) {
      return count(preg_split('~[^\p{L}\p{N}\']+~u',$str));
}
echo str_word_count_utf8("Is wôrking"); //=2
echo str_word_count_utf8("Not wôrking."); //=3
?>

My solution:
<?php
function str_word_count_utf8($str) {
    $a = preg_split('/\W+/u', $str, -1, PREG_SPLIT_NO_EMPTY);
    return count($a);
}
echo str_word_count_utf8("Is wôrking"); // = 2
echo str_word_count_utf8("Is wôrking! :)"); // = 2
?>

dmVuY2lAc3RyYWhvdG5pLmNvbQ== (base64)

15 years ago

To count words after converting an MS Word document to plain text with antiword, you can use this function:

<?php
function count_words($text) {
    $text = str_replace(str_split('|'), '', $text); // remove these chars (you can specify more)
    $text = preg_replace('/-{2,}/', '', $text); // remove 2 or more dashes in a row
    $text = trim(preg_replace('/\s+/', ' ', $text)); // collapse whitespace last, so removed dashes leave no double spaces
    $len = strlen($text);
    
    if (0 === $len) {
        return 0;
    }
    
    $words = 1;
    
    while ($len--) {
        if (' ' === $text[$len]) {
            ++$words;
        }
    }
    
    return $words;
}
?>

It strips the pipe "|" characters, which antiword uses to format tables in its plain-text output, removes runs of two or more dashes (also used in tables), then counts the words.

Counting words with explode() and then count() is not a good idea for huge texts, because it uses a lot of memory to store the text a second time as an array. This is why I'm using while() { .. } to walk the string.
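
The memory concern above can also be addressed with strtok(), which walks the string one token at a time instead of building an array. This is only a sketch: the function name count_words_strtok is hypothetical, and it treats just spaces, tabs and line breaks as separators, so it leaves out the antiword-specific cleanup.

```php
<?php
// Hypothetical helper: counts whitespace-separated tokens without
// materialising them in an array.
function count_words_strtok(string $text): int
{
    $delimiters = " \t\r\n";
    $count = 0;

    // strtok() skips runs of delimiters and returns false when no tokens remain.
    for ($token = strtok($text, $delimiters); $token !== false; $token = strtok($delimiters)) {
        $count++;
    }

    return $count;
}

echo count_words_strtok("one  two\tthree\nfour"), PHP_EOL; // 4
```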

brettz9 - see yahoo

16 years ago

Words also cannot end in a hyphen unless allowed by the charlist...
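
A small sketch of this behaviour, using a made-up string that ends in a hyphen: by default the trailing hyphen is dropped, and listing "-" in the third argument keeps it.

```php
<?php
// Made-up input that ends in a hyphen.
$input = "well-";

$default = str_word_count($input, 1);      // trailing "-" is not part of the word
$allowed = str_word_count($input, 1, '-'); // "-" explicitly allowed as a word character

print_r($default);
print_r($allowed);
```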

charliefrancis at gmail dot com

17 years ago

Hi, this is the first time I have posted on the PHP manual; I hope some of you will like this little function I wrote.

It returns a string with a certain character limit, but still retaining whole words.
It breaks out of the foreach loop once it has found a string short enough to display, and the character list can be edited.

<?php
function word_limiter( $text, $limit = 30, $chars = '0123456789' ) {
    if( strlen( $text ) > $limit ) {
        $words = str_word_count( $text, 2, $chars );
        $words = array_reverse( $words, TRUE );
        foreach( $words as $length => $word ) {
            if( $length + strlen( $word ) >= $limit ) {
                array_shift( $words );
            } else {
                break;
            }
        }
        $words = array_reverse( $words );
        $text = implode( " ", $words ) . '&hellip;';
    }
    return $text;
}

$str = "Hello this is a list of words that is too long";
echo '1: ' . word_limiter( $str ) . PHP_EOL;
$str = "Hello this is a list of words";
echo '2: ' . word_limiter( $str ) . PHP_EOL;
?>

1: Hello this is a list of words&hellip;
2: Hello this is a list of words

MadCoder

20 years ago

Here's a function that will trim a $string down to a certain number of words and add a "..." on the end of it.
(expansion of muz1's first-100-words code)

----------------------------------------------
<?php
function trim_text($text, $count) {
    // Collapse runs of whitespace so explode() yields no empty entries
    $text = trim(preg_replace('/\s+/', ' ', $text));
    $words = explode(' ', $text);
    // Nothing to trim if the text already fits within $count words
    if (count($words) <= $count) {
        return $text;
    }
    // Keep the first $count words and mark the cut with an ellipsis
    return implode(' ', array_slice($words, 0, $count)) . '...';
}
?>

Usage
------------------------------------------------
<?php
$string = "one two three four";
echo trim_text($string, 3);
?>

returns:
one two three...
