PYTHON mb_detect_encoding

Python replacement for PHP's mb_detect_encoding
This is from http://bytes.com/topic/python/answers/431749-detect-character-encoding#post1635105

def mb_detect_encoding(text, encoding_list=('ascii',)):
    '''Return first matched encoding in encoding_list, otherwise return None.
    See http://docs.python.org/2/howto/unicode.html#the-unicode-type for more info.
    See http://docs.python.org/2/library/codecs.html#standard-encodings for encodings.'''
    for enc in encoding_list:
        try:
            # Python 2: decode the byte string; raises if it is not valid in enc
            unicode(text, enc)
        except (UnicodeDecodeError, LookupError):
            # Not valid in this encoding, or unknown encoding name: try the next
            continue
        return enc
    return None
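
The function above relies on the Python 2 built-in unicode(), which does not exist in Python 3. A minimal sketch of the same idea for Python 3 follows, assuming the input is a bytes object; the name mb_detect_encoding_py3 is hypothetical and not part of the original post.

```python
def mb_detect_encoding_py3(data, encoding_list=('ascii',)):
    """Return the first encoding in encoding_list that decodes data, else None.

    data is assumed to be a bytes object (hypothetical Python 3 variant).
    """
    for enc in encoding_list:
        try:
            # decode() raises UnicodeDecodeError if the bytes are not valid
            # in enc, and LookupError if Python does not know the codec name.
            data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            continue
        return enc
    return None


print(mb_detect_encoding_py3(b'hello'))                                    # ascii
print(mb_detect_encoding_py3('h\u00e9llo'.encode('utf-8'), ['ascii', 'utf-8']))  # utf-8
print(mb_detect_encoding_py3(b'\xff\xfe', ['ascii', 'utf-8']))             # None
```

Order matters with this approach: the first codec that decodes without error wins, so a permissive codec such as latin-1, which accepts any byte sequence, is best placed last in the list.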

PHP mb_detect_encoding

PHP original manual for mb_detect_encoding

mb_detect_encoding

(PHP 4 >= 4.0.6, PHP 5)

mb_detect_encoding - Detect character encoding

Description

string mb_detect_encoding ( string $str [, mixed $encoding_list = mb_detect_order() [, bool $strict = false ]] )

Detects the character encoding of the string str.

Parameters

str

The string being detected.

encoding_list

encoding_list is a list of character encodings to try, in order. It may be given either as an array or as a comma-separated string.

If encoding_list is omitted, detect_order is used.
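
The Python replacement at the top of this page only accepts a list. One possible way to accept both forms described here is sketched below; normalize_encoding_list and known_to_python are hypothetical helper names, and the check with codecs.lookup is included on the assumption that some PHP encoding names may not be valid Python codec names.

```python
import codecs


def normalize_encoding_list(encoding_list):
    """Accept a list/tuple or a comma-separated string; return a list of names."""
    if isinstance(encoding_list, str):
        encoding_list = encoding_list.split(',')
    # Strip surrounding whitespace and drop empty entries such as "a,,b".
    return [name.strip() for name in encoding_list if name.strip()]


def known_to_python(name):
    """Return True if Python's codec registry recognises the encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


names = normalize_encoding_list("ASCII, UTF-8, no-such-encoding")
print(names)
print([n for n in names if known_to_python(n)])
```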

strict

strict specifies whether to use strict encoding detection. The default is FALSE.

Return Values

The detected character encoding, or FALSE if the encoding cannot be detected from the given string.

Examples

Example #1 mb_detect_encoding() example

<?php
/* Detect character encoding with current detect_order */
echo mb_detect_encoding($str);

/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
echo mb_detect_encoding($str, "auto");

/* Specify encoding_list character encoding by comma separated list */
echo mb_detect_encoding($str, "JIS, eucjp-win, sjis-win");

/* Use array to specify encoding_list  */
$ary[] = "ASCII";
$ary[] = "JIS";
$ary[] = "EUC-JP";
echo mb_detect_encoding($str, $ary);
?>

See Also