
How to remove non-UTF-8 characters in PHP

2021-06-12  Category: Programming Knowledge

To remove non-UTF-8 characters in PHP: first create a PHP sample file; then use a regular expression with "preg_replace($regex, '$1', $text);" to strip the non-UTF-8 characters.


Environment used in this article: Windows 7, PHP 7.1, DELL G3 computer

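The problem:

How do I remove non-UTF-8 characters in PHP?

PHP: removing non-UTF-8 characters from a string

I am having trouble removing non-UTF-8 characters from a string; these characters are not displayed correctly. They look like this: 0x97 0x61 0x6C 0x6F (hexadecimal representation).

What is the best way to remove them? A regular expression, or something else?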

Solution:

Using a regular expression:

$regex = <<<'END'
/
  (
    (?: [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
    |   [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
    |   [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
    |   [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3
    ){1,100}                        # ...one or more times
  )
| .                                 # anything else
/x
END;
// preg_replace returns the cleaned string; assign it to keep the result.
$text = preg_replace($regex, '$1', $text);

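The pattern searches for UTF-8 sequences and captures them in group 1. It also matches single bytes that cannot be recognized as part of a UTF-8 sequence, but it does not capture them. The replacement is whatever was captured in group 1, so every invalid byte is effectively removed.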
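As a quick check of the behavior just described, here is a minimal sketch that wraps the same pattern in a helper and runs it on the bytes from the question. The function name strip_invalid_utf8 is an assumption made for this example; it is not a PHP built-in. The pattern only checks the shape of each sequence (a lead byte followed by the right number of continuation bytes), so it should be read as a byte-level filter rather than a strict validator.

```php
<?php
// Hypothetical helper (name chosen for this example only).
function strip_invalid_utf8(string $text): string
{
    // Group 1 captures runs of well-formed sequences; the lone "."
    // consumes any other byte without capturing it.
    $regex = <<<'END'
/
  (
    (?: [\x00-\x7F]
    |   [\xC0-\xDF][\x80-\xBF]
    |   [\xE0-\xEF][\x80-\xBF]{2}
    |   [\xF0-\xF7][\x80-\xBF]{3}
    ){1,100}
  )
| .
/x
END;
    // Replacing each match with group 1 drops the uncaptured bytes.
    return preg_replace($regex, '$1', $text);
}

$dirty = "\x97\x61\x6C\x6F";         // the bytes from the question
$clean = strip_invalid_utf8($dirty);
echo bin2hex($clean), "\n";          // 616c6f ("alo")
```

The stray 0x97 is a continuation byte with no lead byte in front of it, so only the "." branch can match it and it is dropped, while "alo" passes through group 1 unchanged.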

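Alternatively, the string can be repaired by encoding the invalid bytes as UTF-8 characters. If the errors are random, however, this may leave some strange symbols behind.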

$regex = <<<'END'
/
  (
    (?: [\x00-\x7F]               # single-byte sequences   0xxxxxxx
    |   [\xC0-\xDF][\x80-\xBF]    # double-byte sequences   110xxxxx 10xxxxxx
    |   [\xE0-\xEF][\x80-\xBF]{2} # triple-byte sequences   1110xxxx 10xxxxxx * 2
    |   [\xF0-\xF7][\x80-\xBF]{3} # quadruple-byte sequence 11110xxx 10xxxxxx * 3
    ){1,100}                      # ...one or more times
  )
| ( [\x80-\xBF] )                 # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )                 # invalid byte in range 11000000 - 11111111
/x
END;
function utf8replacer($captures) {
  if ($captures[1] != "") {
    // Valid byte sequence. Return unmodified.
    return $captures[1];
  }
  elseif ($captures[2] != "") {
    // Invalid byte of the form 10xxxxxx.
    // Encode as 11000010 10xxxxxx.
    return "\xC2".$captures[2];
  }
  else {
    // Invalid byte of the form 11xxxxxx.
    // Encode as 11000011 10xxxxxx.
    return "\xC3".chr(ord($captures[3])-64);
  }
}
// preg_replace_callback returns the repaired string; assign it to keep the result.
$text = preg_replace_callback($regex, "utf8replacer", $text);
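To see what the repair does in practice, the following sketch runs the same pattern and callback on a short input. The input "caf\xE9" is an assumption chosen for illustration: it is the word "café" as it would be stored in ISO-8859-1, where the lone byte 0xE9 is not valid UTF-8.

```php
<?php
$regex = <<<'END'
/
  (
    (?: [\x00-\x7F]
    |   [\xC0-\xDF][\x80-\xBF]
    |   [\xE0-\xEF][\x80-\xBF]{2}
    |   [\xF0-\xF7][\x80-\xBF]{3}
    ){1,100}
  )
| ( [\x80-\xBF] )
| ( [\xC0-\xFF] )
/x
END;

function utf8replacer($captures) {
  if ($captures[1] != "") {
    return $captures[1];                      // valid run, unchanged
  } elseif ($captures[2] != "") {
    return "\xC2" . $captures[2];             // stray 10xxxxxx byte
  } else {
    return "\xC3" . chr(ord($captures[3]) - 64); // stray 11xxxxxx byte
  }
}

// Example input (an assumption for this sketch): "café" in ISO-8859-1.
$fixed = preg_replace_callback($regex, "utf8replacer", "caf\xE9");
echo bin2hex($fixed), "\n"; // 636166c3a9, i.e. "café" in UTF-8
```

In effect each stray byte with value b becomes the code point U+00b, which is the same result as reading that byte as ISO-8859-1. That works well when the bad bytes really are Latin-1 text and explains the "strange symbols" when they are not.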

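Edit:

!empty(x) matches non-empty values ("0" is considered empty).
x != "" matches non-empty values, including "0".
x !== "" matches anything other than "".

In this case, x != "" seems to be the best choice.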
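The difference between the three checks can be seen with a small sketch. The helper name compare_checks and the sample values are assumptions for this example; null stands in for a value that is not a string at all.

```php
<?php
// Hypothetical helper: returns [!empty(x), x != "", x !== ""] for a value.
function compare_checks($x): array
{
    return [!empty($x), $x != "", $x !== ""];
}

$samples = ['""' => "", '"0"' => "0", '"a"' => "a", 'null' => null];

foreach ($samples as $label => $x) {
    list($notEmpty, $loose, $strict) = compare_checks($x);
    printf(
        "%-5s !empty: %-5s  != \"\": %-5s  !== \"\": %s\n",
        $label,
        var_export($notEmpty, true),
        var_export($loose, true),
        var_export($strict, true)
    );
}
```

Only the loose comparison keeps a captured "0" while still treating both the empty string and null as "nothing matched", which is consistent with the choice made above.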

I also sped up the matching. Instead of matching each character individually, the pattern now matches runs of valid UTF-8 characters.
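A rough way to see the effect of matching runs is to count how many matches each variant produces on the same input, since every match costs one replacement (or one callback call). This is only a sketch: the variable names and the 250-character test string are assumptions, and it counts matches rather than measuring time.

```php
<?php
// One character per match (no quantifier around the alternatives).
$perChar = '/[\x00-\x7F]|[\xC0-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF7][\x80-\xBF]{3}/';

// Up to 100 characters per match, as in the {1,100} group of the article's pattern.
$perRun = '/(?:[\x00-\x7F]|[\xC0-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF7][\x80-\xBF]{3}){1,100}/';

$text = str_repeat("a", 250);

// preg_match_all returns the number of full pattern matches.
echo preg_match_all($perChar, $text), "\n"; // 250
echo preg_match_all($perRun, $text), "\n";  // 3 (runs of 100, 100 and 50)
```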
