- I want to perform a search on a `LONGBLOB` field containing Arabic text data. For example, how do you search for "هذه «الأولويات الدواوينية» ف"? The table field has values like `3313537353B2623313630363B2623313631303B202623313630343B2623313537353B2623313630363B202623313539303B2623313538313B2623313537353B2623313631303B2623313537353B2623313630373B2623313630353B2026`; however, if I retrieve the Arabic text value and display it on a web page, it shows proper Arabic characters.
- If I change the data type of the field from `LONGBLOB` to `LONGTEXT`, will it affect the Arabic content text I have stored? I have almost 1500 records in that table.
It's important to understand the difference between a character and its encoding. The character `ن`, for example, would be stored as very different bytes depending on its encoding: it would be represented by the single byte `0xcc` if encoded with the IBM1097 codepage, but by the four-byte sequence `0xfefffee5` if encoded with UTF-16. Worse, sometimes the same character can be represented in multiple ways within the same encoding.

Unless MySQL knows which encoding was used, it won't be able to perform textual comparisons of the sort you need. It can perform binary comparisons to search for identical byte sequences, but these won't apply your desired collation, i.e. the rules for how strings are compared, such as case insensitivity or different byte sequences representing the same characters.
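The character/encoding distinction can be checked from within MySQL itself. The following is only a minimal sketch, assuming a server that provides the `utf8mb4` and `cp1256` character sets; the user variable `@noon` is a hypothetical name introduced for illustration.

```sql
-- Build the character ن (ARABIC LETTER NOON, U+0646) from its UTF-8 bytes,
-- so the example does not depend on the connection character set.
SET @noon = _utf8mb4 X'D986';

-- HEX() shows the bytes that would actually be stored for the same
-- character under two different encodings.
SELECT
    HEX(@noon)                       AS utf8mb4_bytes,  -- two-byte UTF-8 form
    HEX(CONVERT(@noon USING cp1256)) AS cp1256_bytes;   -- single-byte Windows Arabic code page
```

A binary column holds only such bytes, with nothing to say which of these encodings produced them.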
Therefore you must either provide the encoding information to MySQL when you perform your search, or have MySQL keep track of it from the moment it first receives the data (i.e. by storing the data in a string-type column rather than a binary-type one).
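The first option, supplying the encoding at search time, might look like the sketch below. It assumes the bytes in the binary column really are UTF-8 text, and it reuses the `mytable` and `textcolumn` names from this answer for a column that is still a `LONGBLOB`.

```sql
-- Tell MySQL how to interpret the stored bytes, then compare as text.
-- Assumption: the LONGBLOB column contains UTF-8 encoded text.
SELECT *
FROM mytable
WHERE CONVERT(textcolumn USING utf8mb4) LIKE '%الأولويات%';
```

Because the conversion is applied to every row, such a query cannot use an ordinary index on the column. With around 1500 records that is unlikely to matter.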
It is much more usual (and indeed I strongly advise you) to store text data in string-type columns. `LONGTEXT` is one possibility, but might be overkill for your needs: it can store up to 4GiB of data! Perhaps `TEXT` or `VARCHAR` (which can both hold up to 64KiB) or `MEDIUMTEXT` (up to 16MiB) would be more appropriate?

Once the data is understood to be character data, MySQL can simply search for text using its String Comparison Functions or Regular Expressions, for example with a `LIKE` pattern match.
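As a minimal sketch of such a search, assuming the column has already been converted to a string type with a UTF-8 character set, and using the `mytable` and `textcolumn` names from this answer:

```sql
-- Substring search using the column's own collation.
SELECT *
FROM mytable
WHERE textcolumn LIKE '%هذه «الأولويات الدواوينية» ف%';

-- The same search with an explicit collation.
-- Assumption: the column's character set is utf8mb4.
SELECT *
FROM mytable
WHERE textcolumn COLLATE utf8mb4_unicode_ci LIKE '%الأولويات%';
```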
Such a query searches `mytable` for any record whose `textcolumn` field contains (according to its collation) the specified string anywhere within it.

You must first establish which encoding was used to store your existing data in the `LONGBLOB` column (it will be whatever encoding the originating client used when it inserted or updated the data).

You can then convert it to a string-type column without problem, although if the encoding differs between records you will have to manage the conversion of each record on a case-by-case basis (you would face the same issue when retrieving the current data anyway). For example, if the data is encoded using UTF-8, you can convert the column to `TEXT` by redefining it with `ALTER TABLE` and declaring UTF-8 as its character set.

Note that you must ensure your connection character set is correctly configured for your client, so that any necessary conversions occur when sending and retrieving string data.
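The conversion and connection setup described above could be sketched as follows. This assumes every record is UTF-8 encoded and that `utf8mb4` is available on your server. `mytable_backup` is a hypothetical table name, and the backup and length check are precautions added here, not part of the conversion itself.

```sql
-- 1. Keep a copy of the data before changing the column type.
CREATE TABLE mytable_backup AS SELECT * FROM mytable;

-- 2. Check that the longest value fits the target type
--    (TEXT holds up to 65,535 bytes; use MEDIUMTEXT or LONGTEXT otherwise).
SELECT MAX(LENGTH(textcolumn)) AS longest_value_in_bytes FROM mytable;

-- 3. Redefine the binary column as a string column; the stored bytes are
--    kept and from now on interpreted as UTF-8.
--    MODIFY restates the whole column definition, so repeat any
--    NOT NULL / DEFAULT attributes the column already has.
ALTER TABLE mytable MODIFY textcolumn TEXT CHARACTER SET utf8mb4;

-- 4. Make the client connection use the same character set.
SET NAMES utf8mb4;
```

Many client libraries offer their own way to set the connection character set, which is generally preferable to issuing `SET NAMES` by hand.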