Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 595 Vote(s) - 3.45 Average
  • 1
  • 2
  • 3
  • 4
  • 5
TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT maximum storage sizes

#1
Per [the MySQL docs][1], there are four TEXT types:

1. TINYTEXT
2. TEXT
3. MEDIUMTEXT
4. LONGTEXT

What is the maximum length that I can store in a column of each data type assuming the character encoding is UTF-8?


[1]:

[To see links please register here]

Reply

#2
Rising to @Ankan-Zerob's challenge, this is my estimate of the maximum length which can be stored in each text type **measured in words**:

Type | Bytes | English words | Multi-byte words
-----------+---------------+---------------+-----------------
TINYTEXT | 255 | ±44 | ±23
TEXT | 65,535 | ±11,000 | ±5,900
MEDIUMTEXT | 16,777,215 | ±2,800,000 | ±1,500,000
LONGTEXT | 4,294,967,295 | ±740,000,000 | ±380,000,000

In **English**, 4.8 letters per word is probably a good average (eg [norvig.com/mayzner.html](

[To see links please register here]

)), though word lengths will vary according to domain (e.g. spoken language vs. academic papers), so there's no point being too precise. English is mostly single-byte ASCII characters, with very occasional multi-byte characters, so close to one-byte-per-letter. An extra character has to be allowed for inter-word spaces, so I've rounded down from 5.8 bytes per word. Languages with lots of accents such as say Polish would store slightly fewer words, as would e.g. German with longer words.

Languages requiring **multi-byte** characters such as Greek, Arabic, Hebrew, Hindi, Thai, etc, etc typically require two bytes per character in UTF-8. Guessing wildly at 5 letters per word, I've rounded down from 11 bytes per word.

CJK scripts (Hanzi, Kanji, Hiragana, Katakana, etc) I know nothing of; I believe characters mostly require 3 bytes in UTF-8, and (with massive simplification) they might be considered to use around 2 characters per word, so they would be somewhere between the other two. (CJK scripts are likely to require less storage using UTF-16, depending).

This is of course ignoring storage overheads etc.
Reply

#3
This is nice but doesn't answer the question:

"A VARCHAR should always be used instead of TINYTEXT." Tinytext is useful if you have wide rows - since the data is stored off the record. There is a performance overhead, but it does have a use.
Reply

#4
Expansion of the same answer

1. This [SO post](

[To see links please register here]

) outlines in detail the overheads and storage mechanisms.
2. As noted from point (1), A VARCHAR should always be used instead of TINYTEXT. However, when using VARCHAR, the max rowsize should not exceeed 65535 bytes.
3. As outlined here

[To see links please register here]

, max 3 bytes for utf-8.

THIS IS A ROUGH ESTIMATION TABLE FOR QUICK DECISIONS!<br />

1. So the worst case assumptions (3 bytes per utf-8 char) to best case (1 byte per utf-8 char)
2. Assuming the english language has an average of 4.5 letters per word
3. x is the number of bytes allocated

x-x

Type | A= worst case (x/3) | B = best case (x) | words estimate (A/4.5) - (B/4.5)
-----------+---------------------------------------------------------------------------
TINYTEXT | 85 | 255 | 18 - 56
TEXT | 21,845 | 65,535 | 4,854.44 - 14,563.33
MEDIUMTEXT | 5,592,415 | 16,777,215 | 1,242,758.8 - 3,728,270
LONGTEXT | 1,431,655,765 | 4,294,967,295 | 318,145,725.5 - 954,437,176.6

Please refer to Chris V's answer as well :

[To see links please register here]

Reply

#5
From the [documentation (MySQL 8)](

[To see links please register here]

) :

<pre>
Type | Maximum length
-----------+-------------------------------------
TINYTEXT | 255 (2<sup> 8</sup>−1) bytes
TEXT | 65,535 (2<sup>16</sup>−1) bytes = 64 KiB
MEDIUMTEXT | 16,777,215 (2<sup>24</sup>−1) bytes = 16 MiB
LONGTEXT | 4,294,967,295 (2<sup>32</sup>−1) bytes = 4 GiB
</pre>

Note that the number of *characters* that can be stored in your column will depend on the *character encoding*.
Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through