Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 734 Vote(s) - 3.36 Average
  • 1
  • 2
  • 3
  • 4
  • 5
MySQL: Large VARCHAR vs. TEXT?

#1
I've got a messages table in MySQL which records messages between users. Apart from the typical ids and message types (all integer types) I need to save the actual message text as either VARCHAR or TEXT. I'm setting a front-end limit of 3000 characters which means the messages would never be inserted into the db as longer than this.

Is there a rationale for going with either VARCHAR(3000) or TEXT? There's something about just writing VARCHAR(3000) that feels somewhat counter-intuitive. I've been through other similar posts on Stack Overflow but would be good to get views specific to this type of common message storing.
Reply

#2
Disclaimer: I'm not a MySQL expert ... but this is my understanding of the issues.

I think TEXT is stored outside the mysql row, while I think VARCHAR is stored as part of the row. There is a maximum row length for mysql rows .. so you can limit how much other data you can store in a row by using the VARCHAR.

Also due to VARCHAR forming part of the row, I suspect that queries looking at that field will be slightly faster than those using a TEXT chunk.
Reply

#3
Just to clarify the best practice:

1. Text format messages should almost always be stored as TEXT (they end up being arbitrarily long)

2. String attributes should be stored as VARCHAR (the destination user name, the subject, etc...).

I understand that you've got a front end limit, which is great until it isn't. *grin* The trick is to think of the DB as separate from the applications that connect to it. Just because one application puts a limit on the data, doesn't mean that the data is intrinsically limited.

What is it about the messages themselves that forces them to never be more then 3000 characters? If it's just an arbitrary application constraint (say, for a text box or something), use a `TEXT` field at the data layer.
Reply

#4
The preceding answers don't insist enough on the main problem: even in very simple queries like

(SELECT t2.* FROM t1, t2 WHERE t2.id = t1.id ORDER BY t1.id)

a temporary table can be required, and if a `VARCHAR` field is involved, it is converted to a `CHAR` field in the temporary table. So if you have in your table say 500 000 lines with a `VARCHAR(65000)` field, this column alone will use _6.5*5*10^9_ byte. Such temp tables can't be handled in memory and are written to disk. The impact can be expected to be catastrophic.

Source (with metrics):

[To see links please register here]

(This refers to the handling of `TEXT` vs `VARCHAR` in "standard"(?) MyISAM storage engine. It may be different in others, e.g., InnoDB.)
Reply

#5
Varchar is for small data like email addresses, while Text is for much bigger data like news articles, Blob for binary data such as images.

The performance of Varchar is more powerful because it runs completely from memory, but this will not be the case if data is too big like `varchar(4000)` for example.

Text, on the other hand, does not stick to memory and is affected by disk performance, but you can avoid that by separating text data in a separate table and apply a left join query to retrieve text data.

Blob is much slower so use it only if you don't have much data like 10000 images which will cost 10000 records.

Follow these tips for maximum speed and performance:

1. Use varchar for name, titles, emails

2. Use Text for large data

3. Separate text in different tables

4. Use Left Join queries on an ID such as a phone number

5. If you are going to use Blob apply the same tips as in Text

This will make queries cost milliseconds on tables with data >10 M and size up to 10GB guaranteed.
Reply

#6
There is a **HUGE** difference between VARCHAR and TEXT. While VARCHAR fields can be indexed, TEXT fields cannot. VARCHAR type fields are stored inline while TEXT are stored offline, only pointers to TEXT data is actually stored in the records.

If you have to index your field for faster search, update or delete than go for VARCHAR, no matter how big. A VARCHAR(10000000) will never be the same as a TEXT field bacause these two data types are different in nature.

- If you use you field only for archiving
- you don't care about data
speed retrival
- you care about speed but you will use the operator
'%LIKE%' in your search query so indexing will not help much
- you
can't predict a limit of the data length

than go for TEXT.
Reply

#7
* `TEXT` and `BLOB` _may_ by stored off the table with the table just having a pointer to the location of the actual storage. Where it is stored depends on lots of things like data size, columns size, row_format, and MySQL version.

* `VARCHAR` is stored inline with the table. `VARCHAR` is faster when the size is reasonable, the tradeoff of which would be faster depends upon your data and your hardware, you'd want to benchmark a real-world scenario with your data.
Reply

#8
Can you predict how long the user input would be?

> ## [VARCHAR(X)][1]
>> **Max Length:** variable, up to 65,535 bytes (64KB)
>> **Case:** user name, email, country, subject, password
>
> ---
>
> ## [TEXT][2]
>> **Max Length:** 65,535 bytes (64KB)
>> **Case:** messages, emails, comments, formatted text, html, code, images, links
>
> ---
>
> ## [MEDIUMTEXT][2]
>> **Max Length:** 16,777,215 bytes (16MB)
>> **Case:** large json bodies, short to medium length books, csv strings
>
> ---
>
> ## [LONGTEXT][2]
>> **Max Length:** 4,294,967,29 bytes (4GB)
>> **Case:** textbooks, programs, years of logs files, harry potter and the goblet of fire, scientific research logging

There's more information on [this question][3].


[1]:

[To see links please register here]

[2]:

[To see links please register here]

[3]:

[To see links please register here]

Reply

#9
Just a correction to so many answers here, even if it is a bit late to the party.

Text fields can be fully indexed by MySQL as per their documentation.

Link provided

[To see links please register here]


Overall Varchar fields are longer to write to then Text fields, but it only matters if you have multitude of write requests
Reply

#10
**Short answer:** No practical, performance, or storage, difference.

**Long answer:**

There is essentially no difference (in MySQL) between `VARCHAR(3000)` (or any other large limit) and `TEXT`. The former will truncate at 3000 _characters_; the latter will truncate at 65535 _bytes_. (I make a distinction between _bytes_ and _characters_ because a character can take multiple bytes.)

For smaller limits in `VARCHAR`, there are some advantages over `TEXT`.

* "smaller" means 191, 255, 512, 767, or 3072, etc, depending on version, context, and `CHARACTER SET`.
* `INDEXes` are limited in how big a column can be indexed. (767 or 3072 _bytes_; this is version and settings dependent)
* Intermediate tables created by complex `SELECTs` are handled in two different ways -- MEMORY (faster) or MyISAM (slower). When 'large' columns are involved, the slower technique is automatically picked. (Significant changes coming in version 8.0; so this bullet item is subject to change.)
* Related to the previous item, all `TEXT` datatypes (as opposed to `VARCHAR`) jump straight to MyISAM. That is, `TINYTEXT` is automatically worse for generated temp tables than the equivalent `VARCHAR`. (But this takes the discussion in a third direction!)
* `VARBINARY` is like `VARCHAR`; `BLOB` is like `TEXT`.
* A table with several 'large' `VARCHARs` could hit a limit of 64KB for the whole table definition; switching to `TEXT` is a simple and practical fix. (Example:

[To see links please register here]

)

**Rebuttal to other answers**

The original question asked one thing (which datatype to use); the accepted answer answered something else (off-record storage). That answer is now out of date.

When this thread was started _and_ answered, there were only two "row formats" in InnoDB. Soon afterwards, two more formats (`DYNAMIC` and `COMPRESSED`) were introduced.

The storage location for `TEXT` and `VARCHAR()` is based on _size_, not on _name of datatype_. For an **updated** discussion of on/off-record storage of large text/blob columns, see [_this_](

[To see links please register here]

) .
Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through