indexing twitter data into elasticsearch: Limit of total fields [1000] in index has been exceeded

#1
I have a system that indexes the Twitter Stream into Elasticsearch. It has been running for a few weeks now.

Lately an error has shown up that says: `Limit of total fields [1000] in index [dev_tweets] has been exceeded`.

I was wondering if anyone has encountered the same problem.

In addition, if I run this curl:

$ curl -s -XGET [To see links please register here] | grep type | wc -l
890

it should give me roughly the number of fields in the mapping. That is a lot of fields, but it isn't more than 1000.
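Rather than grepping for "type", a more reliable check is to walk the `properties` tree of the mapping and count field definitions recursively. A rough sketch in Python, assuming you have already fetched the mapping JSON (the sample excerpt below is hypothetical, not the real tweet mapping):

```python
def count_fields(properties):
    """Recursively count mapped fields, including object sub-fields
    ("properties") and multi-fields ("fields")."""
    total = 0
    for spec in properties.values():
        total += 1  # the field itself counts toward the limit
        for child_key in ("properties", "fields"):
            if child_key in spec:
                total += count_fields(spec[child_key])
    return total

# Hypothetical excerpt of a tweet mapping, for illustration only.
mapping = {
    "user": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
            "followers_count": {"type": "long"},
        }
    },
    "text": {"type": "text"},
}

# user, user.name, user.name.keyword, user.followers_count, text
print(count_fields(mapping))  # 5
```

Running this against the object under `mappings.<type>.properties` in the full `_mapping` output should tell you whether the index is genuinely near the 1000-field limit.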

#2
This limit was introduced in the following GitHub [issue][1].

The command `grep type | wc -l` counts the number of lines containing the text **"type"**, so there is a chance the count is inaccurate. I ran a small test and got a higher value than the actual number of fields. You could also get fewer than the actual number, although I can't think of a scenario for that yet.

Here's the test I did.

curl -s -XGET

[To see links please register here]

{
  "stackoverflow" : {
    "mappings" : {
      "os" : {
        "properties" : {
          "NAME" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "TITLE" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            },
            "fielddata" : true
          },
          "title" : {
            "type" : "text",
            "fielddata" : true
          }
        }
      }
    }
  }
}

Since **"type"** appears on five lines, the output is 5 even though there are only three fields.

Can you **try increasing the limit** and see if it works?

PUT my_index/_settings
{
  "index.mapping.total_fields.limit": 2000
}

You can also increase this limit during index creation.

PUT my_index
{
  "settings": {
    "index.mapping.total_fields.limit": 2000,
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    ...
  }
}

Credits:

[To see links please register here]


[1]: [To see links please register here]


#3
You can change this setting on your Elasticsearch domain by running the following command in Kibana or Postman. Just replace the Elasticsearch URL and the index name, and it should run without problems.

PUT /my_index/_settings HTTP/1.1
Host: search-test-prhtf12546bw2qdr6lfr2vq.us-east-1.es.amazonaws.com
Content-Type: application/json

{
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "100000"
      }
    }
  }
}

It will return the following response:

{
  "acknowledged": true
}

#4


If you're using Spring Data Elasticsearch, you can put the setting in a JSON file and point the `@Setting` annotation on your document class at it:

studentdoc_setting_index_mapping_type_overlayadjacency.json

{
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "100000"
      }
    }
  }
}

@Setting(settingPath = "studentdoc_setting_index_mapping_type_overlayadjacency.json")
public class StudentDoc {
}

#5
Defining too many fields in an index can lead to a mapping explosion, which can cause out-of-memory errors and situations that are difficult to recover from. As an example, consider a situation in which every new document inserted introduces new fields. This is quite common with dynamic mappings: every time a document contains new fields, those end up in the index's mappings. This isn't worrying for a small amount of data, but it can become a problem as the mapping grows.

If you have nested fields that can grow and are not under your application's control, try mapping the field as `flattened`. This data type can be useful for indexing objects with a large or unknown number of unique keys: only one field mapping is created for the whole JSON object, which helps prevent a mappings explosion caused by too many distinct field mappings.
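As a sketch, an index that maps an unbounded-key object as `flattened` could look like this (the index and field names here are hypothetical):

```
PUT dev_tweets_v2
{
  "mappings": {
    "properties": {
      "text": { "type": "text" },
      "user_metadata": { "type": "flattened" }
    }
  }
}
```

However many distinct keys later appear under `user_metadata`, it contributes a single entry to the mapping. The trade-off is that values inside a `flattened` field are indexed as keywords, so they only support basic keyword-style queries.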

Reference:

[To see links please register here]



