07-30-2023, 02:10 PM
**<h1>My understanding of auto_bootstrap</h1>**
Below is my understanding of the `auto_bootstrap` property. Please correct me if I am wrong at any point.
By default, the ‘`auto_bootstrap`’ property is not present in the `cassandra.yaml` file. This means that the default value, ‘true’, applies.
**true** - bootstrap (stream) the data to the respective node from all the other nodes while starting/restarting <br>
**false** - do not stream the data while starting/restarting
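To check which value actually applies on a node, something like the following might help. It is only a sketch: the function name `effective_auto_bootstrap` is made up for illustration, and it assumes that the property, if present, is an uncommented top-level line of the form `auto_bootstrap: <value>`.

```shell
#!/bin/sh
# Print the effective auto_bootstrap value for a given cassandra.yaml.
# Hypothetical helper: it only reads the file, it does not talk to Cassandra.
effective_auto_bootstrap() {
    conf_file="$1"
    # Extract the value from the last uncommented top-level occurrence, if any.
    value=$(sed -n 's/^auto_bootstrap:[[:space:]]*\([A-Za-z]*\).*$/\1/p' "$conf_file" | tail -n 1)
    # If the property is absent, fall back to the default described above (true).
    echo "${value:-true}"
}
```

For example, `effective_auto_bootstrap install_location/conf/cassandra.yaml` (assuming the file lives under `conf/` in the install location) should print `true` when the property is absent.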
**Where do we need ‘auto_bootstrap: true’**
1) When a new node needs to be added to an existing cluster, this needs to be set to **‘true’** so that the data is bootstrapped automatically from all the other nodes in the cluster. It will take a considerable amount of time (depending on the current load of the cluster) for the new node to join the cluster, but the load will be balanced across the cluster automatically.
**Where do we need ‘auto_bootstrap: false’**
1) When a new node needs to be added to an existing cluster quickly, without bootstrapping the data, this needs to be set to **‘false’**. The new node will be added quickly irrespective of the current load of the cluster. Later we need to stream the data to the new node manually to balance the cluster load.
2) When initializing a fresh cluster with no data, this needs to be set to **‘false’**. At least the first seed node to be started/added in the fresh cluster should have the value ‘false’.
**<h1>My question</h1>**
We are using Cassandra 2.0.3 with six nodes in two data centers (3 nodes each). Our Cassandra runs as a ***stand-alone process*** (not a service). I am going to change a few properties in the `cassandra.yaml` file for one node. The node obviously has to be restarted after updating the `cassandra.yaml` file for the changes to take effect. Our cluster holds a huge amount of data.
**How to restart the node**<br>
After killing the node, I can simply restart the node as below <br>
`$ cd install_location` <br>
`$ bin/cassandra` <br>
This means the node is restarted with no `auto_bootstrap` property (so the default, true, applies).
**with 'true'**
1) The node to be restarted already holds a huge amount of its own data. Will the node bootstrap all of its own data again and replace the existing data? <br>
2) Will it take longer for the node to join the cluster again?
**with 'false'**
I do not want to bootstrap the data. So: <br>
3) Can I add the property as `auto_bootstrap: false` and restart the node as mentioned above? <br>
4) After a successful restart I will delete the `auto_bootstrap` property again. Is that okay?
**Else**
5) As I am restarting the node with the same IP address, will the cluster automatically identify it as an existing node through the gossip info, and therefore restart the node without streaming the data, even though `auto_bootstrap` is set to true or is not present in the `cassandra.yaml` file?
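To make steps 3 and 4 concrete, this is roughly the edit I have in mind, as a sketch only. The function names `set_auto_bootstrap_false` and `remove_auto_bootstrap` are hypothetical helpers I made up; they only edit the given file and do not restart anything, and they assume the property appears as an uncommented top-level line.

```shell
#!/bin/sh
# Hypothetical helpers for steps 3 and 4; they only rewrite the given yaml file.

# Step 4: drop any top-level auto_bootstrap line so the default applies again.
remove_auto_bootstrap() {
    conf_file="$1"
    tmp_file="${conf_file}.tmp"
    # Copy every line except the auto_bootstrap one, then swap the files.
    sed '/^auto_bootstrap:/d' "$conf_file" > "$tmp_file" && mv "$tmp_file" "$conf_file"
}

# Step 3: make sure the file contains exactly one "auto_bootstrap: false" line.
set_auto_bootstrap_false() {
    conf_file="$1"
    # Remove any existing setting first to avoid duplicate keys.
    remove_auto_bootstrap "$conf_file"
    # The leading newline guards against a file that lacks a trailing newline.
    printf '\nauto_bootstrap: false\n' >> "$conf_file"
}
```

The idea would be to call `set_auto_bootstrap_false` on the node's `cassandra.yaml` before the restart and `remove_auto_bootstrap` on the same file afterwards.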