All threads / Elasticsearch server with Hatchbox timing out

Ask A Question

Notifications

You’re not receiving notifications from this thread.

Elasticsearch server with Hatchbox timing out

Alex Deering asked in General

Got another hatchbox issue trying to figure out. With elasticsearch now, I have a dedicated cluster that only runs elasticsearch. In the ENV there is the elasticsearch_url and ive SSH into the elasticsearch cluster and checked the connection and everything comes back good. But when I try to reindex thru the console i get the following:

rake aborted!
Faraday::Error::TimeoutError: request timed out
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:100:in `block in request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:146:in `block in execute_callbacks'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:145:in `each'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:145:in `execute_callbacks'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/operations.rb:35:in `finish'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/easy_factory.rb:164:in `block in set_callback'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `block in complete'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `each'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `complete'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/operations.rb:33:in `perform'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/operations.rb:16:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/cacheable.rb:18:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/block_connection.rb:31:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/stubbable.rb:25:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/before.rb:26:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:82:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:72:in `call'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/middleware.rb:9:in `call'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/faraday-0.17.1/lib/faraday/rack_builder.rb:143:in `build_response'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/faraday-0.17.1/lib/faraday/connection.rb:387:in `run_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/http/faraday.rb:27:in `block in perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/base.rb:274:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/http/faraday.rb:24:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/client.rb:152:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-api-7.4.0/lib/elasticsearch/api/namespace/common.rb:25:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-api-7.4.0/lib/elasticsearch/api/actions/indices/get_alias.rb:36:in `get_alias'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:106:in `all_indices'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:119:in `clean_indices'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:274:in `reindex_scope'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:200:in `reindex'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/model.rb:60:in `searchkick_reindex'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/tasks/searchkick.rake:7:in `block (2 levels) in <main>'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/rake-13.0.1/exe/rake:27:in `<top (required)>'
/home/deploy/.rbenv/versions/2.6.5/bin/bundle:23:in `load'
/home/deploy/.rbenv/versions/2.6.5/bin/bundle:23:in `<main>'
Tasks: TOP => searchkick:reindex
(See full trace by running task with --trace)

You said you logged into the ElasticSearch server and tested it, but if Rails is running on another server, you have to make sure ElasticSearch is attached available externally. You'll have to make sure it's binding to an IP and allowed through your firewall.

It was setup with Hatchbox, here is the provision log:

-----> Connecting to living-recipe-elasticsearch (138.68.24.137 port 22) as root

Uploaded /etc/sudoers.d/deploy

Executing 'chmod 440 /etc/sudoers.d/deploy'

Uploaded .hatch/action-272036.sh

Executing 'set -euxo pipefail; timeout 900 bash .hatch/action-272036.sh  2>&1'
+ timeout 900 bash .hatch/action-272036.sh

-----> Disabling password authentication for SSH

-----> Installing initial dependencies

-----> Adding repositories
Warning: apt-key output should not be parsed (stdout is not a terminal)
OK

-----> Updating packages

-----> Upgrading packages

-----> Installing dependencies
ca-certificates ufw default-jre elasticsearch

-----> Setting up firewall rules
Skipping adding existing rule
Skipping adding existing rule (v6)
Rule added
Skipping adding existing rule
Firewall is active and enabled on system startup

-----> Post-install configuration
Synchronizing state of elasticsearch.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable elasticsearch

-----> Done!

Uploaded /etc/apt/apt.conf.d/10periodic

Uploaded /etc/update-motd.d/60-hatchbox

Executing 'chmod +x /etc/update-motd.d/60-hatchbox'

And the ENV['ELASTICSEARCH_URL'] is pointing to the Private IP of the elasticsearch server.

If you set it up with Hatchbox, then it will do all that for you.

I SSHed into your server and confirmed Hatchbox setup everything correctly.

 curl http://10.138.104.120:9200
{
  "name" : "living-recipe-elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Sounds like your code is misconfigured. Make sure that it's loading from the ENV var.

Hmmm, no idea whats causing it. I checked the curl from the server as well and got the same. And using searchkick with is configured to use ENV['ELASTICSEARCH_URL'] which is set to the same thing. Everything works in development environment as expected and the only fault im getting is the timed out....Any suggestions on what to try to get more info?

Check that your code is configured to use the ENV var. Enable logging for faraday, etc to see what request its making.

So increased the timeout to a minute on Searchkick and that got me a different error. Tried to see if I could find a solution (and will keep looking), but figured Id update here in case you know whats going on:

Elasticsearch::Transport::Transport::Errors::ServiceUnavailable ([503] {"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503})

Sounds like the same thing as before. Have you gotten any logging enabled to see what URL it's requesting yet?

I havent been able to figure out how to turn on logs for faraday

If you're using Searchkick, try adding this:

Searchkick.client.transport.logger = Logger.new(STDOUT) # or Rails.logger

Found that after a quick search on the issues.

Ok, this is what I get with that:

I, [2020-01-02T05:20:21.550200 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Started GET "/" for 199.127.56.90 at 2020-01-02 05:20:21 +0000
D, [2020-01-02T05:20:21.652893 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[36mUser Load (1.9ms)[0m  [1m[34mSELECT "users".* FROM "users" WHERE "users"."id" = $1 ORDER BY "users"."id" ASC LIMIT $2[0m  [["id", 1], ["LIMIT", 1]]
I, [2020-01-02T05:20:21.698751 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Processing by RecipesController#index as HTML
D, [2020-01-02T05:20:21.725351 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (5.9ms)[0m  [1m[34mSELECT "saved_recipes"."recipe_id" FROM "saved_recipes" WHERE "saved_recipes"."user_id" = $1[0m  [["user_id", 1]]
D, [2020-01-02T05:20:21.734363 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (4.4ms)[0m  [1m[34mSELECT DISTINCT "taggings"."tag_id" FROM "taggings" WHERE 1=0[0m
D, [2020-01-02T05:20:21.745764 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (1.7ms)[0m  [1m[34mSELECT COUNT(*) AS count_all, "taggings"."tag_id" AS taggings_tag_id FROM "taggings" WHERE 1=0 GROUP BY "taggings"."tag_id"[0m
D, [2020-01-02T05:20:21.765707 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[33mRecipe Search (18.9ms)[0m  curl http://10.138.104.120:9200/recipes_production/_search?pretty -H 'Content-Type: application/json' -d '{"aggs":{"course_name":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"course_name":{"terms":{"field":"course_name","size":1000,"order":{"_key":"asc"}}}}},"cuisines":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"cuisines":{"terms":{"field":"cuisines","size":1000,"order":{"_count":"desc"}}}}},"techniques":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"techniques":{"terms":{"field":"techniques","size":1000,"order":{"_count":"desc"}}}}}},"query":{"bool":{"must":{"match_all":{}},"filter":[{"term":{"approved":true}},{"term":{"master":true}}]}},"sort":{"_score":"desc"},"timeout":"61s","_source":false,"size":30}'
I, [2020-01-02T05:20:21.766187 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Completed 500 Internal Server Error in 67ms (Searchkick: 18.9ms | ActiveRecord: 24.4ms | Allocations: 6790)
F, [2020-01-02T05:20:21.766946 #8714] FATAL -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Elasticsearch::Transport::Transport::Errors::ServiceUnavailable ([503] {"error":{"root_cause":[{"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"}],"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"},"status":503}):
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] app/helpers/recipe_search.rb:23:in `search'
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] app/controllers/recipes_controller.rb:18:in `index'
I, [2020-01-02T05:20:21.767808 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Processing by ErrorsController#internal_server_error as HTML
I, [2020-01-02T05:20:21.770647 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendering errors/internal_server_error.html.erb within layouts/errors
I, [2020-01-02T05:20:21.771662 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendered errors/internal_server_error.html.erb within layouts/errors (Duration: 0.9ms | Allocations: 225)
I, [2020-01-02T05:20:21.774468 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendered shared/_favicons.html.erb (Duration: 0.4ms | Allocations: 127)
I, [2020-01-02T05:20:21.774965 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Completed 500 Internal Server Error in 7ms (Views: 6.3ms | ActiveRecord: 0.0ms | Allocations: 2620)

So it is going to the right url.

Tested the connection with both the deployed and my development machine and compared the results:
Deployed:

{
  "name" : "living-recipe-elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Localhost:

{
  "name" : "ubuntu",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "uA-3VHNeQ5mzozcnB8R-ww",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Everything is the same except for the cluster_uuid is "na" for the deployed version

Just an update, after uninstalling and reinstalling elasticsearch, still getting the same issue. For now Ive moved to a managed service (Elastic Cloud) and its working. So pretty sure it was something to do with the install on the server. I will reattack at a later date

Hey Alex,

Good catch on the cluster_uuid. That does seem strange and maybe a recent version of ElasticSearch has broken my config changes. I'll do some research and see what I find.

Getting a response from ElasticSearch fooled me into thinking that everything was running just fine but maybe it gives a Service Unavailable error if it is running but doesn't have a cluster active / connected or something. I'm not super familiar with ES and it seems to have changed a bit recently.

Just an update for you. I ended up switching to a managed elasticsearch (Elastic Cloud/AWS) since Im not super familiar with ES either and could get it up and running for now.

As for the issue, it mightve been the server size. I was using a $20 (4gb, 2CPU) server from DigitalOcean for only the ES service. Some of the responses I was finding while troubleshooting on stackoverflow was that if there isnt enough memory when creating it can fail starting up. Might be worth testing with different size Servers and seeing if a beefier server allows it to start up properly.

Also I did SSH into the server and uninstalled ES and reinstalled it following the same procedures I did on local dev machine and got the same UUID error. Which is what eventually led me to think it might be a resource issue with the server. Hope this helps with any troubleshooting you do with your deploy scripts.

Hatchbox is awesome and has saved me a lot of work btw. Living-recipe.com is up and testing quickly thanks to your services. Thanks again for the help

ElasticSearch is really picky about RAM I've noticed. If it doesn't have enough it can fail to start and it needs 4GB RAM to itself minimum. A 4GB of RAM server won't suffice for it, so you've got to go with 8GB if you also want to run Rails on the same server, and even then you have to keep an eye on it.

This is where a managed ElasticSearch is definitely worth the money. You have to worry about none of these details and can just use it. 👍

Alex, you were onto something with the cluster_uuid. I found out that you can find more information by running curl http://localhost:9200/_cluster/health instead.

That pointed out "master_not_discovered_exception" so it's a configuration issue that the ES server is not seeing itself as a master in the cluster.

Looks like production mode expects multiple servers now, but we can get around that by adding discovery.type: single-node and that should fix it. I'll go update the Hatchbox configs for that.

Nice, glad you were able to figure that out. Hatchbox is amazing so glad I could help with something, even if its just pointing you in the direction of the problem.

Thank you for the heads up that it wasn't working! I appreciate you!

Join the discussion

Want to stay up-to-date with Ruby on Rails?

Join 38,558+ developers who get early access to new tutorials, screencasts, articles, and more.

    We care about the protection of your data. Read our Privacy Policy.

    logo Created with Sketch.

    Ruby on Rails tutorials, guides, and screencasts for web developers learning Ruby, Rails, Javascript, Turbolinks, Stimulus.js, Vue.js, and more. Icons by Icons8

    © 2020 GoRails, LLC. All rights reserved.