Elasticsearch server with Hatchbox timing out

Alex Deering January 1, 2020 11:37pm

Got another hatchbox issue trying to figure out. With elasticsearch now, I have a dedicated cluster that only runs elasticsearch. In the ENV there is the elasticsearch_url and ive SSH into the elasticsearch cluster and checked the connection and everything comes back good. But when I try to reindex thru the console i get the following:

rake aborted!
Faraday::Error::TimeoutError: request timed out
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:100:in `block in request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:146:in `block in execute_callbacks'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:145:in `each'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/callbacks.rb:145:in `execute_callbacks'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/operations.rb:35:in `finish'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/easy_factory.rb:164:in `block in set_callback'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `block in complete'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `each'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/response_callbacks.rb:68:in `complete'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/ethon-0.12.0/lib/ethon/easy/operations.rb:33:in `perform'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/operations.rb:16:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/cacheable.rb:18:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/block_connection.rb:31:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/stubbable.rb:25:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/request/before.rb:26:in `run'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:82:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/typhoeus-1.3.1/lib/typhoeus/adapters/faraday.rb:72:in `call'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/middleware.rb:9:in `call'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/faraday-0.17.1/lib/faraday/rack_builder.rb:143:in `build_response'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/faraday-0.17.1/lib/faraday/connection.rb:387:in `run_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/http/faraday.rb:27:in `block in perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/base.rb:274:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/transport/http/faraday.rb:24:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-transport-7.4.0/lib/elasticsearch/transport/client.rb:152:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-api-7.4.0/lib/elasticsearch/api/namespace/common.rb:25:in `perform_request'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/elasticsearch-api-7.4.0/lib/elasticsearch/api/actions/indices/get_alias.rb:36:in `get_alias'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:106:in `all_indices'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:119:in `clean_indices'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:274:in `reindex_scope'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/index.rb:200:in `reindex'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/searchkick/model.rb:60:in `searchkick_reindex'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/searchkick-4.1.1/lib/tasks/searchkick.rake:7:in `block (2 levels) in <main>'
/home/deploy/living-recipe/shared/bundle/ruby/2.6.0/gems/rake-13.0.1/exe/rake:27:in `<top (required)>'
/home/deploy/.rbenv/versions/2.6.5/bin/bundle:23:in `load'
/home/deploy/.rbenv/versions/2.6.5/bin/bundle:23:in `<main>'
Tasks: TOP => searchkick:reindex
(See full trace by running task with --trace)

Chris Oliver January 1, 2020 11:45pm

You said you logged into the ElasticSearch server and tested it, but if Rails is running on another server, you have to make sure ElasticSearch is attached available externally. You'll have to make sure it's binding to an IP and allowed through your firewall.

Alex Deering January 1, 2020 11:50pm

It was setup with Hatchbox, here is the provision log:

-----> Connecting to living-recipe-elasticsearch (138.68.24.137 port 22) as root

Uploaded /etc/sudoers.d/deploy

Executing 'chmod 440 /etc/sudoers.d/deploy'

Uploaded .hatch/action-272036.sh

Executing 'set -euxo pipefail; timeout 900 bash .hatch/action-272036.sh  2>&1'
+ timeout 900 bash .hatch/action-272036.sh

-----> Disabling password authentication for SSH

-----> Installing initial dependencies

-----> Adding repositories
Warning: apt-key output should not be parsed (stdout is not a terminal)
OK

-----> Updating packages

-----> Upgrading packages

-----> Installing dependencies
ca-certificates ufw default-jre elasticsearch

-----> Setting up firewall rules
Skipping adding existing rule
Skipping adding existing rule (v6)
Rule added
Skipping adding existing rule
Firewall is active and enabled on system startup

-----> Post-install configuration
Synchronizing state of elasticsearch.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable elasticsearch

-----> Done!

Uploaded /etc/apt/apt.conf.d/10periodic

Uploaded /etc/update-motd.d/60-hatchbox

Executing 'chmod +x /etc/update-motd.d/60-hatchbox'

And the ENV['ELASTICSEARCH_URL'] is pointing to the Private IP of the elasticsearch server.

Chris Oliver January 1, 2020 11:57pm

If you set it up with Hatchbox, then it will do all that for you.

I SSHed into your server and confirmed Hatchbox setup everything correctly.

 curl http://10.138.104.120:9200
{
  "name" : "living-recipe-elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Sounds like your code is misconfigured. Make sure that it's loading from the ENV var.

Alex Deering January 2, 2020 12:54am

Hmmm, no idea whats causing it. I checked the curl from the server as well and got the same. And using searchkick with is configured to use ENV['ELASTICSEARCH_URL'] which is set to the same thing. Everything works in development environment as expected and the only fault im getting is the timed out....Any suggestions on what to try to get more info?

Chris Oliver January 2, 2020 12:57am

Check that your code is configured to use the ENV var. Enable logging for faraday, etc to see what request its making.

Alex Deering January 2, 2020 4:25am

So increased the timeout to a minute on Searchkick and that got me a different error. Tried to see if I could find a solution (and will keep looking), but figured Id update here in case you know whats going on:

Elasticsearch::Transport::Transport::Errors::ServiceUnavailable ([503] {"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503})

Chris Oliver January 2, 2020 4:27am

Sounds like the same thing as before. Have you gotten any logging enabled to see what URL it's requesting yet?

Alex Deering January 2, 2020 4:36am

I havent been able to figure out how to turn on logs for faraday

Chris Oliver January 2, 2020 4:38am

If you're using Searchkick, try adding this:

Searchkick.client.transport.logger = Logger.new(STDOUT) # or Rails.logger

Found that after a quick search on the issues.

Alex Deering January 2, 2020 5:27am

Ok, this is what I get with that:

I, [2020-01-02T05:20:21.550200 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Started GET "/" for 199.127.56.90 at 2020-01-02 05:20:21 +0000
D, [2020-01-02T05:20:21.652893 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[36mUser Load (1.9ms)[0m  [1m[34mSELECT "users".* FROM "users" WHERE "users"."id" = $1 ORDER BY "users"."id" ASC LIMIT $2[0m  [["id", 1], ["LIMIT", 1]]
I, [2020-01-02T05:20:21.698751 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Processing by RecipesController#index as HTML
D, [2020-01-02T05:20:21.725351 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (5.9ms)[0m  [1m[34mSELECT "saved_recipes"."recipe_id" FROM "saved_recipes" WHERE "saved_recipes"."user_id" = $1[0m  [["user_id", 1]]
D, [2020-01-02T05:20:21.734363 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (4.4ms)[0m  [1m[34mSELECT DISTINCT "taggings"."tag_id" FROM "taggings" WHERE 1=0[0m
D, [2020-01-02T05:20:21.745764 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[35m (1.7ms)[0m  [1m[34mSELECT COUNT(*) AS count_all, "taggings"."tag_id" AS taggings_tag_id FROM "taggings" WHERE 1=0 GROUP BY "taggings"."tag_id"[0m
D, [2020-01-02T05:20:21.765707 #8714] DEBUG -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   [1m[33mRecipe Search (18.9ms)[0m  curl http://10.138.104.120:9200/recipes_production/_search?pretty -H 'Content-Type: application/json' -d '{"aggs":{"course_name":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"course_name":{"terms":{"field":"course_name","size":1000,"order":{"_key":"asc"}}}}},"cuisines":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"cuisines":{"terms":{"field":"cuisines","size":1000,"order":{"_count":"desc"}}}}},"techniques":{"filter":{"bool":{"must":[{"term":{"approved":true}},{"term":{"master":true}}]}},"aggs":{"techniques":{"terms":{"field":"techniques","size":1000,"order":{"_count":"desc"}}}}}},"query":{"bool":{"must":{"match_all":{}},"filter":[{"term":{"approved":true}},{"term":{"master":true}}]}},"sort":{"_score":"desc"},"timeout":"61s","_source":false,"size":30}'
I, [2020-01-02T05:20:21.766187 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Completed 500 Internal Server Error in 67ms (Searchkick: 18.9ms | ActiveRecord: 24.4ms | Allocations: 6790)
F, [2020-01-02T05:20:21.766946 #8714] FATAL -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Elasticsearch::Transport::Transport::Errors::ServiceUnavailable ([503] {"error":{"root_cause":[{"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"}],"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"},"status":503}):
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] app/helpers/recipe_search.rb:23:in `search'
[e05340b7-b4d6-4e1d-bedd-11b48ee75d62] app/controllers/recipes_controller.rb:18:in `index'
I, [2020-01-02T05:20:21.767808 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Processing by ErrorsController#internal_server_error as HTML
I, [2020-01-02T05:20:21.770647 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendering errors/internal_server_error.html.erb within layouts/errors
I, [2020-01-02T05:20:21.771662 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendered errors/internal_server_error.html.erb within layouts/errors (Duration: 0.9ms | Allocations: 225)
I, [2020-01-02T05:20:21.774468 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62]   Rendered shared/_favicons.html.erb (Duration: 0.4ms | Allocations: 127)
I, [2020-01-02T05:20:21.774965 #8714]  INFO -- : [e05340b7-b4d6-4e1d-bedd-11b48ee75d62] Completed 500 Internal Server Error in 7ms (Views: 6.3ms | ActiveRecord: 0.0ms | Allocations: 2620)

So it is going to the right url.

Alex Deering January 2, 2020 3:33pm

Tested the connection with both the deployed and my development machine and compared the results:
Deployed:

{
  "name" : "living-recipe-elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Localhost:

{
  "name" : "ubuntu",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "uA-3VHNeQ5mzozcnB8R-ww",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Everything is the same except for the cluster_uuid is "na" for the deployed version

Alex Deering January 2, 2020 5:23pm

Just an update, after uninstalling and reinstalling elasticsearch, still getting the same issue. For now Ive moved to a managed service (Elastic Cloud) and its working. So pretty sure it was something to do with the install on the server. I will reattack at a later date

Chris Oliver January 4, 2020 1:24am

Hey Alex,

Good catch on the cluster_uuid. That does seem strange and maybe a recent version of ElasticSearch has broken my config changes. I'll do some research and see what I find.

Getting a response from ElasticSearch fooled me into thinking that everything was running just fine but maybe it gives a Service Unavailable error if it is running but doesn't have a cluster active / connected or something. I'm not super familiar with ES and it seems to have changed a bit recently.

Alex Deering January 4, 2020 4:19pm

Just an update for you. I ended up switching to a managed elasticsearch (Elastic Cloud/AWS) since Im not super familiar with ES either and could get it up and running for now.

As for the issue, it mightve been the server size. I was using a $20 (4gb, 2CPU) server from DigitalOcean for only the ES service. Some of the responses I was finding while troubleshooting on stackoverflow was that if there isnt enough memory when creating it can fail starting up. Might be worth testing with different size Servers and seeing if a beefier server allows it to start up properly.

Also I did SSH into the server and uninstalled ES and reinstalled it following the same procedures I did on local dev machine and got the same UUID error. Which is what eventually led me to think it might be a resource issue with the server. Hope this helps with any troubleshooting you do with your deploy scripts.

Hatchbox is awesome and has saved me a lot of work btw. Living-recipe.com is up and testing quickly thanks to your services. Thanks again for the help

Chris Oliver January 6, 2020 4:14pm

ElasticSearch is really picky about RAM I've noticed. If it doesn't have enough it can fail to start and it needs 4GB RAM to itself minimum. A 4GB of RAM server won't suffice for it, so you've got to go with 8GB if you also want to run Rails on the same server, and even then you have to keep an eye on it.

This is where a managed ElasticSearch is definitely worth the money. You have to worry about none of these details and can just use it. 👍

Chris Oliver January 7, 2020 11:15pm

Alex, you were onto something with the cluster_uuid. I found out that you can find more information by running curl http://localhost:9200/_cluster/health instead.

That pointed out "master_not_discovered_exception" so it's a configuration issue that the ES server is not seeing itself as a master in the cluster.

Looks like production mode expects multiple servers now, but we can get around that by adding discovery.type: single-node and that should fix it. I'll go update the Hatchbox configs for that.

Chris Oliver January 7, 2020 11:19pm

Also should mention, I found that information here: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/bootstrap-checks.html#_forcing_the_bootstrap_checks

Alex Deering January 7, 2020 11:20pm

Nice, glad you were able to figure that out. Hatchbox is amazing so glad I could help with something, even if its just pointing you in the direction of the problem.

Chris Oliver January 7, 2020 11:40pm

Thank you for the heads up that it wasn't working! I appreciate you!

Rails for Beginners

Advanced Ruby: Behind the Magic

Payments with Rails Master Class

Refactoring Rails

Learn Hotwire

Install and Deploy Rails Guides

Hatchbox.io

Jumpstart Rails SaaS Template

Remote Ruby Podcast

GoRails Open Source

Rails Hackathon

Beginner Bounties

Ruby on Rails Job Board

Notifications

Elasticsearch server with Hatchbox timing out

Want to stay up-to-date with Ruby on Rails?