Mark Radford
Activity
I think Jacob has given some great advice, especially with regard to using pluck.
Nothing against Heroku, but due to the way they bill, I would be concerned about spike charges due to things like this.
I don't think you get charged extra from Heroku for having spikes and exceeding your memory quota. I think when you exceed your memory quota you start to use swap memory, which is less than desirable, but I don't think you are charged extra. https://devcenter.heroku.com/articles/ruby-memory-use
Posted in Stripe gems: Koudoku or Payola
Thanks all for taking the time to give feedback. Looks like vanilla Stripe API is the way to go.
Off topic:
I've tried a couple of these gems and often found I was boxed into the way they do things which I could break out of, but at that point I'd basically not be using their gem that much.
I applied similar logic to avoiding Devise. In hindsight this may have been a mistake, as Devise seems extremely popular.
Posted in Stripe gems: Koudoku or Payola
Has anyone used either https://github.com/andrewculver/koudoku or https://github.com/payolapayments/payola ? If yes, are you able to give any feedback regarding ease of use, ability to customise, etc? Did you stick with the gem or did you end up implementing the Stripe code yourself?
Thanks.
I posted the comment above in an existing Searchkick issue and the author responded with:
My guess is you need to use misspellings: false. Also, to help with debugging queries and mappings, you can use the recently added:
Product.search("something", debug: true)
Yeah, for me, if I search for a product by number "pm07" then I only want that product returned; I don't want "pm01" or "pm03" returned. I was only able to get this to work by using autocomplete: true, but I can't figure out why.
If we look at what's created for word_start we find:
Mapping
"product_number" : {
  "type" : "keyword",
  "fields" : {
    "analyzed" : {
      "type" : "text"
    },
    "word_start" : {
      "type" : "text",
      "analyzer" : "searchkick_word_start_index"
    }
  },
  "ignore_above" : 256
},
Analyzer
searchkick_word_start_index: {
  type: "custom",
  tokenizer: "standard",
  filter: ["lowercase", "asciifolding", "searchkick_edge_ngram"]
},
searchkick_edge_ngram filter
searchkick_edge_ngram: {
  type: "edgeNGram",
  min_gram: 1,
  max_gram: 50
},
If we look at what's created for autocomplete we find:
Mapping
"product_number" : {
  "type" : "keyword",
  "fields" : {
    "analyzed" : {
      "type" : "text"
    },
    "autocomplete" : {
      "type" : "text",
      "analyzer" : "searchkick_autocomplete_index"
    }
  },
  "ignore_above" : 256
}
Analyzer
"searchkick_autocomplete_index" : {
  "filter" : ["lowercase","asciifolding"],
  "type" : "custom",
  "tokenizer" : "searchkick_autocomplete_ngram"
},
Tokenizer
tokenizer: {
  searchkick_autocomplete_ngram: {
    type: "edgeNGram",
    min_gram: 1,
    max_gram: 50
  }
}
So I think both word_start and autocomplete use lowercase, asciifolding and edgeNGram.
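To make the shared edge n-gram step concrete, here is a rough plain-Ruby sketch of what an edge n-gram filter with min_gram 1 and max_gram 50 produces at index time. It is only an illustration: edge_ngrams and index_tokens are hypothetical helpers, not Searchkick or Elasticsearch code, and asciifolding is left out.

```ruby
# Simplified stand-in for an edge n-gram filter: every prefix of the token
# from min_gram up to max_gram characters.
def edge_ngrams(token, min_gram: 1, max_gram: 50)
  upper = [token.length, max_gram].min
  (min_gram..upper).map { |length| token[0, length] }
end

# Rough equivalent of "lowercase" followed by the edge n-gram step.
# (asciifolding is omitted in this sketch.)
def index_tokens(value)
  edge_ngrams(value.downcase)
end

PM07_TOKENS = index_tokens("PM07")
PM01_TOKENS = index_tokens("pm01")

# Prefixes that both product numbers end up sharing in the index.
SHARED_TOKENS = PM07_TOKENS & PM01_TOKENS

puts PM07_TOKENS.inspect
puts SHARED_TOKENS.inspect
```

The point of the sketch is that "pm07" and "pm01" share most of their indexed tokens, so whether "pm01" comes back for a "pm07" search depends on how the query side is analyzed.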
The difference, I think, comes in the search query and the use of autocomplete: true. So with word_start we can simply use:
Product.search "pm07"
whereas with autocomplete we have:
Product.search "pm07", autocomplete: true
which I think then uses the following code:
if options[:autocomplete]
  payload = {
    multi_match: {
      fields: fields,
      query: term,
      analyzer: "searchkick_autocomplete_search"
    }
  }

searchkick_autocomplete_search: {
  type: "custom",
  tokenizer: "keyword",
  filter: ["lowercase", "asciifolding"]
},
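As a rough model of why a keyword-tokenized search analyzer can behave differently, the plain-Ruby sketch below compares two ways of analyzing the query against edge n-gram indexed product numbers. All helper names are hypothetical, the "shares at least one token" rule is a simplification of term matching rather than Elasticsearch's actual scoring, and the second query style is only a contrast case, not a claim about what Searchkick does for word_start.

```ruby
# Every prefix of the token, as an edge n-gram filter would emit.
def edge_ngrams(token, min_gram: 1, max_gram: 50)
  (min_gram..[token.length, max_gram].min).map { |length| token[0, length] }
end

# Index side: each product number is stored as its edge n-grams.
INDEXED = %w[pm01 pm03 pm07].to_h { |number| [number, edge_ngrams(number)] }

# Search side, keyword-style: the whole lowercased query stays one token.
def keyword_query_tokens(query)
  [query.downcase]
end

# Search side, contrast case: the query is n-grammed like the index.
def ngram_query_tokens(query)
  edge_ngrams(query.downcase)
end

# Simplified matching: a document matches if it shares any token with the query.
def matches(query_tokens)
  INDEXED.select { |_number, tokens| (tokens & query_tokens).any? }.keys
end

KEYWORD_HITS = matches(keyword_query_tokens("pm07"))
NGRAM_HITS = matches(ngram_query_tokens("pm07"))

puts KEYWORD_HITS.inspect
puts NGRAM_HITS.inspect
```

In this model the single-token query only matches the document that contains the full "pm07" token, while the n-grammed query also matches anything sharing the "p", "pm" or "pm0" prefixes.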
At this point I can't figure out which payload code is used for word_start and how it differs from that used by autocomplete.
Hi Chris,
Unrelated to all of the multitenancy and reindexing talk, how come you didn't need to use autocomplete: true in your search query and also set the autocomplete field in your search data? I've been struggling to get my search queries to match as I wanted, and it only worked when I incorporated autocomplete. I previously tried things like word_start, word_end, etc. with no luck.
Thanks.
Yeah, my apologies for taking so long to clear this up. Are you familiar with that implementation? Unrelated to elasticsearch but I've read:
I don't like the default_scope for the reason that it is not threadsafe. The user id is stored in a class variable, which means that two or more concurrent users in your app will break this unless you use Unicorn or some other web server that makes sure no more than one single client connection will access the same thread.
http://stackoverflow.com/a/22534147/1299792
I've responded to that comment with:
In the railscast Ryan said: "We can find another potential issue in the Tenant model where we call cattr_accessor for the current_id attribute. While this is convenient it’s not really thread-safe so we might want to do something like this instead: Thread.current[:tenant_id] = id, Now we have getter and setter methods that use Thread.current to set the value which is more thread-safe". Do you still feel using default_scope with this implementation is not thread-safe?
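For reference, the Thread.current approach from that quote can be sketched in plain Ruby roughly as follows. This is a minimal illustration outside Rails; the Tenant class here is a stand-in, not the railscast's actual model.

```ruby
# Stand-in for the Tenant model: the current id lives in thread-local
# storage instead of a class-level variable shared by all threads.
class Tenant
  def self.current_id=(id)
    Thread.current[:tenant_id] = id
  end

  def self.current_id
    Thread.current[:tenant_id]
  end
end

Tenant.current_id = 1

# A second thread starts with no tenant and can set its own value
# without touching the first thread's value.
OTHER_THREAD_VIEW = Thread.new do
  before = Tenant.current_id
  Tenant.current_id = 2
  [before, Tenant.current_id]
end.value

MAIN_THREAD_ID = Tenant.current_id

puts OTHER_THREAD_VIEW.inspect
puts MAIN_THREAD_ID
```

One caveat with this pattern: servers that reuse threads across requests keep the old value around, so the id should be set (or cleared) at the start of every request.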
My apologies Chris, I see I've created some confusion with my original post. When I said "I followed the suggested article in the readme", I was referring to the Searchkick readme where it says "Check out this great post on the Apartment gem. Follow a similar pattern if you use another gem". I never explicitly stated that I wasn't using the Apartment gem, which I am not.
I set up multitenancy in my app following this railscast, which uses scopes.
writing a transaction and committing everything at once. You'll have a lot more speedups writing everything in bulk.
I was trying to figure out how to do that in my custom task before I accepted that I'm probably wasting too much time and should just put a workaround in place to use the defaults offered by the gem
With regard to the scope, in the latest release of Rails 4, without my tenant/business set, if I run:
Grandparent.unscoped { Parent.unscoped.joins(:grandparent).where(id: self.parent_id).pluck("grandparent.column_name") }
then the business scope is still applied and no result is returned (the query contains AND "grandparents"."business_id" IS NULL), whereas in Rails 5, if I run the same command, the business scope is not applied and I receive the result I expect (the query does not contain AND "grandparents"."business_id").
So I figure if I set up my search_data to use unscoped in blocks, as per what works in Rails 5, then I will be able to use the standard Searchkick Model.reindex rather than record.reindex and avoid needing to create my own custom task that allows for zero downtime reindexing when using (import: false).
As an update, I wouldn't advise reindexing by the individual record when you have a large amount of data. My custom rake task has been running for approximately 18 hours and it's still not finished. This approach does not allow for zero downtime reindexing either, which isn't a problem if you don't plan on changing the Searchkick mappings/structure, but if you do, you'll need to write some custom code to perform zero downtime reindexing while using import: false. So far for me, creating the custom task is taking a lot of time and doesn't seem worth it.
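To illustrate why per-record reindexing scales so badly compared with bulk imports, here is a small plain-Ruby sketch that only counts round trips. FakeIndex is a hypothetical stand-in for a search index client, not Searchkick's API, and the record count and batch size are made-up numbers.

```ruby
# Hypothetical stand-in for a search index client that counts requests.
class FakeIndex
  attr_reader :requests, :documents

  def initialize
    @requests = 0
    @documents = []
  end

  # One request per record, like calling reindex on each record.
  def store(record)
    @requests += 1
    @documents << record
  end

  # One request per batch, like a bulk import.
  def bulk_store(records)
    @requests += 1
    @documents.concat(records)
  end
end

RECORDS = (1..2_500).to_a

ONE_BY_ONE = FakeIndex.new
RECORDS.each { |record| ONE_BY_ONE.store(record) }

BATCHED = FakeIndex.new
RECORDS.each_slice(1_000) { |batch| BATCHED.bulk_store(batch) }

puts "individual requests: #{ONE_BY_ONE.requests}"
puts "batched requests:    #{BATCHED.requests}"
```

Both indexes end up with the same documents, but the batched version needs one request per 1,000 records instead of one per record.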
I'm upgrading my app to Rails 5 at the moment, which includes the scope patch, so I will go back to using the default Searchkick methods and scope my search_data, i.e.:
class Product < ActiveRecord::Base
  belongs_to :department

  def search_data
    {
      name: name,
      department_name: Department.unscoped.find(self.department_id).name,
      # unscoped (not unscope) is the method that accepts a block
      grandparent_column: Grandparent.unscoped { Parent.unscoped.joins(:grandparent).where(id: parent_id).pluck("grandparent.column_name") }
    }
  end
end
For future reference, after pulling code from the Searchkick gem, my custom rake task (that I am currently moving on from) began to look like the below, though I haven't applied tenant/business scoping yet:
#scope = searchkick_klass
searchkick_index = Searchkick::Index.new(Department.searchkick_index.name, Department.searchkick_options)
searchkick_index.clean_indices
index = create_index(index_options: Department.searchkick_klass.searchkick_index_options)
# check if alias exists
if searchkick_index.alias_exists?
  # import before swap
  Department.searchkick_klass.find_in_batches batch_size: 1000 do |records|
    if records.any?
      event = {
        name: "#{records.first.searchkick_klass.name} Import",
        count: records.size
      }
      ActiveSupport::Notifications.instrument("request.searchkick", event) do
        super(records)
      end
    end
  end
end
# get existing indices to remove
searchkick_index.swap(index.name)
searchkick_index.clean_indices
index.refresh
So that import: false basically just tells it to create a blank index and ignore all the records in the database, right?
Yes, that's what it appears to do in my testing. A blank index with your specific Searchkick settings applied (whereas record.reindex will not create the index correctly if it hasn't been previously created).
I agree that it's a good idea for me to see what Model.reindex does and use that code for my own method. I'm just struggling a little to read the gem code, and that's why I thought it would be a great idea for a short episode (well, for me anyway): how to read a gem and figure out where a method is, how it's called and what it's doing. But again, I understand that this may be too specific for a general episode to be useful to a large number of people. I'm sure I'll figure it out with some perseverance.
Maybe you could do a short episode on reading the gem source for searchkick to determine what the reindex method does? I don't know if that's too specific to be useful to your whole customer base.
With reindexing by the individual record I found out that I needed to use Product.reindex(import: false) to create the index first, as simply using record.reindex wouldn't apply the Searchkick settings (i.e. search_data, index name, etc). Discussed here.
I guess reindexing by the record does have the disadvantage that every time I:
- install or upgrade searchkick
- change the search_data method
- change the searchkick method
I'm going to need to recreate my indices again with Product.reindex(import: false), and then loop through the records (with ActiveRecord) to reindex each record individually. So it's obviously not an ideal way of doing things, but the only other alternative is using unscoped in a block with the patch applied (which works in Rails 5). I would assume that Model.reindex could be significantly faster than looping through with record.reindex.
Thanks for the informative and reassuring response. Gem code does feel daunting to me at this stage but remembering that it is just regular ruby code does help. I will continue to look at the source and expand that comfort zone.
Thank you for all of the help and thank you for GoRails.
Yes, Business.current_id sets the tenant.
It's working now using reindex on the individual records like you suggested. The callbacks for updating the item in the index are also working. Elasticsearch (with Searchkick) is quite impressive to see up and running when it's working. Thank you so much for all of the time and effort you have given.
Are there any GoRails episodes that you would recommend for learning how to use gems in general? By this I mean, unless functionality is specified in the readme, I struggle to understand how it works. I often look in http://www.rubydoc.info/gems/ without much success. I occasionally download the source code for the gem.
An example of a problem is the code you previously referenced: Searchkick.models.each do |model|
I tried Searchkick.models but nothing was returned. So I looked in the usual places (readme and rubydoc.info) and couldn't find any helpful information on Searchkick.models. The link you gave that supplied the sample code also stated "Searchkick.models method is available on versions 0.8.6+". I'm using a version later than that, so that shouldn't be the problem. Are there any episodes that could help me improve this type of learning of gems and their functionality?
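One general technique for this kind of digging is to ask Ruby itself where a method is defined, using Method#source_location. The sketch below uses FileUtils from the standard library so it runs anywhere; applying the same call to Searchkick.method(:models) in a Rails console is an assumption on my part and would only work if that method exists in the installed version.

```ruby
require "fileutils"

# Grab a Method object and ask where it was defined.
mkdir_p = FileUtils.method(:mkdir_p)

# source_location returns [file_path, line_number] for methods written in Ruby
# (it returns nil for methods implemented in C).
SOURCE_FILE, SOURCE_LINE = mkdir_p.source_location

puts "FileUtils.mkdir_p is defined in #{SOURCE_FILE} at line #{SOURCE_LINE}"

# Assumed equivalent for a gem, run inside the app's console:
#   Searchkick.method(:models).source_location
```

From there the file can be opened directly, or the whole gem can be opened in an editor with `bundle open searchkick`.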
So, for example, I could do something like:
Business.current_id = 1
Product.all.each do |product|
product.reindex
end
Business.current_id = 2
repeat above
????
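A rough plain-Ruby sketch of that "repeat for each business" idea is below. The Business class is a stand-in, with_current is a hypothetical helper that does not exist in the app, and the actual reindex call is only shown as a comment because it needs Rails and Searchkick.

```ruby
# Stand-in for the Business model with a thread-local current id.
class Business
  def self.current_id
    Thread.current[:business_id]
  end

  def self.current_id=(id)
    Thread.current[:business_id] = id
  end

  # Hypothetical helper: switch tenant for the duration of the block,
  # then restore whatever was set before, even if the block raises.
  def self.with_current(id)
    previous = current_id
    self.current_id = id
    yield
  ensure
    self.current_id = previous
  end
end

# Records which business was active on each pass of the loop.
VISITED = []

[1, 2].each do |business_id|
  Business.with_current(business_id) do
    # In the real app this is where each record would be reindexed, e.g.:
    #   Product.find_each(&:reindex)
    VISITED << Business.current_id
  end
end

puts VISITED.inspect
```

Restoring the previous id in an ensure block avoids leaving the last tenant set after the task finishes or fails part-way through.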
I didn't know you could call reindex on each record. I'll give that a try.
Regarding unscoped, when I change my code to use a block it should then therefore ignore the scope, but it doesn't. So I think I am running into the bug when I'm trying to use a workaround with a block (I could be wrong).
Thanks for taking the time to reply.
I previously implemented multitenancy with scopes following this railscast
Where you said:
You'd loop through each tenant, set the Apartment Tenant, and then index each of the records inside of it (rather than in bulk)
...
except that you're not specifying separate indexes.
Seeing as I'm not specifying separate indexes, I believe that when I change the tenant (business for me) and reindex, the index will not contain results for both businesses, only the one that I most recently reindexed for.
For example, product.rb:
default_scope { where(business_id: Business.current_id) }
searchkick index_name: -> { [ model_name.plural, Rails.env].join('_') }, settings: {number_of_shards: 1, number_of_replicas: 1}
custom rake task:
Business.current_id = 1
Product.reindex
Business.current_id = 2
Product.reindex
If we perform a search after the custom rake task then it will only have products for business.id = 2
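The behaviour described above can be modelled in plain Ruby as follows. This is a simplified sketch with made-up data and hypothetical helpers: the model-level reindex is treated as "build a fresh index from whatever the current scope returns and swap it in", and the record-level reindex as "add or update documents in the existing index".

```ruby
# Made-up products belonging to two businesses.
PRODUCTS = [
  { id: 1, business_id: 1, name: "Milk" },
  { id: 2, business_id: 2, name: "Bread" }
].freeze

# Stand-in for the default scope: only the current business's products.
def scoped_products(business_id)
  PRODUCTS.select { |product| product[:business_id] == business_id }
end

# Model-level reindex, simplified: the old index contents are discarded and
# replaced by a new index built from the currently visible records.
def full_reindex(_live_index, business_id)
  scoped_products(business_id).to_h { |product| [product[:id], product] }
end

# Record-level reindex, simplified: documents are upserted into the live index.
def record_reindex(live_index, business_id)
  scoped_products(business_id).each { |product| live_index[product[:id]] = product }
  live_index
end

FULL_INDEX = [1, 2].reduce({}) { |index, id| full_reindex(index, id) }
PER_RECORD_INDEX = [1, 2].reduce({}) { |index, id| record_reindex(index, id) }

puts "after model-level reindex per business: #{FULL_INDEX.keys.inspect}"
puts "after record-level reindex per business: #{PER_RECORD_INDEX.keys.inspect}"
```

In this model the second model-level reindex wipes out business 1's product, while the record-level loop accumulates both.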
The reason I was looking into the Rails bug with scoping was because I want to use a join within my search_data:
def search_data
  {
    column_name: Parent.unscoped.joins(:grandparent).where(id: parent_id).pluck("grandparent.column_name")[0].presence || "",
  }
end
With my current version of Rails (4.2.7) the unscoped is not applied properly. I believe the team decided that is correct behaviour but not when used in a block, so:
Grandparent.unscoped do
  Parent.unscoped.joins(:grandparent).where(id: parent_id).pluck("grandparent.column_name")
end
Should ignore the scope with the patch applied, but for me I can't get it to work with 4-2-stable (though I can get it to work on my Rails 5 test branch)
Rails has a bug with scoping where unscoped is not applied to the block. This is discussed here, with a solution here. This was also back ported to Rails 4.2 in the stable branch https://github.com/rails/rails/pull/25232
I've tried to use stable with gem 'rails', :git => 'https://github.com/rails/rails.git', :branch => '4-2-stable', however, the bug still seems to exist for me. I've tried to search the Rails code to see if the code for the patch is present but I can't find it. I was influenced by this comment
Any suggestions on how I can make sure I'm running 4-2-stable with the commit I need?
Thanks for replying Chris. I actually scope my queries to business which is similar to organization, I just used tenant as the example because I thought this was the common terminology. I'm going to keep trying my current implementation as per above but I'm going to try using Cloud Front in production as they don't have artificial limits on indices and shards as some other providers do.
Has anyone had any success indexing their multitenant data with searchkick? I followed the suggested article in the readme (https://www.tiagoamaro.com.br/2014/12/11/multi-tenancy-with-searchkick/) but this results in an index for each tenant/model combination which will become expensive and is not scalable.
Therefore, I am trying to create one index per model for all tenants (i.e. one index for the model rather than 100 indices if I have 100 users). When I try to run reindex I run into an issue because the default scope is applied and no data is returned where tenant_id is null.
I can get around the default scope issue by using something like Product.unscoped.reindex(accept_danger:true), however, the default scope is still called when loading associated data. So rather than:
class Product < ActiveRecord::Base
  belongs_to :department

  def search_data
    {
      name: name,
      department_name: department.name,
      on_sale: sale_price.present?
    }
  end
end
I need to use:
class Product < ActiveRecord::Base
  belongs_to :department

  def search_data
    {
      name: name,
      department_name: Department.unscoped.find(self.department_id).name,
      on_sale: sale_price.present?
    }
  end
end
Can anyone suggest a better way of using reindex with this multitenancy setup?
Posted in Advanced Search, Autocomplete and Suggestions with ElasticSearch and the Searchkick gem Discussion
@excid3:disqus I believe you can also use the newly added `indices_boost` (https://github.com/ankane/s... with multiple indices as per the updated readme:
Search across multiple indices with:
`Searchkick.search "milk", index_name: [Product, Category]`
Boost specific indices with:
`indices_boost: {Category => 2, Product => 1}`
However, I am having difficulty getting this to work properly (it's currently only returning results from two indices when I have 3 listed).
If you ever get a chance to play with this feature with a similar setup could you please let me know whether you get it working properly? Thanks.