How should I model this situation?
I need to store different metrics about various companies. These will be present for every company:
- revenue
- ebitda
- cash
- receivables
- employees
- debt
each one has attributes such as:
- classification - is this the actual #, or an estimated one?
- amount in cents
- the date this value is relevant to
(and some other areas that might not be related)
- who it was entered by
- who it was approved by
However, I also need to allow users to create custom metrics, and capture these other things. For example, someone monitoring this company could want to record monthly users signups of a specific company.
It's safe to assume the majority of this will be time-series data, but not necessarily everything will be.
Here are my questions, starting with most-meta and going more granular:
How do I approach designing the way all this data is stored?
The crux of the question is that this data is sort of fundamentally unstructured, so where should the structure reside?
Are the areas I know I want to store different than the ones I don't, or should I build a way to have anything and then hardcode in the behavior for the metrics I know I'll want to record?
Here's the path my mind leads me down, by table:
class Company.rb < ActiveRecord::Base
has_many :company_metrics
end
class CompanyMetric < ActiveRecord::Base
belongs_to :company
has_many :metric_attributes
end
class Attribute < ActiveRecord::Base
belongs_to :company_metric
belong_to :%(value_name_table)
end
Now these attributes can basically be stored one of several ways:
1) integer (or boolean) (for most everything)
2) bigint (for $$)
3) decimal (for %'s)
4) text
and also, a date will need to be stored at some level
Or is this a situation when I'd want to leverage ducktyping to the extreme? Define a model that serves as the sort of abstract interface to the variety of concrete instances of all the metrics in the db? (sorry for using java terms)
Companies need to be independent of how its metrics are created, composed, and represented. I also want metrics to behave on behalf of only one company.
class Metric
attr_accessor :relevance_date, :value, :value_type
def initialize(company:, metric_name)
#finds company
#grabs appropriate metric from aforementioned structure
#creates metric object
end
end
class Company << ActiveRecord::Base
def metric(metric_name)
Metric.new(self, metric_name)
end
end
This is making assumptions about the structure of the metrics I'll be storing. Ultimately, the way I'll want to use them most of the time is as a point in an array of time-series data so they can turn into graphs, or plug in calculations. So I think expecting stuff to behave like:
metric.relevance_date
metric.value
metric.value_type
Is reasonable.
I think your second post outlines a pretty good storage mechanism. You obviously want to actually make that a database backed model with a company_id
on it so you don't have to make up the metric and initialize methods since ActiveRecord takes care of that for you.
The trouble will be that if you want to actually query these things efficiently, you'll struggle. Each different datatype will be best stored as a string, because you can always convert things to and from the string type. However that means you can't query against these by data. Only the metadata columns will be useful for sorting and filtering the metric entries. There will be a good amount of Ruby processing to convert from string to date, decimal, int, etc.
If you need to, you could store these as separate tables, one for each datatype, so that you can store native datatypes in the database. Obviously, this is a case where a SQL database isn't ideal.