API Design Decisions
I'm designing & building an API for a company and have arrived at a decision point I can't figure out.
We're providing access to records. Right now they want to quickly integrate with one client, but they will also want to do more in the future. There's no updating and no index, just requests about, and for parts of, individual records.
Since we know they're gonna want to integrate with more than one person, we added in access tokens that we manage and provide to them so we can know who is requesting what.
Now here's the tricky part: certain areas of these records are large cat pictures, and take time to retrieve from the backend they're stored in. There are also large dog pictures, and large trainer certificates. Beyond these few stable things, all the other data being recorded for a record is different, changing over time, and we can't reliably count on it being there. This includes stuff like more info about the cats, dogs, trainers, and where the training took place.
I want to make the api backwards compatible and avoid introducing any breaking versions as much as I can. Because of the uncertainty with the data that gets stored, my predisposition is to sweep all of the complexity behind the ? and deal with it there.
Thing is, I'm pretty sure the dog, cat and certificate pictures will always be there. Since they're big and take time, we're only gonna send them over when they're specifically requested.
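To make the "only send them when specifically requested" idea concrete, here is a minimal, framework-agnostic sketch in Python. Everything named in it is a hypothetical stand-in rather than part of the real system: the `RECORDS` store, the `fetch_slow_image` helper, and the `cat_picture` / `dog_picture` field names. It only illustrates that the slow backend gets touched for fields named in `?fields=` and for nothing else.

```python
from typing import Callable, Dict, List

# Hypothetical cheap data that is always returned for a record.
RECORDS = {"42": {"id": "42", "trainer_name": "Sam"}}

# Records every call to the slow backend so the behaviour is observable.
slow_calls: List[str] = []


def fetch_slow_image(record_id: str, kind: str) -> str:
    """Stand-in for the slow image store; returns a placeholder string."""
    slow_calls.append(f"{record_id}:{kind}")
    return f"<{kind} bytes for record {record_id}>"


# Expensive fields are loaded lazily, one loader per field name.
EXPENSIVE_FIELDS: Dict[str, Callable[[str], str]] = {
    "cat_picture": lambda rid: fetch_slow_image(rid, "cat_picture"),
    "dog_picture": lambda rid: fetch_slow_image(rid, "dog_picture"),
}


def parse_fields(raw: str) -> List[str]:
    """Split a comma-separated ?fields= value, dropping blank entries."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def get_record(record_id: str, fields: str = "") -> dict:
    """Return the cheap data plus any expensive fields that were requested."""
    response = dict(RECORDS[record_id])
    for name in parse_fields(fields):
        loader = EXPENSIVE_FIELDS.get(name)
        if loader is not None:  # names without a loader are skipped
            response[name] = loader(record_id)
    return response
```

With this shape, `get_record("42")` never calls the slow store, while `get_record("42", fields="cat_picture")` pays for exactly one slow lookup.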
However, some of these things are stable. So maybe it would be better to go completely restful and have the api set up like this:
And then keep the uncertain stuff like the trainer certificate in a 'fields' parameter, and maybe promote it when we can guarantee it.
The fundamental question I have is: what's nicer to work with? Requesting through a fields parameter, or breaking up what would go in the fields parameter into separate, more RESTful endpoints?
Which way would you rather use, and why?
Since you're basically filtering the results, this makes more sense as a query param than an endpoint (in general).
I don't think either approach is necessarily "more RESTful" than the other. It depends on the use case of this stuff. If the dog and cat are truly separate objects with their own sub-fields, then it might make sense to have their own route if there is a use case for retrieving that with nothing else. On the flip side, if cat is just one field, then making an endpoint just for this doesn't make sense and isn't actually RESTful because it's only a piece of a larger object.
The fields parameter is straightforward. If there are fields you always want to return, you could add an include parameter that lets clients request additional fields that aren't included by default. This would let them specify only the extras they need rather than having to list every single field on each request. That may not be what they need, so specifying individual fields might make more sense. That's up to you. In general, the right answer is going to come from your understanding of how this will be consumed.
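As a rough illustration of the defaults-plus-include idea, here is a small Python sketch. The field names in `DEFAULT_FIELDS` and `OPTIONAL_FIELDS` are assumptions made up for the example, as are the `resolve_fields` and `render` helpers. The sketch also assumes that data missing from a given record should simply be left out of the response, which seems to fit the "can't reliably count on it being there" situation described in the question.

```python
from typing import Any, Dict, Set

# Hypothetical field sets; the real names would come from the record schema.
DEFAULT_FIELDS = {"id", "created_at"}
OPTIONAL_FIELDS = {"dog_picture", "cat_picture", "trainer_certificate", "location"}


def resolve_fields(include: str = "") -> Set[str]:
    """Defaults are always present; ?include= can only add known extras."""
    requested = {name.strip() for name in include.split(",") if name.strip()}
    return DEFAULT_FIELDS | (requested & OPTIONAL_FIELDS)


def render(record: Dict[str, Any], include: str = "") -> Dict[str, Any]:
    """Build the response, omitting fields this record does not have."""
    wanted = resolve_fields(include)
    return {key: record[key] for key in sorted(wanted) if key in record}
```

Unknown names in `include` are silently ignored here, which keeps old clients working as options come and go. Rejecting them with a 400 is an equally reasonable choice if you would rather have clients notice typos.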
How are you handling the slow responses from the backend with the image storage?
Or perhaps totally changing all of that:
My preference is to keep the urls very short and clear, and sweep the complexity behind the ?
But if everything gets set up more like , it's easier to rely upon Rails to just handle it.
Well, the pictures are actually protected health information, there are a buttload of them, and we won't know which ones will be requested. So our only real way of doing this is to request them from the place where they're stored at the moment they're requested.
We're going to return the easy-to-retrieve information about a record by default, and then if they want the dog picture, they can get that through 'fields'. But 'include' is a lot more descriptive.
I guess my question is actually more related to performance. How are you handling the slow responses for those records? Are you just providing a really slow response to an API request? Are you sending webhooks at a later time with the result?
Right now we're just providing a really slow response since caching the images is out of the question for the moment.
Gotcha. I guess if the responses are super slow because of that, then having separate resources for them makes sense so only certain requests are slow and you'll be able to load the regular data for the primary record quickly.
I wouldn't put the token in the url because it's not a resource itself.
One benefit of having the fields in the query is that your API endpoint will not change in the future. You can always include more data simply by adding options. If you have /resource/cat, etc., your clients will need to be aware when these are deprecated or moved, or when new things are added. It's much less flexible than having a configurable includes query param.
I've been reading this thread, which is quite interesting. I'm also not a fan of putting the token in the URL. It introduces potential security issues, especially on a network that might get hit with man-in-the-middle attacks. All they need to do is sniff the auth token and boom, it's done. I have a lot of experience with HIPAA and PHI, so any assets that are PHI are something you'll definitely want to secure. But to serve up the assets faster, you could "technically" store them on a CDN that meets HIPAA requirements. That's something I had to do in one of my apps for large imaging/MRI records.
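For anyone wondering what keeping the token out of the URL can look like, below is a minimal Python sketch that reads it from an `Authorization: Bearer ...` header instead. The `ISSUED_TOKENS` table and both function names are hypothetical, and the sketch assumes the headers arrive as a plain mapping with the exact key `Authorization` (real frameworks usually normalize header case for you). One practical upside of a header is that URLs commonly end up in server and proxy logs, while headers usually do not.

```python
import hmac
from typing import Mapping, Optional

# Hypothetical table of issued tokens: token -> client name.
ISSUED_TOKENS = {"s3cr3t-token-for-client-a": "client-a"}


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if any."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def identify_client(headers: Mapping[str, str]) -> Optional[str]:
    """Map a presented token to the client it was issued to, or None."""
    presented = extract_bearer_token(headers)
    if presented is None:
        return None
    for issued, client in ISSUED_TOKENS.items():
        # compare_digest does a constant-time comparison of the two values.
        if hmac.compare_digest(presented.encode(), issued.encode()):
            return client
    return None
```

A request handler would call `identify_client(request_headers)` first and return a 401 when it gets `None`. That also covers the "know who is requesting what" goal from the original question.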