This project will expand the query language of APIv4 to increase functionality. Implementing this will enhance the capabilities of APIv4-based Smart Groups; without this their utility will be limited to simple queries. This is also a prerequisite to the Search Builder UI project.
Tasks:
Implement Having, Group By parameters.
Enable use of SQL Functions like SUM().
Implement explicit joins.
Implement a way to add subquery expressions to an API call.
Sorry, here's some random thoughts. Hope they're welcome.
What about windowing functions ("the most recent contribution that is a donation"; "the date of their first petition signing activity"; "the last 3 mailings")? Out of scope?
Also, there's EXISTS, NOT EXISTS subqueries - perhaps they're included in the subquery bit? But worth noting that there's a lot of potential performance gains in using these (since once the engine finds one record that matches it can stop the rest of the query; where as with a LEFT JOIN it has to produce the whole result set).
Also, on optimisation, if we're starting from API4 I'd want to flag up that starting with the contacts table and joining other queries/tables is often inefficient - esp in the most-common case that you're not actually querying (m)any fields in the contacts table.
Hey @artfulrobot - we definitely want to support as many functions as possible. I'm working on the structure of that now. I'm creating a SqlFunction set of classes that can be extended. So to make APIv4 allow and understand e.g. the SUM function we add a SqlFunctionSUM class with the methods to parse & validate that function. WIll show it to you as soon as I have it put together.
@colemanw sure, makes sense for functions, I was wondering about the structure of the SQL though. I confess to being behind on my understanding of the internals of api4, but from what I've seen it seems to start with an entity (e.g. a contact) and then join in other stuff. Anyway, I'm going to start reading the apiv4 code to get a clearer idea of what's possible.
I've written lots of docs on api4 to help you get started :)
So for the api in general, you can start with any entity as the base table (just like v3).
For the new Api4-based searches, the plan it so support any base table as well.
Ditto for Api4-based smart groups. This first iteration only supports contact as the base table but that isn't a hard limitation; just needs a little more code to help it retrieve a contact_id from other tables.
Thank you for writing those docs, they're well written; I had read them.
So is a search an entity, or an action? Or are you imagining that we use the 'get' as a way to search?
One query I can't do on API 4 is to do with joins where you need control over the ON clause. e.g. If I want a list of people and their not-on-hold email addresses...
SELECT display_name, emailFROM contactLEFT JOIN email ON contact.id = email.contact_id AND email.on_hold = 0
which is different to:
SELECT displayname, emailFROM contactLEFT JOIN email ON contact.id = email.contactid WHERE email.on_hold = 0
Because the first one will select everyone with or without an email - great, I can see who doesn't have an email address and if I see an email address, it's not on hold; whereas the 2nd one will only select people who have an email that is not on hold, so I don't see the others.
Data:
Wilma - has no email
Barney - has one email, on hold.
Fred - has two emails, one on hold one not.
Betty - has two emails, not on hold.
Result from the first SQL: 5 rows:
Wilma
Barney without email
Fred with their not-on-hold email only
Betty: 2 rows one for each email.
Result from the second SQL: 3 rows:
Fred with their not-on-hold email only
Betty: 2 rows one for each email.
I only seem to be able to do the latter query with API4, something like
This is just the first example that came to mind, but it's common to want to query related entities for data that might not match, without ruling out the owning entity from the results.
(Aside: Very excited about this initiative! I've implemented a similar (though far less flexible) system in a different CRM many years ago, and while it generally worked well, support for HAVING/GROUP BY - especially with nesting - was very buggy, so I'm looking forward to seeing the Civi approach to this.)
@pfigel This is not quite the same, still (you're missing Barney).
You can't do everything in a WHERE clause outside the ON, it's a different logical structure. I've come across this quite a lot - being able to filter on the ON is really important. This is what I meant above about the structure of the SQL.
I'm excited about this, too. I recently started on a new search extension (WIP) - based on what I learnt implementing search in another CRM, because there's so much that's just plain difficult to search for in CiviCRM. I'd happily abandon that extension once this work is complete - but clients needed the tool quickly.