Track Contact `image_URL` files in the `civicrm_file` table
History
Note: CiviCRM is a large, old project with many contributors, which makes "big picture" perspectives cumbersome to gather. Often a contributor simply wants to fix a bug or add a small feature, not dive into decades-long history and contemplate massive refactoring. The story of the `civicrm_contact.image_URL` field is a microcosm of the complicated world of CiviCRM.
- The `civicrm_contact.image_URL` field (added v1.1) predates the `civicrm_file` table (added v1.5), which may explain why it was originally designed as a simple textfield with no file-management. It is a simple varchar and can store the url to any image file on the web.
- Originally, the UI allowed contact image files to be uploaded to a publicly-readable directory on the webserver, and `image_URL` stored an absolute url to that file.
- In 2014, security hardening led to the addition of an `.htaccess` rule which blocked the contents of that directory from public visibility.
- This accidentally broke contact images, leading to a rushed fix which created `CRM_Contact_Page_ImageFile` at `civicrm/contact/imagefile`. This path allows open access to all contact images, but not other files, by querying `civicrm_contact.image_URL` for a matching filename before outputting the contents. An upgrade script rewrote all local `image_URL` fields to point to this path, using an absolute URL. This solution works but is very slow on big databases due to the unindexed query.
- During this rushed fix, it doesn't appear that consideration was given to html escaping of `&` characters. The default behavior of `CRM_Utils_System::url` is to escape `&` to `&amp;` which is (IMHO) a bad default and certainly a poor choice for storing url strings in the database, but it was the default and no one changed it, and then Tim inadvertently cemented it with 06503451 so now in order to safely read the urls you must pass them through the aptly-named `CRM_Utils_String::unstupifyUrl()` function.
- In 2016 there was an attempt to fix the absolute URL and performance issues which updated all the `image_URL` fields to relative paths.
- But this would have broken Drupal Views and other tools that rely on the convenience of querying the field from the db and outputting the url directly (in the future, SearchKit would rely on this too).
- A compromise solution was reached which rewrites the url at runtime on Civi pages. An `image_URL` like `http://wp.demo/civicrm/contact/imagefile/?photo=abc.jpeg` would get rewritten to `http://wp.demo/civicrm/file?reset=1&filename=abc.jpeg&mime-type=image/jpeg`. For reasons not entirely clear, this goes through a different internal path (`civicrm/file` instead of `civicrm/contact/imagefile`).
- In 2019, thinking it was unused, Eileen removed support for passing `filename=` into the `civicrm/file` path, since that endpoint is typically supplied with a file id from the `civicrm_file` table plus a security hash.
- This accidentally broke contact images, leading to a rushed fix which added back the ability to get files by name from the `civicrm/file` path (since contact images are not tracked in `civicrm_file` and don't have an `id`). With the benefit of the history laid out above, a better fix might have been to switch to using the `civicrm/contact/imagefile` path and keep the `civicrm/file` path secure.
- In 2021 I proposed adding an `image_file_id` FK field to the `civicrm_contact` table to track uploaded files. This proposal was met with approval, but when I recently tried to implement it I realized that the `civicrm_file` table already has an FK to `civicrm_contact` and circular references are not allowed.
- In 2023 an option group was added for the previously unused column `civicrm_file.file_type_id`. One possible use for that field would be to designate a file type of `"contact_image"`.
Current Situation
The `civicrm_contact.image_URL` field can still store any url string pointing to a file on or off the server. It could point to any image on the internet, and would work fine. But if it's a file uploaded via the Civi UI, it will be an absolute link pointing to the `civicrm/contact/imagefile` path with a `photo=filename` argument. If CiviCRM recognizes this pattern it will rewrite it on core pages to the other path at `civicrm/file`, otherwise it will leave it alone.
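
The runtime rewrite described above can be sketched roughly as follows. This is an illustration in Python rather than CiviCRM's PHP, and it is not the actual core code: the function name `rewrite_image_url`, the extension-to-MIME map, and the choice to leave unknown extensions alone are all assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

IMAGEFILE_PATH = "/civicrm/contact/imagefile"
FILE_PATH = "/civicrm/file"

# Hypothetical extension map; the real lookup in core may differ.
MIME_BY_EXTENSION = {
    "jpeg": "image/jpeg",
    "jpg": "image/jpeg",
    "png": "image/png",
    "gif": "image/gif",
}


def rewrite_image_url(image_url: str) -> str:
    """Rewrite a local contact image URL to the civicrm/file form.

    Any URL that does not match the imagefile pattern is returned unchanged,
    so off-site images pass straight through.
    """
    parts = urlsplit(image_url)
    path = parts.path.rstrip("/")
    if not path.endswith(IMAGEFILE_PATH):
        return image_url

    photo = parse_qs(parts.query).get("photo")
    if not photo:
        return image_url

    filename = photo[0]
    extension = filename.rsplit(".", 1)[-1].lower()
    mime_type = MIME_BY_EXTENSION.get(extension)
    if mime_type is None:
        # Assumption: leave unrecognized file types alone.
        return image_url

    # Keep any site prefix that precedes the Civi path.
    new_path = path[: -len(IMAGEFILE_PATH)] + FILE_PATH
    query = urlencode(
        {"reset": 1, "filename": filename, "mime-type": mime_type},
        safe="/",
    )
    return urlunsplit((parts.scheme, parts.netloc, new_path, query, ""))
```

Passing `http://wp.demo/civicrm/contact/imagefile/?photo=abc.jpeg` through this sketch yields the `civicrm/file` URL from the history above, while something like `https://example.org/pics/me.png` comes back untouched.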
For the confused, yes contact images are accessible at two paths, and neither is a direct link to the file on disk:
| Path | `civicrm/contact/imagefile` | `civicrm/file` |
|---|---|---|
| Class | `CRM_Contact_Page_ImageFile` | `CRM_Core_Page_File` |
| Permission | none | "access uploaded files" |
| Args | `photo` | `filename`, `mime-type` |
| Uses | Stored in `image_URL` field as absolute URL. Output by Views & SearchKit | `image_URL` rewritten to this path on Civi pages for logged-in users |
This situation leads to the following quirks and problems
- The absolute url to `civicrm/contact/imagefile` works great in Views and SearchKit... as long as the site name never changes! Otherwise, absolute URLs are a pain.
- The url is still stored with html-escaped `&` characters that must be unstupified.
- Anyone can access a contact image via the 1st path if they know the filename, however, the 2nd has a permission check which means logged-in users without "access uploaded files" cannot see contact images even though anonymous users can!
- The security hash usually required by `civicrm/file` can be circumvented if you know the filename and mime-type. But the risk is mitigated by that path requiring "access uploaded files" permission.
- There is still no file-management of contact images. Deleting a contact does not delete their image file. Deleting or changing a contact image also doesn't delete the old one.
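
To make the escaping quirk concrete, here is a small Python sketch of what "unstupifying" a stored url amounts to. It assumes the only damage is `&` having been stored as `&amp;`; the helper name `unescape_stored_url` is made up for this example and is not the signature of `CRM_Utils_String::unstupifyUrl()`.

```python
def unescape_stored_url(stored_url: str) -> str:
    """Undo HTML escaping of ampersands in a url read from the database.

    Only "&amp;" is reversed. A general HTML unescape is deliberately not
    used here, since it would also decode unrelated entities in the url.
    """
    return stored_url.replace("&amp;", "&")
```

A url that was stored without escaping passes through unchanged, so the helper is safe to apply to every value read from `image_URL`.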
Proposal for File Management
- Stop using `civicrm/file` for all contact images and restore the patch to remove support for `filename`.
- Include `cid` as an argument to `civicrm/contact/imagefile` (and update stored paths accordingly) to fix the unindexed query. Also add `is_deleted = 0` to the query.
- Add an option_value `"contact_image"` to the option group for `civicrm_file.file_type_id`.
- When uploading a new contact image, create a record in `civicrm_file` table, and designate it `file_type_id` = `"contact_image"`.
- Also create a record in `civicrm_entity_file` for contact images.
- Add a virtual APIv4 field for the contact entity `image_file_id` which would allow getting/setting the file id. When setting a new file id, regenerate the `image_URL` with a post hook.
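
The `cid` idea in the second bullet turns the filename search into a primary-key read. A minimal Python sketch of that lookup follows, using an in-memory SQLite table as a stand-in for `civicrm_contact`. The stored url format with a `cid` parameter, the helper names, and the sample rows are assumptions for illustration only; a real implementation would live in PHP in `CRM_Contact_Page_ImageFile`.

```python
import sqlite3
from urllib.parse import parse_qs, urlsplit


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in for civicrm_contact with two sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE civicrm_contact ("
        " id INTEGER PRIMARY KEY,"
        " image_URL TEXT,"
        " is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    db.executemany(
        "INSERT INTO civicrm_contact (id, image_URL, is_deleted) VALUES (?, ?, ?)",
        [
            (1, "http://wp.demo/civicrm/contact/imagefile?photo=abc.jpeg&cid=1", 0),
            (2, "http://wp.demo/civicrm/contact/imagefile?photo=old.png&cid=2", 1),
        ],
    )
    return db


def image_is_served(db: sqlite3.Connection, cid: int, photo: str) -> bool:
    """Decide whether the requested photo may be output for this contact.

    The row is fetched by primary key and deleted contacts are excluded,
    instead of scanning every image_URL for a matching filename.
    """
    row = db.execute(
        "SELECT image_URL FROM civicrm_contact WHERE id = ? AND is_deleted = 0",
        (cid,),
    ).fetchone()
    if row is None or row[0] is None:
        return False
    stored_photo = parse_qs(urlsplit(row[0]).query).get("photo")
    return stored_photo == [photo]
```

The filename is still compared against the stored value after the row is found, so knowing a contact id alone is not enough to fetch an arbitrary file.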
Thoughts on Absolute URLs
All of these changes would result in better file management, but they still don't solve the absolute url issue. This is tricky to solve because Views and SearchKit still rely on being able to output the image url directly from a query. Here are a few ideas for that one:
- Bite the bullet and update all `image_URL` fields pointing to a local file to use a relative URL. SearchKit will still work. Views and other SQL-based tools will still work unless embedded on a remote site. Random offsite images will be unaffected.
- Keep `image_URL` absolute but add an APIv4 virtual field like `Contact.image` which calculates the url at runtime. This satisfies more use-cases but at the expense of adding complexity to an already overcomplicated situation.