Task #10306

Improve foreign key declarations in database schema

Added by José Raddaoui Marín about 4 years ago. Updated 4 months ago.

Status:VerifiedStart date:09/14/2016
Priority:MediumDue date:
Assignee:-% Done:

0%

Category:Data model / ORM
Target version:Release 2.6.0
Google Code Legacy ID: Tested version:
Sponsored:No Requires documentation:Yes

Description

Some foreign key constraints in AtoM's database do not include an ON DELETE action, which defaults to RESTRICT the operation in MySQL. This may be useful in some cases (parent/child relation) but it's a limitation in others (delete a term with a SQL query). To work around this limitation, AtoM's ORM updates/deletes the related resources before a deletion and changing the action the action to SET NULL may require code changes in some places, but we should investigate if we can improve some of this constraints.

Foreign keys missing ON DELETE and suggestions (qa/2.6.x - 6fe488a):

  • actor:
    - parent_id (ideally remove, notes below)
  • digital_object:
    - parent_id (CASCADE, notes below)
  • event:
    - actor_id (SET NULL, but keep this logic)
  • function_object:
    - type_id (SET NULL, currently not required in edit form so it can be null already)
    - parent_id (ideally remove, see #13237)
    - description_status_id (SET NULL, not required in edit form)
    - description_detail_id (SET NULL, not required in edit form)
  • information_object:
    - collection_type_id (SET NULL, only from some XML imports, is this used?)
    - repository (SET NULL, we do it manually)
    - parent_id (CASCADE, notes below)
  • note:
    - user_id (SET NULL)
  • physical_object:
    - parent_id (ideally remove, notes below)
  • premis_object
    - information_object_id (CASCADE)
  • relation:
    - subject_id (CASCADE, required in the schema)
    - object_id (CASCADE, required in the schema)
    - type_id (CASCADE, not required at DB level but, can we have relations without type?)
  • taxonomy:
    - parent_id (ideally remove, notes below)
  • term:
    - parent_id (CASCADE, notes below)

About the parent_id:

In some models we don't really that relation and it could be removed alongside the nested set implementation (actor, function_object, physical_object and taxonomy, see #13237). In other models the action is already CASCADE and deleting a top-level will delete all descendants (menu and acl_group).

For the digital_object, using CASCADE will delete the reference and thumbnail DOs when a master is deleted. It will require code changes to clean-up the assets properly.

The remaining two (information_object and term) involve nested set and Elasticsearch updates but they could use CASCADE too. Currently, when an IO/term is deleted, the entire tree is traversed from the bottom to the top using the ORM to delete descendants first and update ES. We recently improved this process for IOs, updating the nested set only at the end (#13211) and we have plans to do it for terms (#13239). Using CASCADE will avoid the ORM and we'll only need to update the nested set (similar to how it's currently done for IOs) and Elasticsearch (adding the ancestor ids to the term documents like we do for IOs and using ES' delete by query API in both models).


Related issues

Related to Access to Memory (AtoM) - Task #13239: Improve deletion of nested set hierarchies (specially terms) Verified 01/13/2020
Related to Access to Memory (AtoM) - Bug #13211: The deletion of archival descriptions with a lot of desce... Verified 11/13/2019
Related to Access to Memory (AtoM) - Task #13237: Reconsider nested set implementation in some models Verified 01/13/2020
Related to Access to Memory (AtoM) - Bug #13350: Reduce the likelihood of issues in upgrades from 1.x to 2... Verified 06/11/2020
Related to Access to Memory (AtoM) - Bug #13353: Attempting to delete Term that is used in existing Relati... New 06/15/2020

History

#2 Updated by José Raddaoui Marín 6 months ago

  • Subject changed from Add "on delete" clause to foreign keys where it's missing to Improve foreign key declarations in database schema
  • Description updated (diff)

#3 Updated by José Raddaoui Marín 6 months ago

  • Related to Task #13239: Improve deletion of nested set hierarchies (specially terms) added

#4 Updated by José Raddaoui Marín 6 months ago

  • Related to Bug #13211: The deletion of archival descriptions with a lot of descendants is prone to web server timeouts added

#5 Updated by José Raddaoui Marín 6 months ago

  • Related to Task #13237: Reconsider nested set implementation in some models added

#6 Updated by José Raddaoui Marín 5 months ago

  • Status changed from New to Code Review
  • Assignee set to Steve Breker
  • Target version set to Release 2.6.0

Addressed in https://github.com/artefactual/atom/pull/1095 and ready for code review.

Current changes included in that PR in relation to this issue:

Add missing on delete actions to:
- actor:
  - parent_id: CASCADE
- digital_object:
  - parent_id: CASCADE
- event:
  - actor_id: SET NULL
- function_object:
  - type_id: SET NULL
  - description_status_id: SET NULL
  - description_detail_id: SET NULL
- information_object:
  - collection_type_id: SET NULL
  - repository: SET NULL
  - parent_id: CASCADE
- note:
  - user_id: SET NULL
- premis_object
  - information_object_id: CASCADE
- relation:
  - subject_id: CASCADE
  - object_id: CASCADE
  - type_id: CASCADE
- taxonomy:
  - parent_id: CASCADE
- term:
  - parent_id: CASCADE

Changes:
- Modify schema.
- Build SQL.
- Build models.
- Add migration:
  - Normalize update indexes and foreign keys functions in QubitMigrate.
  - Use them from migrations: 177, 181 and 182.
  - Catch exception updating foreign key to reformat message with
    more information about the failed operation.
- Bump version.

#7 Updated by José Raddaoui Marín 5 months ago

  • Status changed from Code Review to QA/Review
  • Assignee changed from Steve Breker to Dan Gillean

Merged in qa/2.6.x.

For testing, I'd suggest to test the upgrade (with as many databases as possible) and the install processes. Also, some CRUD operations in the entities mentioned in the previous update.

#8 Updated by José Raddaoui Marín 5 months ago

  • Requires documentation set to Yes

Marked as requires documentation to modify the upgrade docs with comments about this integrity improvements and why they may cause errors in the upgrade process.

#9 Updated by José Raddaoui Marín 5 months ago

Hi Dan, just thinking that you'll probably face the same issue I did upgrading the demo data. If you get the following error ...

Could not alter foreign key for 'subject_id' column on 'relation' table.  

SQLSTATE[23000]: Integrity constraint violation: 1216 Cannot add or update a child row: a foreign key constraint fails

Restore the backup and run the following SQL query before re-trying the upgrade:

DELETE relation FROM relation LEFT JOIN object ON subject_id=object.id WHERE object.id IS NULL;

#10 Updated by José Raddaoui Marín 5 months ago

In the end we already have a comment about the criticality of the upgrade task and links to the forum and FAQ pages, so I've only added a couple of notes to run the most demanding tasks without memory limit:

https://github.com/artefactual/atom-docs/pull/148

We still need to consider if we want to extend that comment for the possible issues of this upgrade or improve the migration process to avoid errors like the one from the previous update. After chatting with Dan about this, I'm more in favor of the later.

#11 Updated by Dan Gillean 5 months ago

Once I manually fixed the demo data DB as described above in note-9, AtoM seems to be working as expected, and deletions are working throughout the application. Waiting to resolve this issue until we add the attempts to address known issues in the schema migrations.

#12 Updated by José Raddaoui Marín 4 months ago

  • Status changed from QA/Review to Code Review
  • Assignee changed from Dan Gillean to Steve Breker

#13 Updated by José Raddaoui Marín 4 months ago

  • Status changed from Code Review to QA/Review
  • Assignee changed from Steve Breker to José Raddaoui Marín

With the PR above the issues like the one described in update 9 should no longer happen. I'll test the upgrade process locally with other databases before passing this for final QA/Review.

#14 Updated by Dan Gillean 4 months ago

Great! I successfully ran my old vagrant script with the old sqldump - meaning that a ~2.3 database with the demo data is currently upgrading successfully. I'll wait for your additional tests and let you mark the ticket as verified when you're ready, but so far this is looking good!

#15 Updated by José Raddaoui Marín 4 months ago

  • Status changed from QA/Review to Verified
  • Assignee deleted (José Raddaoui Marín)

#17 Updated by José Raddaoui Marín 4 months ago

Tested with 2.x databases and all went well (in fact, the latest issue fixed above was only present in the demo data DB). However, I've found a few issues upgrading a 1.x database (reported as a new issue in #13350).

#18 Updated by José Raddaoui Marín 4 months ago

  • Related to Bug #13350: Reduce the likelihood of issues in upgrades from 1.x to 2.6.x added

#19 Updated by José Raddaoui Marín 4 months ago

  • Related to Bug #13353: Attempting to delete Term that is used in existing Relationships causes 500 server error and prematurely removes term from Elasticsearch added

Also available in: Atom PDF