31dom/14.md
Mauricio Dinarte ba253739b1 Rename files
2023-08-04 10:43:39 -06:00

187 lines
No EOL
16 KiB
Markdown

# Introduction to paragraphs migrations in Drupal
Today we will present an introduction to **paragraphs** migrations in Drupal. The example consists of migrating paragraphs of one type, then connecting the migrated paragraphs to nodes. A separate image migration is included to demonstrate how they are different. At the end, we will talk about behavior that deletes paragraphs when the host entity is deleted. Let's get started.
## Getting the code
You can get the full code example at <https://github.com/dinarcon/ud_migrations> The module to enable is `UD paragraphs migration introduction` whose machine name is `ud_migrations_paragraph_intro`. It comes with three migrations: `ud_migrations_paragraph_intro_paragraph`, `ud_migrations_paragraph_intro_image`, and `ud_migrations_paragraph_intro_node`. One content type, one paragraph type, and four fields will be created when the module is installed.
The `ud_migrations_paragraph_intro` *only defines the migrations*. But the destination content type, fields, and paragraphs type need to be created as well. That is the job of the `ud_migrations_paragraph_config` module which is a dependency of `ud_migrations_paragraph_intro`. That reason to create the configuration in a separate module is because later articles in the series will make use of the same configuration. So, any example that depends on those content type, fields, and paragraph type can set a dependency on the `ud_migrations_paragraph_config` to make them available.
*Note*: Configuration placed in a module's `config/install` directory will be copied to Drupal's active configuration. And if those files have a `dependencies/enforced/module` key, the configuration will be removed when the listed modules are uninstalled. That is how the content type, the paragraph type, and the fields are automatically created and deleted.
You can get the [Paragraph module](https://www.drupal.org/project/paragraphs) using composer: `composer require drupal/paragraphs`. This will also download its dependency: the [Entity Reference Revisions module](https://www.drupal.org/project/entity_reference_revisions). If your Drupal site is not composer-based, you can get the code for both modules manually.
## Understanding the example set up
The example code creates one *paragraph type* named UD book paragraph (`ud_book_paragraph`). It has two "Text (plain)" *fields*: Title (`field_ud_book_paragraph_title`) and Author (`field_ud_book_paragraph_author`). A new UD Paragraphs (`ud_paragraphs`) content type is also created. This has two *fields*: Image (`field_ud_image`) and Favorite book (`field_ud_favorite_book`) containing references to *images* and *book paragraphs* imported in separate migrations. The words in parenthesis represent the *machine names* of the different elements.
## The paragraph migration
Migrating into a *paragraph type* is very similar to migrating into a *content type*. You specify the *source*, *process* the fields making any required transformation, and set the *destination* entity and bundle. The following code snippet shows the *source*, *process*, and destination *sections*:
```yaml
source:
plugin: embedded_data
data_rows:
- book_id: 'B10'
book_title: 'The definite guide to Drupal 7'
book_author: 'Benjamin Melançon et al.'
- book_id: 'B20'
book_title: 'Understanding Drupal Views'
book_author: 'Carlos Dinarte'
- book_id: 'B30'
book_title: 'Understanding Drupal Migrations'
book_author: 'Mauricio Dinarte'
ids:
book_id:
type: string
process:
field_ud_book_paragraph_title: book_title
field_ud_book_paragraph_author: book_author
destination:
plugin: 'entity_reference_revisions:paragraph'
default_bundle: ud_book_paragraph
```
The most important part of a paragraph migration is setting the *destination plugin* to `entity_reference_revisions:paragraph`. This plugin is actually provided by the Entity Reference Revisions module. It is very important to note that paragraphs entities are **revisioned**. This means that when you want to create a reference to them, you need to provide **two IDs**: `target_id` and `target_revision_id`. Regular entity reference fields like files, images, and taxonomy terms only require the `target_id`. This will be further explained with the node migration.
The other configuration that you can optionally set in the *destination* section is `default_bundle`. The value will be the *machine name* of the *paragraph type* you are migrating into. You can do this when all the paragraphs for a particular migration definition file will be of the same type. If that is not the case, you can leave out the `default_bundle` configuration and add a mapping for the `type` *entity property* in the process section.
You can execute the paragraph migration with this command: `drush migrate:import
ud_migrations_paragraph_intro_paragraph`. After running the migration, there is not much you can do to *verify that it worked*. Contrary to other entities, there is no user interface, available out of the box, that lists all paragraphs in the system. One way to verify if the migration worked is to manually create a [View](https://understanddrupal.com/articles/what-view-drupal-how-do-they-work) that shows paragraphs. Another way is to query the database directly. You can inspect the tables that store the paragraph fields' data. In this example, the tables would be:
- `paragraph__field_ud_book_paragraph_author` for the current author.
- `paragraph__field_ud_book_paragraph_title` for the current title.
- `paragraph_r__8c3a9563ac` for all the author revisions.
- `paragraph_r__3fa7e9863a` for all the title revisions.
Each of those tables contains information about the *bundle* (paragraph type), the *entity id*, the *revision id*, and the migrated *field value*. Table names are derived from the *machine names* of the fields. If they are too long, the field name will be hashed to produce a shorter table name. Having to query the database is not ideal. Unfortunately, the options available to check if a paragraph migration worked are limited at the moment.
## The node migration
The node migration will serve as the *host* for both referenced entities: *images* and *paragraphs*. The image migration is very similar to the one explained in a [previous article](https://understanddrupal.com/articles/migrating-files-and-images-drupal). This time, the focus will be the *paragraph* migration. Both of them are set as dependencies of the node migration, so they need to be executed in advance. The following snippet shows how the *source*, *destinations*, and *dependencies* are set:
```yaml
source:
plugin: embedded_data
data_rows:
- unique_id: 1
name: 'Michele Metts'
photo_file: 'P01'
book_ref: 'B10'
- unique_id: 2
name: 'Benjamin Melançon'
photo_file: 'P02'
book_ref: 'B20'
- unique_id: 3
name: 'Stefan Freudenberg'
photo_file: 'P03'
book_ref: 'B30'
ids:
unique_id:
type: integer
destination:
plugin: 'entity:node'
default_bundle: ud_paragraphs
migration_dependencies:
required:
- ud_migrations_paragraph_intro_image
- ud_migrations_paragraph_intro_paragraph
optional: []
```
Note that `photo_file` and `book_ref` both contain the *unique identifier* of records in the *image* and *paragraph* migrations, respectively. These can be used with the `migration_lookup` plugin to map the reference fields in the nodes to be migrated. `ud_paragraphs` is the *machine name* of the target content type.
The mapping of the image reference field follows the same pattern than the one explained in the article on [migration dependencies](https://understanddrupal.com/articles/introduction-migration-dependencies-drupal). Using the [migration_lookup](https://api.drupal.org/api/drupal/core%21modules%21migrate%21src%21Plugin%21migrate%21process%21MigrationLookup.php/class/MigrationLookup) plugin, you indicate which is the migration that should be searched for the images. You also specify which source column contains the *unique identifiers* that match those in the *image* migration. This operation will return **a single value**: the *file ID* (`fid`) of the image. This value can be assigned to the `target_id` [subfield](https://understanddrupal.com/articles/migrating-data-drupal-subfields) of `field_ud_image` to establish the relationship. The following code snippet shows how to do it:
```yaml
field_ud_image/target_id:
plugin: migration_lookup
migration: ud_migrations_paragraph_intro_image
source: photo_file
```
## Paragraph field mappings
Before diving into the paragraph field mapping, let's think about what needs to be done. Paragraphs are **revisioned entities**. To make a reference to them, you need **two IDs**: their *entity id* and their *entity revision id*. These two values need to be assigned to two [subfields](https://understanddrupal.com/articles/migrating-data-drupal-subfields) of the paragraph reference field: `target_id` and `target_revision_id` respectively. You have to come up with a process pipeline that complies with this requirement. There are many ways to do it, and the specifics will depend on your field configuration. In this example, the paragraph reference field allows an unlimited number of paragraphs to be associated, but only of one type: `ud_book_paragraph`. Another thing to note is that even though the field allows you to add as many paragraphs as you want, *the example migrates exactly one paragraph*.
With those considerations in mind, the mapping of the paragraph field will be a two step process. First, use the `migration_lookup` plugin to get a reference to the paragraph. Second, use the fetched **values** to set the paragraph reference subfields. The following code snippet shows how to do it:
```yaml
pseudo_mbe_book_paragraph:
plugin: migration_lookup
migration: ud_migrations_paragraph_intro_paragraph
source: book_ref
field_ud_favorite_book:
plugin: sub_process
source:
- '@pseudo_mbe_book_paragraph'
process:
target_id: '0'
target_revision_id: '1'
```
The first step is a normal `migration_lookup` procedure. The important difference is that instead of getting a single value, like with images, the paragraph lookup operation will return **an array of two values**. The format is like `[3, 7]` where the `3` represents the *entity id* and the `7` represents the entity *revision id* of the paragraph. Note that **the array keys are not named**. To access those values, you would use the **index** of the elements starting with zero (**0**). This will be important later. The returned array is stored in the `pseudo_mbe_book_paragraph` [pseudofield](https://understanddrupal.com/articles/using-constants-and-pseudofields-data-placeholders-drupal-migration-process-pipeline).
The second step is to set the `target_id` and `target_revision_id` subfields. In this example, `field_ud_favorite_book` is the *machine name* paragraph reference field. Remember that it is configured to accept an arbitrary number of paragraphs, and each will require passing an array of two elements. This means you need to *process an array of arrays*. To do that, you use the [sub_process](https://api.drupal.org/api/drupal/core%21modules%21migrate%21src%21Plugin%21migrate%21process%21SubProcess.php/class/SubProcess) plugin to iterate over *an array of paragraph references*. In this example, the structure to iterate over would be like this:
```
[
[3, 7]
]
```
Let's dissect how to do the mapping of the paragraph reference field. The `source` configuration of the `sub_process` plugin contains *an array of paragraph references*. In the example, that array has a single element: the `'@pseudo_mbe_book_paragraph'` pseudofield. The *quotes* (**'**) and *at sign* (**@**) are required to reuse an element that appears before in the process pipeline. Then, in the `process` configuration, you set the subfields for the paragraph reference field. It is worth noting that at this point you are iterating over a list of paragraph references, even if that list contains only one element. If you had more than one paragraph to migrate, whatever you defined in `process` will apply to all of them.
The `process` configuration is an array of subfield mappings. The left side of the assignment is the **name of the subfield** you want to set. The right side of the assignment is **an array index** of the paragraph reference being processed. Remember that this array *does not have named-keys* so, you use their *numerical index* to refer to them. The example sets the `target_id` subfield to the element in the `0` index and the `target_revision_id` subfield to the element in the one `1` index. Using the example data, this would be `target_id: 3` and `target_revision_id: 7`. *The quotes around the numerical indexes are important*. If not used, the migration will not find the indexes and the paragraphs will not be associated. The end result of this operation will be something like this:
```
'field_ud_favorite_book' => array (1) [
array (2) [
'target_id' => string (1) "3"
'target_revision_id' => string (1) "7"
]
]
```
There are three ways to run the migrations: manually, [executing dependencies](https://understanddrupal.com/articles/introduction-migration-dependencies-drupal), and [using tags](https://understanddrupal.com/articles/introduction-migration-dependencies-drupal). The following code snippet shows the three options:
```console
# 1) Manually.
$ drush migrate:import ud_migrations_paragraph_intro_image
$ drush migrate:import ud_migrations_paragraph_intro_paragraph
$ drush migrate:import ud_migrations_paragraph_intro_node
# 2) Executing depenpencies.
$ drush migrate:import ud_migrations_paragraph_intro_node --execute-dependencies
# 3) Using tags.
$ drush migrate:import --tag='UD Paragraphs Intro'
```
And that is *one way* to map paragraph reference fields. In the end, all you have to do is set the `target_id` and `target_revision_id` subfields. The process pipeline that gets you to that point can vary depending on how your paragraphs are configured. The following is a non-exhaustive list of things to consider when migrating paragraphs:
- How many paragraphs types can be referenced?
- How many paragraphs instances are being migrated? Is this a multivalue field?
- Do paragraphs have translations?
- Do paragraphs have revisions?
## Do migrated paragraphs disappear upon node rollback?
Paragraphs migrations are affected by a particular behavior of *revisioned entities*. If the host entity is deleted, and the paragraphs do not have translations, the whole paragraph gets deleted. That means that deleting a node will make the referenced paragraphs' data to be removed. How does this affect your [migration workflow](https://understanddrupal.com/articles/tips-writing-drupal-migrations-and-understanding-their-workflow)? If the migration of the host entity is rollback, then the paragraphs will be removed, the migrate API will not know about it. In this example, if you run a migrate status command after rolling back the *node* migration, you will see that the *paragraph* migration indicated that there are no pending elements to process. The *file* migration for the images will report the same, but in that case, the images will remain on the system.
In any migration project, it is common that you do rollback operations to test new field mappings or fix errors. Thus, chances are very high that you will stumble upon this behavior. Thanks to [Damien McKenna](https://www.drupal.org/u/damienmckenna) for helping me understand this behavior and tracking it to the [rollback](https://git.drupalcode.org/project/entity_reference_revisions/blob/8.x-1.x/src/Plugin/migrate/destination/EntityReferenceRevisions.php#L150) method of the [EntityReferenceRevisions](https://git.drupalcode.org/project/entity_reference_revisions/blob/8.x-1.x/src/Plugin/migrate/destination/EntityReferenceRevisions.php) destination plugin. So, what do you do to recover the deleted paragraphs? You have to rollback *both* migrations: *node* and *paragraph*. And then, you have to import the two again. The following snippet shows how to do it:
```console
# 1) Rollback both migrations.
$ drush migrate:rollback ud_migrations_paragraph_intro_node
$ drush migrate:rollback ud_migrations_paragraph_intro_paragraph
# 2) Import both migrations againg.
$ drush migrate:import ud_migrations_paragraph_intro_paragraph
$ drush migrate:import ud_migrations_paragraph_intro_node
```
What did you learn in today's blog post? Have you migrated paragraphs before? If so, what challenges have you found? Did you know paragraph reference fields require two subfields to be set? Did you that deleting the host entity also deletes referenced paragraphs? Please share your answers in the comments. Also, I would be grateful if you shared this blog post with others.