# Migrating taxonomy terms and multivalue fields into Drupal
Today we continue the conversation about migration dependencies with a **hierarchical taxonomy terms** example. Along the way, we will present the process and syntax for migrating into **multivalue fields**. The example consists of two separate migrations: one to import taxonomy terms while accounting for term hierarchy, and another to import nodes that write to a multivalue taxonomy term field. Following this approach, any node and taxonomy term created by the migration process will be removed from the system upon rollback.
## Getting the code
You can get the full code example at https://github.com/dinarcon/ud_migrations. The module to enable is `UD multivalue taxonomy terms`, whose machine name is `ud_migrations_multivalue_terms`. The two migrations to execute are `udm_dependencies_multivalue_term` and `udm_dependencies_multivalue_node`. Notice that both migrations belong to the same module.
The example assumes Drupal was installed using the `standard` installation profile. In particular, it relies on a Tags (`tags`) taxonomy vocabulary, an Article (`article`) content type, and a Tags (`field_tags`) field that accepts multiple values. The words in parentheses represent the machine name of each element.
## Migrating taxonomy terms and their hierarchy
The example data for the taxonomy terms migration is fruits and fruit varieties. Each row contains the name and description of the fruit. Additionally, it is possible to define a parent term to establish hierarchy. For example, "Red grape" is a child of "Grape". Note that no _numerical identifier_ is provided. Instead, the value of the `fruit_name` column is used as a `string` _identifier_ for the migration. If term names could change over time, it is recommended to have another column that does not change (e.g., an autoincrementing number). The following snippet shows how the _source_ section is configured:
```yaml
source:
  plugin: embedded_data
  data_rows:
    - fruit_name: "Grape"
      fruit_description: "Eat fresh or prepare some jelly."
    - fruit_name: "Red grape"
      fruit_description: "Sweet grape"
      fruit_parent: "Grape"
    - fruit_name: "Pear"
      fruit_description: "Eat fresh or prepare a jam."
  ids:
    fruit_name:
      type: string
```
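If term names are expected to change, a stable identifier column can serve as the migration id instead. The following is a minimal sketch of that variation, not part of the example module. The `fruit_id` and `fruit_parent_id` column names are hypothetical, and the parent reference would have to hold the parent's id rather than its name so that the look up operation still works:

```yaml
source:
  plugin: embedded_data
  data_rows:
    # fruit_id is a hypothetical, stable identifier that never changes.
    - fruit_id: 1
      fruit_name: "Grape"
      fruit_description: "Eat fresh or prepare some jelly."
    - fruit_id: 2
      fruit_name: "Red grape"
      fruit_description: "Sweet grape"
      # Hypothetical column: refers to the parent by id, not by name.
      fruit_parent_id: 1
  ids:
    fruit_id:
      type: integer
```

With this setup, any migration that looks up these terms would need to pass the numeric id, not the term name, as the look up value.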
The destination is quite short. The target entity is set to _taxonomy terms_. Additionally, you indicate which _vocabulary_ to migrate into. If you have terms that would be stored in different vocabularies, you can use the `vid` property in the process section to assign the target vocabulary. If you write to a single one, the `default_bundle` key in the destination can be used instead. The following snippet shows how the _destination_ section is configured:
```yaml
destination:
  plugin: "entity:taxonomy_term"
  default_bundle: tags
```
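For terms that belong to different vocabularies, the `vid` property can be mapped in the process section instead of relying on `default_bundle`. The following is a minimal sketch under the assumption that each source row carries a column with the vocabulary machine name. The `fruit_vocabulary` column is hypothetical and does not exist in the example module:

```yaml
destination:
  plugin: "entity:taxonomy_term"
process:
  # Hypothetical source column holding the vocabulary machine name,
  # for example "tags". It replaces default_bundle in the destination.
  vid: fruit_vocabulary
  name: fruit_name
  description: fruit_description
```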
For the _process_ section, three entity properties are set: _name_, _description_, and _parent_. The first two are strings copied directly from the source. In the case of `parent`, it is an _entity reference_ to another _taxonomy term_. It stores the **taxonomy term id** (`tid`) of the _parent_ term. To assign its value, the `migration_lookup` plugin is configured similar to the example in the previous chapter. The difference is that, in this case, the migration to reference is the same one being defined. This sets an important consideration. _Parent terms should be migrated before their children_. This way, they can be found by the look up operation. Also note that the look up value is the term name itself, because that is what this migration set as the _unique identifier_ in the _source_ section. The following snippet shows how the _process_ section is configured:
```yaml
process:
  name: fruit_name
  description: fruit_description
  parent:
    plugin: migration_lookup
    migration: udm_dependencies_multivalue_term
    source: fruit_parent
```
_Technical note_: The _taxonomy term_ entity contains other properties you can write to. For a list of available options check the `baseFieldDefinitions` method of the `Term` class defining the entity. Note that more properties can be available up in the class hierarchy.
## Migrating multivalue taxonomy terms fields
The next step is to create a _node_ migration that can write to a _multivalue taxonomy term field_. To stay on point, only one more field will be set: the _title_, which is required by the _node_ entity.[^9-change] The following snippet shows how the _source_ section is configured for the _node_ migration:
[^9-change]: Read this change record for more information on how the Migrate API processes Entity API validation: <https://www.drupal.org/node/3073707>
```yaml
source:
  plugin: embedded_data
  data_rows:
    - unique_id: 1
      thoughtful_title: "Amazing recipe"
      fruit_list: "Green apple, Banana, Pear"
    - unique_id: 2
      thoughtful_title: "Fruit-less recipe"
  ids:
    unique_id:
      type: integer
```
The `fruit_list` column contains a comma-separated list of taxonomy terms to apply. Note that the values match the _unique identifiers_ of the _taxonomy term migration_. If you had used numbers as migration identifiers there, you would have to use those numbers in this migration to refer to the terms. An example of that was presented in the previous chapter. Also note that there is one record that has no terms associated. This will be considered during the field mapping. The following snippet shows how the _process_ section is configured for the _node_ migration:
```yaml
process:
  title: thoughtful_title
  field_tags:
    - plugin: skip_on_empty
      source: fruit_list
      method: process
      message: "Row does not contain fruit_list."
    - plugin: explode
      delimiter: ","
    - plugin: callback
      callable: trim
    - plugin: migration_lookup
      migration: udm_dependencies_multivalue_term
      no_stub: true
```
The _title_ of the _node_ is a verbatim copy of the `thoughtful_title` column. The _Tags_ field, mapped using its machine name `field_tags`, uses four chained process plugins. The `skip_on_empty` plugin reads the value of the `fruit_list` column and skips the processing of this field if no value is provided. This is done to accommodate the fact that some records in the _source_ do not specify tags. Note that the `method` configuration key is set to `process`. This indicates that only this field should be skipped and not the entire record. Ultimately, tags are optional in this context and _nodes_ should still be imported even if _no tag is associated_.
The [explode](https://api.drupal.org/api/drupal/core%21modules%21migrate%21src%21Plugin%21migrate%21process%21Explode.php/class/Explode) plugin allows you to break a string into an _array_, using a `delimiter` to determine where to make the cut. Next, the `callback` plugin uses the `trim` PHP function to remove any whitespace from the start or end of each exploded taxonomy term name. Finally, this _array_ is passed to the `migration_lookup` plugin, specifying the term migration as the one to use for the look up operation. Again, the taxonomy term names are used here because they are the _unique identifiers_ of the _term migration_. The `no_stub` configuration should be set to `true` to prevent terms from being created if they are not found by the plugin. This would not occur in the example because we make sure a match is found. If we did not set this configuration and did not include the trim step, some new terms would be created with spaces at the beginning. Note that none of the last three plugins has a `source` configuration. This is because when process plugins are chained, the result of one plugin is sent as source to be transformed by the next one in line. The end result is an _array_ of **taxonomy term ids** that will be assigned to `field_tags`. The `migration_lookup` plugin is able to process _single values_ and _arrays_.
The last part of the migration is specifying the _destination_ section and any _dependencies_. The following snippet shows how both are configured for the node migration:
```yaml
destination:
  plugin: "entity:node"
  default_bundle: article
migration_dependencies:
  required:
    - udm_dependencies_multivalue_term
  optional: []
```
## More syntactic sugar
One way to set multivalue fields in Drupal migrations is to assign an _array_ as their value. Another option is to set each value manually using **field deltas**. _Deltas_ are integer numbers starting with zero (**0**) and incrementing by one (**1**) for each element of a multivalue field. Although you could set any delta in the Migrate API, consider the field definition in Drupal. It is possible that a limit has been set on the number of values the field can hold. You can specify _deltas_ and _subfields_ at the same time. The full syntax is `field_name/field_delta/subfield`. The following example shows the syntax for a multivalue image field:
```yaml
process:
  field_photos/0/target_id: source_fid_first
  field_photos/0/alt: source_alt_first
  field_photos/1/target_id: source_fid_second
  field_photos/1/alt: source_alt_second
  field_photos/2/target_id: source_fid_third
  field_photos/2/alt: source_alt_third
```
Manually setting a multivalue field is less flexible and more error-prone. In today's example, we showed how to accommodate the list of terms not being provided. Imagine having to do that for each _delta_ and _subfield_ combination. Still, the functionality is there in case you need it. In the end, Drupal offers more _syntactic sugar_ so you can write shorter field mappings. Additionally, there are various process plugins that can handle _arrays_ for setting multivalue fields.
_Note_: There are other ways to migrate multivalue fields. For example, when using the [entity_generate](https://git.drupalcode.org/project/migrate_plus/blob/HEAD/src/Plugin/migrate/process/EntityGenerate.php) plugin provided by Migrate Plus, there is no need to create a separate taxonomy term migration. This plugin is able to create the terms on the fly while running the import process. The caveat is that terms created this way are not deleted upon rollback.
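As a rough illustration of that alternative, the `field_tags` mapping could replace the look up step with `entity_generate`. This is a sketch only, assuming the Migrate Plus module is enabled and that the configuration keys shown match the installed version of the plugin. It is not part of the example module:

```yaml
process:
  title: thoughtful_title
  field_tags:
    - plugin: skip_on_empty
      source: fruit_list
      method: process
    - plugin: explode
      delimiter: ","
    - plugin: callback
      callable: trim
    # Assumption: keys as documented by Migrate Plus. The plugin looks for
    # a term by name in the given vocabulary and creates it if missing.
    - plugin: entity_generate
      entity_type: taxonomy_term
      value_key: name
      bundle_key: vid
      bundle: tags
```

With this approach the `migration_dependencies` entry for the term migration would no longer be needed, at the cost of losing rollback for the generated terms.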