# Migrating data into Drupal subfields
In the previous chapter, we learned how to use process plugins to transform data between source and destination. Some Drupal fields have multiple components. For example, formatted text fields store the text to display and the text format to apply. Image fields store a reference to the file, alternative and title text, width, and height. The Migrate API refers to a field's component as a **subfield**. In this chapter we will learn how to migrate into them and how to find out which subfields are available.

Today's example will consist of migrating data into the `Body` and `Image` fields of the `Article` content type, which is provided by the `standard` installation profile. As in previous examples, we will create a new module and write a migration plugin. The code snippets will be compact to focus on particular elements of the migration. The full code is available at <https://www.drupal.org/project/migrate_examples>. The module name is `Migration Subfields Example` and its machine name is `subfields_example`. This example uses the [Migrate Files](https://www.drupal.org/project/migrate_file) module (explained later). Make sure to download and enable it. Otherwise, you will get an error like: `In DiscoveryTrait.php line 53: The "file_import" plugin does not exist. Valid plugin IDs for Drupal\migrate\Plugin\MigratePluginManager are: ...`. Let's see part of the *source* definition:

```yaml
source:
  plugin: embedded_data
  data_rows:
    -
      unique_id: 1
      name: Micky Metts
      profile: <a href="https://www.drupal.org/u/freescholar" title="Micky on Drupal.org">freescholar</a> on Drupal.org
      photo_url: https://udrupal.com/photos/freescholar.jpg
      photo_description: Photo of Micky Metts
      photo_width: 587
      photo_height: 657
```
Only one record is presented to keep the snippet short, but more exist. In addition to having a unique identifier, each record includes a name, a short profile, and details about the image.
## Migrating formatted text
The `Body` field is of type `Text (formatted, long, with summary)`. This type of field has three components: the text *value* to present, a *summary* text, and the text *format* to use. The Migrate API allows you to write to each component separately by defining subfield targets.

```yaml
process:
  field_text_with_summary/value: source_value
  field_text_with_summary/summary: source_summary
  field_text_with_summary/format: source_format
```
The syntax to migrate into subfields is the machine name of the field and the subfield name separated by a *slash* (**/**). Then, a *colon* (**:**), a *space*, and the *value* to assign. You can set the value to a source field name for a verbatim copy or use any combination of process plugins in a chain. It is not required to migrate into all subfields. Each field determines what components are required so it is possible that not all subfields are set. In this example, only the value and text format will be set.

```yaml
process:
  body/value: profile
  body/format:
    plugin: default_value
    default_value: restricted_html
```
The `value` subfield is set to the `profile` source field. As you can see in the first snippet, it contains HTML markup: an `a` tag, to be precise. Because we want the tag to be rendered as a link, a text format that allows such a tag needs to be specified. There is no information about text formats in the source, but the `standard` installation of Drupal comes with a couple we can choose from. In this case, we use the `Restricted HTML` text format. The `default_value` plugin is used and set to `restricted_html`. When setting text formats, it is necessary to use their machine name. You can find them in the configuration page for each text format. For `Restricted HTML` that is `/admin/config/content/formats/manage/restricted_html`.
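
As mentioned earlier, a subfield target can also receive a chain of process plugins instead of a verbatim copy. The following is only a sketch, not part of the example module: it assumes you want to trim whitespace from `profile` using the core `callback` plugin and fall back to a placeholder text when the result is empty. The fallback string is made up for illustration.

```yaml
process:
  body/value:
    # Step 1: trim leading and trailing whitespace from the source value.
    - plugin: callback
      callable: trim
      source: profile
    # Step 2: use a placeholder when the trimmed value is empty.
    # The text below is a hypothetical fallback, not part of the example.
    - plugin: default_value
      default_value: 'No profile available.'
  body/format:
    plugin: default_value
    default_value: restricted_html
```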

*Note*: Text formats are a whole different subject that even has security implications. To stay on topic, we will only give some recommendations. When you need to migrate HTML markup, you need to know which tags appear in your source and which ones you want to allow in Drupal. Then, select a text format that accepts what you have allowed and filters out any dangerous tags like `script`. As a general rule, you should avoid setting the `format` subfield to use the `Full HTML` text format.
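
If your source did include a hint about the kind of markup in each record, the `format` subfield could be derived from it rather than hardcoded. The sketch below is hypothetical: it assumes a source column named `profile_format` holding the values `html` or `plain`, which does not exist in today's example, and maps them to text format machine names with the core `static_map` plugin.

```yaml
process:
  body/value: profile
  body/format:
    plugin: static_map
    # Hypothetical source column; not present in the example source.
    source: profile_format
    map:
      html: restricted_html
      plain: plain_text
    # Unknown values fall back to the most restrictive option.
    default_value: plain_text
```

Falling back to `plain_text` keeps unexpected values on the safe side, in line with the recommendation to avoid overly permissive formats.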
## Migrating images
There are [different approaches to migrating images](https://www.drupal.org/docs/8/api/migrate-api/migrate-destination-plugins-examples/migrating-files-and-images). In this example we use the Migrate Files module. It is important to note that Drupal treats images as files with extra properties and behavior. Any approach used to migrate files can be adapted to migrate images.

```yaml
process:
  field_image/target_id:
    plugin: file_import
    source: photo_url
    file_exists: rename
    id_only: TRUE
  field_image/alt: photo_description
  field_image/title: photo_description
  field_image/width: photo_width
  field_image/height: photo_height
```
When migrating any field, you have to use its *machine name* in the mapping section. For the `Image` field, the machine name is `field_image`. Knowing that, you set each of its subfields:
* `target_id` stores an integer number which Drupal uses as a reference to the file.
* `alt` stores a string that represents the alternative text. Always set one for better accessibility.
* `title` stores a string that represents the title attribute.
* `width` stores an integer number which represents the width in pixels.
* `height` stores an integer number which represents the height in pixels.

For the `target_id`, the plugin `file_import` is used. This plugin requires a `source` configuration value with a URL to the file. In this case, the `photo_url` field from the *source* section is used. The `file_exists` configuration dictates what to do in case a file with the same name already exists. Valid options are `replace` to replace the existing file, `use existing` to reuse the file, and `rename` to append `_N` to the file name (where `N` is an incrementing number) until the filename is unique. When working on migrations, it is common to run them over and over until you get the expected results. Using the `use existing` option will avoid downloading multiple copies of the image file. The `id_only` flag is set so that the plugin only returns the file identifier used by Drupal instead of an entity reference array. This is done because each subfield is being set manually. For the rest of the subfields (`alt`, `title`, `width`, and `height`) the value is a verbatim copy from the *source*.
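
While iterating on a migration, the mapping could be switched to the `use existing` behavior described above. This is a minimal sketch that mirrors the earlier snippet and changes only `file_exists`. Quoting the value is a stylistic choice made here, and the other keys are assumed to stay as in the example.

```yaml
process:
  field_image/target_id:
    plugin: file_import
    source: photo_url
    # Reuse a file that is already present instead of saving a renamed copy.
    file_exists: 'use existing'
    id_only: TRUE
```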

!!! use existing vs replace

*Note*: The Migrate Files module offers another plugin named `image_import`. That one allows you to set all the subfields as part of the plugin configuration. An example of its use will be shown in the chapter !!!. This example uses the `file_import` plugin to emphasize the configuration of the image subfields.
## Which subfields are available?
Some fields have many subfields. [Address fields](https://www.drupal.org/project/address), for example, have 14 subfields. How can you know which ones are available? You can look for an !!!online reference or search for the information yourself by reviewing Drupal's source code. The subfields are defined in the class that provides the field type. Once you find the class, look for the `schema` method. The subfields are contained in the `columns` array of the value returned by that method. Let's see some examples:

* The `Text (plain)` field is provided by the `StringItem` class.
* The `Number (integer)` field is provided by the `IntegerItem` class.
* The `Text (formatted, long, with summary)` field is provided by the `TextWithSummaryItem` class.
* The `Image` field is provided by the `ImageItem` class.

The `schema` method defines the database columns used by the field to store its data. When migrating into subfields, processed data will ultimately be written into those database columns. Any restriction set by the database schema needs to be respected. That is why you do not use units when migrating width and height for images. The database only expects an integer number representing the corresponding values in pixels. Because of object-oriented practices, sometimes you need to look at the parent class to know all the subfields that are available.
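
To illustrate the point about units, suppose the source stored dimensions as strings like `587px`. That is not the case in today's example, so the source columns `photo_width_raw` and `photo_height_raw` below are hypothetical. One possible way to reduce such values to integers is the core `callback` plugin with PHP's `intval` function:

```yaml
process:
  field_image/width:
    plugin: callback
    # intval() reads the leading digits, so '587px' becomes 587.
    callable: intval
    # Hypothetical source column holding a value with units.
    source: photo_width_raw
  field_image/height:
    plugin: callback
    callable: intval
    # Hypothetical source column holding a value with units.
    source: photo_height_raw
```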

*Technical note*: By default, the Migrate API bypasses [Form API](https://api.drupal.org/api/drupal/elements/8.8.x) validations. For example, it is possible to migrate images without setting the `alt` subfield even if it is marked as required in the field's configuration. If you try to edit a node that was created this way, you will get a field error indicating that the alternative text is required. Similarly, it is possible to write the `title` subfield even when the field is not expecting it, just like in today's example. If you were to enable the `title` text later, the information would be there already. For content migrations, you can enable validation by setting the `validate` configuration in the destination plugin:
```yaml
destination:
  plugin: entity:node
  validate: true
```
Another option is to connect to the database and check the table structures. For example, the `Image` field stores its data in the `node__field_image` table. Among others, this table has five columns named after the field's machine name and the subfield:
* field_image_target_id
* field_image_alt
* field_image_title
* field_image_width
* field_image_height
Looking at the source code or the database schema is arguably not straightforward. This information is included for reference for those who want to explore the Migrate API in more detail. You can also look for migration examples to see what subfields are available.

*Tip*: Many field types are provided by classes whose names end with the string `Item`. You can use your IDE's search feature to find the class using the name of the field as a hint. Those classes would live in the `src/Plugin/Field/FieldType` folder of the module.
## Default subfields
Every Drupal field has at least one subfield. For example, `Text (plain)` and `Number (integer)` define only the `value` subfield. The following code snippets are equivalent:
```yaml
process:
  field_string/value: source_value_string
  field_integer/value: source_value_integer
```
```yaml
process:
  field_string: source_value_string
  field_integer: source_value_integer
```
In previous chapters no subfield has been manually set, but Drupal knows what to do. The Migrate API offers syntactic sugar to write shorter migration plugins. This is another example. You can safely skip the default subfield and manually set the others as needed. For `File` and `Image` fields, the default subfield is `target_id`. How does the Migrate API know what subfield is the default? You need to check the code again.

The default subfield is determined by the return value of the `mainPropertyName` method of the class providing the field type. Again, object-oriented practices might require looking at parent classes to find this method. The `Image` field is provided by `ImageItem`, which extends `FileItem`, which itself extends `EntityReferenceItem`. It is the latter that contains the `mainPropertyName` method, which returns the string `target_id`.
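
Applied to today's example, this means the file reference could be written without naming the subfield. The sketch below is an assumption-based variant of the earlier mapping, not code from the example module. It sets only the default subfield and leaves `alt`, `title`, `width`, and `height` unset.

```yaml
process:
  # Shorthand: no subfield is named, so the value goes to the default
  # subfield, which is target_id for Image fields.
  field_image:
    plugin: file_import
    source: photo_url
    id_only: TRUE
```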