How deduplication works in Zapier
Zapier automatically deduplicates incoming trigger data for your integration, so that Zaps do not run multiple times on the same data. Consider the following requirements for your “New Item” and “Updated Item” triggers to work as users expect.
- Provide a unique primary key.
  - By default, the field with the key id is used as the primary key.
  - Alternatively, if you’re using the CLI, you can choose other fields as the primary key by enabling primary in outputFields. See more information in the CLI docs.
- Sort reverse-chronologically by time created.
The API endpoint must list new or updated items in an array sorted in reverse chronological order.
Polling usually returns many results, most of which Zapier has seen before. Since a Zap should not trigger multiple times when an item in your app exists in multiple distinct polls, the data must be deduplicated.
For example, say your endpoint for new items returns a list of tasks:
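As a hypothetical illustration, such a response could look like the following JavaScript array. Only the id field matters for deduplication; the title and createdAt fields and all values are assumptions made up for this sketch.

```javascript
// Hypothetical response from a "new tasks" endpoint, newest task first.
// Field names other than `id` are illustrative assumptions.
const tasks = [
  { id: 7, title: "Write release notes", createdAt: "2024-05-02T09:30:00Z" },
  { id: 6, title: "Review pull request", createdAt: "2024-05-01T14:00:00Z" },
];
```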
The following assumes you’re using the default primary key, the id field. When a Zap is first turned on, Zapier makes an initial call to your API to retrieve existing data and stores each item’s id value in our database. When the Zap is turned off, that list is cleared.
Active Zaps then poll at an interval (based on the customer’s plan), compare the ids to all those seen before, trigger on new items, and update the list of seen ids.
Now let’s say the user created a new task, which received the id 8.
On the next poll after the new task is created, your API returns all tasks, but only the first task, with id equal to 8, will be seen as a new item. That particular JSON object will then trigger the user’s Zap and complete any subsequent action steps the user has defined. The essence of deduplication is that the other ids in the poll, 6 and 7, will be ignored, since they have been seen in previous polls.
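The comparison described above can be simulated in a few lines. This is only a sketch of the idea, not Zapier’s actual implementation: createDeduper is a hypothetical helper, and the seen ids live in an in-memory Set rather than a database.

```javascript
// Minimal simulation of id-based deduplication across polls.
// `createDeduper` is a hypothetical helper, not part of Zapier's platform.
function createDeduper() {
  const seenIds = new Set();
  return function poll(items) {
    const fresh = [];
    for (const item of items) {
      if (!seenIds.has(item.id)) {
        seenIds.add(item.id); // remember the id for later polls
        fresh.push(item);
      }
    }
    return fresh;
  };
}

const poll = createDeduper();

// Initial call when the Zap is turned on: the existing ids are recorded.
poll([{ id: 7 }, { id: 6 }]);

// Next poll after the new task is created: only id 8 has not been seen.
const newItems = poll([{ id: 8 }, { id: 7 }, { id: 6 }]);
console.log(newItems.map((item) => item.id)); // prints [ 8 ]
```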
In order for deduplication to work, the id field should always be supplied and unique amongst all items in the result.
Your API must return results in reverse-chronological order to make sure new/updated items can be found on the first page of results, as Zapier polling triggers don’t automatically fetch additional pages. If your API lists items in a different order by default but allows for sorting, include an order or sorting field in your polling request to ensure the newest records are returned on the first page of results. GitHub is a great example of an API that supports sorting on multiple fields in the asc or desc direction.
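For instance, GitHub’s endpoint for listing repository issues accepts sort and direction query parameters. The sketch below builds such a polling URL; buildPollingUrl is a hypothetical helper, and the octocat/hello-world repository is only a placeholder.

```javascript
// Build a polling URL that asks the API for the newest items first.
// `buildPollingUrl` is a hypothetical helper; owner and repo are placeholders.
function buildPollingUrl(owner, repo) {
  const url = new URL(`https://api.github.com/repos/${owner}/${repo}/issues`);
  url.searchParams.set("sort", "created"); // order by creation time
  url.searchParams.set("direction", "desc"); // newest first
  return url.toString();
}

const pollingUrl = buildPollingUrl("octocat", "hello-world");
```

Your own API will likely use different parameter names, so check its documentation for the equivalent sorting options.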
Re-ordering returning items
If your API cannot order its results in reverse-chronological order, you can use Code Mode to make additional requests, if necessary.
One possible scenario could be:
- Fetch the first page, containing the oldest items, but also the total number of pages.
- Fetch the last page and reverse the order of the items before returning results in an array.
When adding additional requests to your custom code, all requests and processing code for a trigger must finish within 30 seconds. It is not recommended to attempt to fetch all pages of results.
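The scenario above could be sketched as follows. The fetchPage function is a hypothetical placeholder for your own request code, and the response shape (an items array plus a totalPages count) is an assumption; adapt both to what your API actually returns.

```javascript
// Sketch of the two-request approach for an API that only lists oldest-first.
// `fetchPage` is a hypothetical function you supply; it is assumed to resolve
// to an object shaped like { items: [...], totalPages: <number> }.
async function fetchNewestFirst(fetchPage) {
  // Page 1 holds the oldest items, but also reports how many pages exist.
  const firstPage = await fetchPage(1);
  // Reuse page 1 if it is the only page; otherwise fetch the last page.
  const lastPage =
    firstPage.totalPages > 1 ? await fetchPage(firstPage.totalPages) : firstPage;
  // Reverse a copy so the newest item comes first in the returned array.
  return [...lastPage.items].reverse();
}
```

This keeps the trigger to at most two requests per poll, which helps it stay within the 30-second limit.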
If the items your API returns do not have an id field, or you’re adding an Updated Item trigger, you will use Code Mode to modify the API response.
Custom primary keys
In older, legacy Zapier Web Builder apps, Zapier guessed what fields made a unique key if id wasn’t present. It is now required that you define one or more fields as the primary key.
By default, Zapier uses the field with the key id as the primary key if no field in outputFields has primary: true. The id field is required in this case; otherwise, your trigger will fail with an error at runtime. If your API’s items have a differently-named unique field, adapt this code snippet to ensure this test passes:
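A minimal sketch of such a transformation, assuming the unique field is called taskKey (a hypothetical name; substitute your API’s own field). In Code Mode you would apply it to the parsed API response before returning the results.

```javascript
// Copy a differently named unique field into `id` so Zapier can deduplicate on it.
// `addIdField` and the `taskKey` field name are hypothetical.
function addIdField(items, uniqueField = "taskKey") {
  return items.map((item) => ({ ...item, id: item[uniqueField] }));
}

const results = addIdField([
  { taskKey: "a1", title: "First task" },
  { taskKey: "b2", title: "Second task" },
]);
```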
Alternatively, if you’re using the CLI, you can use non-id fields as the primary key. See “How does deduplication work?” in the CLI docs for example code.
Updated Item triggers
When triggering on updated items, you’ll want to define the id field to be used for deduplication. Assuming your task API has an endpoint that can return tasks sorted by updatedAt in descending order, here’s example code that combines the updatedAt value with the id, so that Zapier recognizes a new update as a new item. The code assumes that you have configured options with the appropriate API request URL and parameters:
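The sketch below leaves out the request itself and shows only the transformation you would apply to the parsed response. The helper name markUpdatesAsNew, the originalId field name, and the sample task are assumptions made for illustration.

```javascript
// Make every update of a task look like a new item to Zapier's deduplication.
// `markUpdatesAsNew` and `originalId` are hypothetical names.
function markUpdatesAsNew(items) {
  return items.map((item) => ({
    ...item,
    originalId: item.id, // keep the API's own id for searches and associations
    id: `${item.id}-${item.updatedAt}`, // changes whenever updatedAt changes
  }));
}

const updated = markUpdatesAsNew([
  { id: 8, updatedAt: "2024-05-03T10:00:00Z", title: "Ship it" },
]);
```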
Notice how the code preserves the original id value before setting id to a new combined value that is unique for every update of a task. This is useful to ensure that the original ID can still be used for other purposes, such as performing a search or associating records together.
Alternatively, if you’re using the CLI, you can set primary: true for the id and updatedAt fields. See “How does deduplication work?” in the CLI docs for example code.
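As a rough sketch of what that could look like, the fragment below marks both fields as primary in outputFields. The labels and the extra title field are assumptions, and the rest of the trigger definition is omitted; treat the CLI docs as the authoritative reference for the exact shape.

```javascript
// Fragment of a CLI trigger definition; other keys such as `perform` are omitted.
// Labels and the `title` field are illustrative assumptions.
const operation = {
  outputFields: [
    { key: "id", label: "ID", primary: true },
    { key: "updatedAt", label: "Updated At", primary: true },
    { key: "title", label: "Title" },
  ],
};

// The fields flagged as primary are the ones used for deduplication.
const primaryKeys = operation.outputFields
  .filter((field) => field.primary)
  .map((field) => field.key);
```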