Trigger data deduplication

Zapier automatically deduplicates incoming trigger data for you, so that Zaps do not run multiple times on the same data, but you must be mindful of how it works. The important bits are outlined below!

Deduplication tl;dr:

  • Provide a unique id key.
  • Sort reverse-chronologically by time created.

An unfortunate artifact of polling for new data is that polling usually returns many results, most of which Zapier has seen before. Since we don’t want to trigger an action multiple times when an item in your API exists in multiple distinct polls, we must deduplicate the data.

For example, say your endpoint returns a list of to-dos:

[
  {
    "id": 7,
    "created": "Mon, 25 Jun 2012 16:41:54 -0400",
    "list_id": 1,
    "description": "integrate our api with zapier",
    "complete": false
  },
  {
    "id": 6,
    "created": "Mon, 25 Jun 2012 16:41:45 -0400",
    "list_id": 1,
    "description": "get published in zapier library",
    "complete": false
  }
]

When a Zap is turned on, we make an initial call to your API to retrieve existing data, and cache and store each id field in our database.

After this, we will poll at an interval (based on a customer’s plan) looking for changes against this cached list of ids.

Now let’s say the user created a new to-do:

[
  {
    "id": 8,
    "created": "Mon, 25 Jun 2012 16:42:09 -0400",
    "list_id": 1,
    "description": "re-do our api to support webhooks",
    "complete": false
  },
  {
    "id": 7,
    "created": "Mon, 25 Jun 2012 16:41:54 -0400",
    "list_id": 1,
    "description": "integrate our api with zapier",
    "complete": false
  },
  {
    "id": 6,
    "created": "Mon, 25 Jun 2012 16:41:45 -0400",
    "list_id": 1,
    "description": "get published in zapier library",
    "complete": false
  }
]

On the next poll after the new to-do is created, the API returns all to-dos, but only the first to-do with id equal to 8 will be seen as a new item. That particular JSON object will then be passed through to the action in the user’s Zap. The others will be ignored, since we already have seen their ids. That’s the essence of deduplication.

In order for this mechanism to work, the id field should always be supplied and unique among all items in the result.

Your API must also be able to return results in reverse-chronological order to make sure new/updated items can be found on the first page of results, as Zapier polling triggers don’t automatically fetch additional pages. Wufoo is a great example of an API that supports sorting on any field and direction.

Custom or multiple ID fields

What if the items your API returns do not have an id field? Or how would you go about adding a Updated to-do trigger to our to-do app? In both cases you would use Code Mode to modify the API response.

Let’s assume your to-do API has an endpoint that can return to-dos sorted by updatedAt in descending direction. Here’s an example of how you could add code to your trigger to incorporate the updatedAt value in the ID, so that Zapier recognizes a new update as a new item. Assuming that you have configured options with the appropriate API request URL and parameters:

return z.request(options)
  .then((response) => {
    response.throwForStatus();
    const results = response.json;
    results.forEach(function(result) {
      result.originalId = result.id;
      result.id = result.id + '-' + result.updatedAt;
    });
    return results;
  });

Notice how the code preserves the original id value before setting id to a new combined value that is unique for every update of a to-do. This is useful to ensure that the original ID can still be used for other purposes, such as performing a search or associating records together.

Re-order items

What if your API cannot order its results in reverse-chronological order? How would you fetch the last page if you don’t know how many pages there might be? You can also use Code Mode to make additional requests, if necessary.

One possible scenario could be:

  1. Fetch the first page, containing the oldest items, but also the total number of pages.
  2. Fetch the last page and reverse the order of the items before returning it.

You should be cautious when adding additional requests to your custom code. All your requests and processing code for a trigger must finish within 30 seconds. It’s not recommended to attempt to fetch all pages of results.


Need help? Tell us about your problem and we’ll connect you with the right resource or contact support.