— AWS, Serverless, Data, Architecture — 5 min read
Amazon AppFlow is a serverless AWS integration service launched in April 2020. It simplifies transferring data from major SaaS products into AWS. In this post, I will discuss how we’ve been using AppFlow with Salesforce in a client project, walk through the usage patterns we adopted, and quickly touch on some lessons we learned along the way.
photo credit: apple.com/au/wallet
Our client is in the retail business. Last year they launched a mobile solution that brings customer membership cards onto Apple Wallet and Android Wallet Pass. As a user, you can download your membership pass onto your phone and use it the next time you go shopping. No more plastic cards in your wallet! Here is the solution overview:
figure-1: membership wallet solution overview
While this solution lays the groundwork, it merely shows your name and a barcode on your phone. Everything is static after you install the pass. But what about your total spending? How much more before you get an upgrade? Do you have any promotions available, or expiring? As a customer, you would expect a bit more, wouldn’t you?
Well, the overall solution feels straightforward: since Apple and Google both support pushing data changes to wallet passes, all we need to do is grab the data and send it to Apple/Google. Easy-peasy!
However, just like your typical enterprise IT, our client manages their customer data primarily in Salesforce, with some Salesforce apps, some custom objects and fields, and a couple of other SaaS products. Complex data models aside, there is more to handle: data validation, data aggregation and business rules, not to mention solution availability, scalability and extensibility. It would be challenging for the existing solution to pull this off.
figure-2: complex data models and business rules
What we built is a serverless event-driven data solution sitting in between Salesforce and the current solution. It is part of the Data-as-a-Service (DaaS) platform we’ve been building for this client. This solution uses Amazon AppFlow to get data from multiple objects in Salesforce, aggregates and stores the data internally while applying a set of business rules, then sends the final data as events to the back-end of the existing solution.
figure-3: a DaaS solution leveraging Amazon AppFlow
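As a taste of that last hop, here is a minimal sketch of publishing an aggregated record as a custom event with boto3; the event bus name, source, and detail-type below are placeholders, not our actual values.

```python
import json

import boto3

events = boto3.client("events")

def publish_membership_update(member: dict) -> None:
    """Send one aggregated membership record to the consuming back-end's event bus."""
    events.put_events(
        Entries=[{
            "EventBusName": "membership-daas-bus",  # placeholder bus name
            "Source": "daas.membership",            # placeholder event source
            "DetailType": "MembershipUpdated",      # placeholder detail type
            "Detail": json.dumps(member),
        }]
    )

publish_membership_update({"memberId": "M-0001", "tier": "Gold", "totalSpend": 1250.0})
```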
While other services such as Amazon DMS can also help transfer Salesforce data, we chose Amazon AppFlow: overall, we found it easy and inexpensive to explore data sources and experiment with ideas during the R&D phase.
Having said that, AppFlow is still a new addition to the AWS service family. Its documentation is the bare minimum and lacks concrete examples, and even a search on Google returns limited results. Inevitably, we had to get over the learning curve, and we found ourselves down the rabbit hole several times. But fear not! That is why I’m here: learning, sharing, and helping.
figure-4: googling "AppFlow CloudFormation" returns six pages
Up next: the technical nitty-gritty!
When setting up an AppFlow flow, you need to define three key settings: the trigger type, the source and destination, and the field mappings. These settings define your AppFlow usage patterns, so take your time to explore all the possibilities. In our solution, we use the following three patterns.
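To make those settings concrete, here is a rough sketch of a flow definition through the AppFlow API via boto3; the connection profile, bucket, object, and field names are placeholders, and the same settings are exposed in the console and CloudFormation.

```python
import boto3

appflow = boto3.client("appflow")

# Hypothetical flow pulling a few Contact fields from Salesforce into S3.
appflow.create_flow(
    flowName="salesforce-contact-to-s3",        # placeholder name
    triggerConfig={"triggerType": "OnDemand"},  # or "Scheduled" / "Event"
    sourceFlowConfig={
        "connectorType": "Salesforce",
        "connectorProfileName": "my-salesforce-connection",  # placeholder profile
        "sourceConnectorProperties": {"Salesforce": {"object": "Contact"}},
    },
    destinationFlowConfigList=[{
        "connectorType": "S3",
        "destinationConnectorProperties": {
            "S3": {"bucketName": "my-daas-raw-bucket", "bucketPrefix": "contacts"}
        },
    }],
    tasks=[
        # Project the source fields to pull...
        {
            "taskType": "Filter",
            "sourceFields": ["Id", "Email", "LastModifiedDate"],
            "connectorOperator": {"Salesforce": "PROJECTION"},
        },
        # ...then map each projected field to the destination (one task per field).
        {
            "taskType": "Map",
            "sourceFields": ["Id"],
            "destinationField": "Id",
            "connectorOperator": {"Salesforce": "NO_OP"},
        },
    ],
)
```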
figure-5: on-demand AppFlow settings
AppFlow configured this way needs to be executed manually. Each execution fetches a complete data set from the source. We found this pattern helpful, for example, for exploring data sources during R&D and for creating one-off data baselines.
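Kicking off an on-demand run is then a single API call; a minimal boto3 sketch, reusing the hypothetical flow from the definition above:

```python
import boto3

appflow = boto3.client("appflow")

# Each on-demand run pulls the complete data set from the source.
response = appflow.start_flow(flowName="salesforce-contact-to-s3")
print(response["flowArn"], response["executionId"])
```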
figure-6: scheduled AppFlow settings
With the scheduled setup, the flow runs periodically, and AppFlow selectively transfers data objects that are a) new, b) deleted, or c) modified.
AppFlow checks an object’s modification timestamp to decide whether or not to transfer it. For Salesforce objects, the system field LastModifiedDate is normally used.
Note: AppFlow fetches ALL the data fields defined in the mapping table, including fields that have not changed since the last run. This is a significant difference from the next pattern.
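In terms of configuration, Pattern 2 is just a different triggerConfig in the flow definition sketched earlier; the schedule below is illustrative, and dataPullMode is what switches the flow to incremental transfers.

```python
from datetime import datetime, timedelta, timezone

# Swap this into the create_flow sketch above to get Pattern 2.
scheduled_trigger = {
    "triggerType": "Scheduled",
    "triggerProperties": {
        "Scheduled": {
            "scheduleExpression": "rate(5minutes)",  # AppFlow's rate syntax
            "dataPullMode": "Incremental",           # "Complete" re-pulls everything
            # Start the first run shortly after the flow is activated.
            "scheduleStartTime": datetime.now(timezone.utc) + timedelta(minutes=10),
        }
    },
}
```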
figure-7: event-triggered AppFlow settings
AppFlow with this setup can send the data as Amazon EventBridge events, while Patterns 1 and 2 can only upload data as line-delimited JSON objects to S3 buckets.
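As a sketch, Pattern 3 changes three pieces of the earlier flow definition: the trigger becomes Event, the Salesforce source object becomes a change event, and the destination becomes EventBridge. The profile and partner event source names below are placeholders.

```python
# Trigger the flow from Salesforce Change Data Capture events.
event_trigger = {"triggerType": "Event"}

# The source object is a Salesforce change event rather than a plain object.
salesforce_cdc_source = {
    "connectorType": "Salesforce",
    "connectorProfileName": "my-salesforce-connection",  # placeholder profile
    "sourceConnectorProperties": {"Salesforce": {"object": "ContactChangeEvent"}},
}

# Records are delivered through an AppFlow partner event source on EventBridge.
eventbridge_destination = {
    "connectorType": "EventBridge",
    "destinationConnectorProperties": {
        "EventBridge": {"object": "my-appflow-partner-event-source"}  # placeholder
    },
}
```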
During the proof-of-concept phase, we thought Pattern 3 would be ideal given the event-driven nature of our solution. But we soon ran into Pattern 3’s limitation: it only brings in incremental changes, meaning the output contains only the data fields that have changed.
In hindsight, a combination of Pattern 1 and Pattern 3 could have been an alternative solution: a data baseline from Pattern 1, with the subsequent incremental changes from Pattern 3 merged on top, as sketched below. But at the time, we felt Pattern 2 suited our needs and required less engineering effort, so it became the workhorse of our solution.
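To illustrate the idea, here is a toy sketch of the merge such a combination would need: start from a baseline record loaded by Pattern 1, then overlay each partial Pattern 3 event on top. The field names are illustrative.

```python
def apply_incremental_change(baseline: dict, change: dict) -> dict:
    """Overlay only the fields present in a change event onto the baseline record."""
    merged = dict(baseline)
    merged.update(change)
    return merged

# Baseline loaded once via an on-demand flow...
baseline = {"Id": "003XX0000001", "Email": "jo@example.com", "TotalSpend__c": 1200.0}
# ...and a Pattern 3 event carrying only the fields that changed.
change = {"Id": "003XX0000001", "TotalSpend__c": 1250.0}

print(apply_incremental_change(baseline, change))
# {'Id': '003XX0000001', 'Email': 'jo@example.com', 'TotalSpend__c': 1250.0}
```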
To keep this post short, I can only give you some rough ideas here. Each of these topics could be a separate blog post with sample code.
After getting over the initial learning curve, our engineering team has gained insight into Amazon AppFlow’s strengths and limitations. We now feel confident making decisions on why, when, and how to use AppFlow, and engineers on the team can proficiently apply it in their solutions to solve business problems.
While writing this blog, I noticed some subtle yet helpful enhancements to the AppFlow creation process in the AWS Console. How exciting it is to be one of the trailblazers on this AppFlow journey!
figure-8: high-level solution architecture
This blog was originally posted at https://cevo.com.au/post/amazon-appflow-with-salesforce-in-action/