Retrieving Data Extension records
In this article we will go over some basic data manipulation concepts for Salesforce Marketing Cloud. Starting with how to correctly retrieve data extension records. If you are new on SFMC and you need to understand what data extensions are, please check this article.
I'm going to split data manipulation scenarios in two: data retrieval for email personalization, and bulk/batch processes for data processing (integrations, segmentation, etc.) In all cases we will use SSJS script samples. As I mentioned here this blog is intended for full stack developers, and so for i will assume a preference for JavaScript over AMPscript.
First scenario: use data for email personalization. In this case you will fetch focused pieces of data and basically do replacements, if/conditional statements and basic loops of data to generate content. Under no circumstance you will retrieve hundreds or thousands of records and iterate those, as that could impact dramatically in the email deployment process performance.
The second scenario applies for different types of batch processes where a large set of records are retrieved for some kind of purpose. For example: an example use case could be iterating through a data extension that contains contact data, parsing and transforming some specific fields, and impact a final master contact Data Extension. Initially our process needs to retrieve all records from a data extension. A first attempt can look like this:
<script runat=server>
Platform.Load("core","1");
function ProcessRow(row) {
var subskey = row["SubsKey"];
var customField = row["CustomField"];
//do process
}
var audienceDE = DataExtension.Init("my-audience");
var rows = audienceDE.Rows.Retrieve();
for(var i in rows) {
ProcessRow(Rows[i]);
}
</script>
That is a quite simple way to retrieve records. But Rows.Retrieve function has the limitation that it will return only 2500 rows tops. If we are working in a batch process like the described scenario, this can be a blocker. Link to documentation:
The alternative to avoid that problem is to use the Salesforce API to retrieve all data from the data extension no matter the record count. That can be achieved by following this documentation:
Our sample code in that case will look something like this:
<script runat=server>
Platform.Load("core","1");
function ProcessRow(row) {
var subskey = row.Properties[0].Value;
var customField = row.Properties[1].Value;
//do process
}
var prox = new Script.Util.WSProxy(),
objectType = "DataExtensionObject[my-audience]",
cols = ["SubsKey", "CustomField"],
moreData = true,
reqID = null;
while(moreData) {
moreData = false;
var data = reqID == null ? prox.retrieve(objectType, cols)
: prox.getNextBatch(objectType, reqID);
if(data != null) {
moreData = data.HasMoreRows;
reqID = data.RequestID;
if(data && data.Results) {
for(var i=0; i< data.Results.length; i++) {
ProcessRow(data.Results[i]);
}
}
}
}
</script>
That is a significant amount of code to basically select all records from a data extension for processing. If you agree with that statement you will find this library of interest. Your code will look like this in that case:
<script runat=server>
Platform.Load("core","1");
Platform.Function.ContentBlockByKey('data-extension-utils');
function ProcessRow(row) {
var subskey = row.SubsKey;
var customField = row.CustomField;
//do process
}
var cols = ["SubsKey", "CustomField"];
QueryDataExtension("my-audience", cols, ProcessRow);
</script>
That is much cleaner, also you will get the advantage of accessing directly object attributes instead of key value arrays that are returned by Retrieve functions from WSProxy or Row objects.