Migrating Salesforce Legacy Attachments to Salesforce Files Using StarfishETL

If you are using Salesforce.com, you've probably made the transition to the new Files functionality, which is built on ContentDocument objects. The old attachments functionality is often re-enabled so that historical documents remain available. This approach is not streamlined, however, because users have to look in two places for documents. Ideally, the old attachments should be migrated into the new format. For this, we recommend a StarfishETL solution that transforms the data from the legacy Attachment object into the newer ContentDocument format.

First, extract the attachment records from the legacy Attachment object.

A simple query pulls the necessary information:

example query
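The original query is not reproduced here, so the following is only a sketch of what a header-level query could look like. The field list is an assumption (standard Attachment fields chosen for illustration), and `build_header_query` is a hypothetical helper, not part of StarfishETL. The Body field is deliberately left out, since the file content is fetched separately per record.

```python
# Assumed header-level fields on the legacy Attachment object.
# Body (the file content) is intentionally excluded from this list.
HEADER_FIELDS = [
    "Id",
    "Name",
    "ParentId",
    "ContentType",
    "Description",
    "OwnerId",
    "CreatedDate",
]


def build_header_query(fields=HEADER_FIELDS):
    """Build a SOQL query string that reads attachment metadata only."""
    if "Body" in fields:
        raise ValueError("Body should be fetched per record, not in the bulk query")
    return "SELECT " + ", ".join(fields) + " FROM Attachment"


if __name__ == "__main__":
    print(build_header_query())
```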

Next, create two data processing stages in StarfishETL, one for each of the following objects:

ContentVersion object

This object stores the version of the ContentDocument that is being uploaded. Notice that you don’t need to explicitly upload to the ContentDocument object. Salesforce creates it automatically when you upload to the ContentVersion object.

To optimize for performance, do NOT read the Body field (i.e. the attachment file) for all records in one large query. Instead, read all the other fields (except Body) and then, for each record, run a single retrieve query to pull the attachment file.

The reason is that when a query retrieves all attachment rows together with Body, the API attempts to pull every attachment file in one go, which severely slows down the migration. So we first get the header-level information for each attachment and then, while processing each attachment record, pull just that attachment's file and push it to the ContentVersion object.
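The header-first pattern described above can be sketched outside StarfishETL as a lazy loop. This is only an illustration: `iter_attachments` and `fake_fetch` are hypothetical names, and the in-memory `FAKE_STORE` stands in for the per-record retrieve query against Salesforce.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def iter_attachments(
    headers: Iterable[Dict[str, str]],
    fetch_body: Callable[[str], bytes],
) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (header, body) pairs, fetching one file at a time.

    Because this is a generator, only one attachment body is requested
    per iteration instead of all bodies being pulled up front.
    """
    for header in headers:
        yield header, fetch_body(header["Id"])


# Stand-in for the per-record retrieve query (made-up Ids and contents).
FAKE_STORE = {
    "00P000000000001": b"first file",
    "00P000000000002": b"second file",
}
calls = []


def fake_fetch(attachment_id: str) -> bytes:
    calls.append(attachment_id)  # record each individual fetch
    return FAKE_STORE[attachment_id]


headers = [
    {"Id": "00P000000000001", "Name": "a.txt"},
    {"Id": "00P000000000002", "Name": "b.txt"},
]

if __name__ == "__main__":
    for header, body in iter_attachments(headers, fake_fetch):
        print(header["Name"], len(body), "bytes")
```

Passing the fetch function in as a parameter keeps the loop independent of how the file is actually retrieved, so the same structure works with any client.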

Here is how to map the fields from Attachment to ContentVersion using StarfishETL:

Notice the function field on the “VersionData” mapping, which runs a query to retrieve the attachment file. MC_Attachment_ID__c is a custom field that stores the Id of the original attachment, so the new file can be traced back to the old attachment record for cross-checking.
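As a rough illustration of such a mapping, the sketch below builds a ContentVersion payload from an attachment header and its file bytes. Only VersionData and MC_Attachment_ID__c come from the text above. Mapping the attachment Name to Title and PathOnClient is an assumption, as is base64-encoding the file for a JSON-style payload, and `to_content_version` is a hypothetical helper rather than a StarfishETL function.

```python
import base64
from typing import Dict


def to_content_version(attachment: Dict[str, str], body: bytes) -> Dict[str, str]:
    """Map one legacy Attachment record to a ContentVersion payload."""
    return {
        # Assumed mapping: reuse the attachment file name for both fields.
        "Title": attachment["Name"],
        "PathOnClient": attachment["Name"],
        # File content, base64-encoded so it can travel in a text payload.
        "VersionData": base64.b64encode(body).decode("ascii"),
        # Custom field from the article: Id of the original attachment.
        "MC_Attachment_ID__c": attachment["Id"],
    }


if __name__ == "__main__":
    sample = {"Id": "00P000000000001", "Name": "contract.pdf"}  # made-up record
    print(to_content_version(sample, b"hello"))
```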

Once the attachment has been uploaded to this object, the next step is to handle its associations to various records in Salesforce.

ContentDocumentLink object

This object links an uploaded file to its related record. The link is what makes the file visible when browsing the linked account, contact, etc.

To implement this, we need to map to the ContentDocumentLink object as indicated below:

ContentDocumentId: The ContentDocumentId retrieved from the previous job
LinkedEntityId: The Id of the record this attachment should be related to
ShareType: Hardcode to “I”
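The three mappings above can be sketched as follows. The sketch assumes that the ContentVersion rows from the previous job expose ContentDocumentId alongside MC_Attachment_ID__c, and that the old attachment's ParentId is the record to link to. Both are assumptions for illustration, `build_links` is a hypothetical helper, and all Ids are made up.

```python
from typing import Dict, List


def build_links(
    content_versions: List[Dict[str, str]],
    attachments: List[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Build ContentDocumentLink payloads by joining on the original attachment Id."""
    parent_by_attachment = {a["Id"]: a["ParentId"] for a in attachments}
    links = []
    for cv in content_versions:
        parent_id = parent_by_attachment.get(cv["MC_Attachment_ID__c"])
        if parent_id is None:
            continue  # no matching legacy attachment; skip rather than guess
        links.append(
            {
                "ContentDocumentId": cv["ContentDocumentId"],
                "LinkedEntityId": parent_id,
                "ShareType": "I",  # hardcoded, as described above
            }
        )
    return links


if __name__ == "__main__":
    versions = [
        {"ContentDocumentId": "069000000000001", "MC_Attachment_ID__c": "00P000000000001"},
        {"ContentDocumentId": "069000000000002", "MC_Attachment_ID__c": "00P000000000099"},
    ]
    legacy = [{"Id": "00P000000000001", "ParentId": "001000000000001"}]
    print(build_links(versions, legacy))
```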

If the stages are properly set up and executed in StarfishETL, you should see the legacy attachments moved over to the new documents area and visible as files when browsing the appropriate records in Salesforce.
