There are often circumstances where adding client-provided sales data or other data to data that you already have will augment marketing efforts. However, due to security concerns, platform restrictions, or resource availability, it’s easy to find yourself in a scenario where a simpler approach, like a database extract saved as a CSV that’s uploaded regularly to an SFTP, is your best option to access that data. This is where this article’s focus lies. I will walk through a tried and true, reliable solution that only takes a minor amount of setup.
The 100-ft View
For the purposes of this article, we’ll use an example where the client is going to provide a database extraction of their sales data in CSV form to the designated SFTP location every day at an agreed upon time. Knowing what the file will contain, when it will be updated, and where to go and get it, we will set up two recurring tasks that will fire batch scripts to take it from there. The computer running the tasks will be permanently connected to our internal network drive where the historic sales files already live, so it has access to the appropriate destination where the current sales information is planned to live.
The first task will download the CSV from the secure SFTP, which we are expecting to receive from the client at a particular time every day in a folder we named “latest.” The script will then copy the file beside the other historic files on our internal network drive and move the CSV file located on the SFTP from the “latest” folder into a folder we named “archived.” This will allow us to refer back to the file at a later time if we wish to review the data the client has been sending us. It also gives us a visual indicator of which files have been processed.
Side note: You could also use a service like Zapier or Integromat to automate the element of consuming the data inside the CSV from the FTP. But in our example, we have historic sales data we want to combine with the current sales data coming in daily. Due to the sensitivity of this data, we chose to house it securely inside our internal network which does not permit external connections of any kind.
The second task runs the Tableau Prep workflow, which is set up to clean, aggregate, and publish a Tableau data source from the historic and current sales data files, located on the internal network drive, to Tableau Server for use in building a dashboard.
Let’s Get Started
Download WinSCP (SFTP client for Windows)
This is the free program we’ll use along with its CLI for all our SFTP file manipulation. You can download it here.
Download Demo Files
Download and update the demo files for this example. Feel free to create whatever folder structure is desired. All files do not have to exist in the same directory and certainly don’t have to exist beside your historic and current data files. You can download our example here.
Replace everything within brackets from the demo files with relevant information.
"contentUrl":"[TABLEAU SITE NAME]",
You can get the [TABLEAU SUBDOMAIN] and [TABLEAU SITE NAME] from the URL when you log into Tableau Online. For instance, https://[TABLEAU SUBDOMAIN] online.tableau.com/#/site/[TABLEAU SITE NAME]/home
It’s important to note that this JSON file does not encrypt your Tableau password. It should only be stored in a secure location. It is also my recommendation to create a separate Tableau user for the purpose of external integrations like this.
"[FULL PATH WHERE WINSCP IS INSTALLED]\winscp.com" /script="[FULL PATH TO WHERE THIS FILE LIVES]\SFTP-to-local-network.txt"
The purpose of the batch file referencing a text file is to create a cleaner multi-line list of commands.
option batch abort
option confirm off
open sftp://[SFTP USERNAME]:[SFTP PASSWORD]@[SFTP DOMAIN] -hostkey="ssh-rsa 1024 [SSH KEY FINGERPRINT]"
get -latest /latest/*.csv "[FULL PATH TO OUTPUT FOLDER]\[FILENAME].csv"
mv /latest/*.csv /archive/*.csv
If you have a difficult time getting your hands on the [SSH KEY FINGERPRINT] take a look at this helpful article that provides a few options for obtaining it.
"[FULL PATH TO TABLEAU PREP APPLICATION FOLDER]\scripts\tableau-prep-cli.bat" -c "="[FULL PATH TO WHERE THE CONNECTION PROPERTIES FILE LIVES]\connection-properties.json" -t [FULL PATH TO YOUR TABLEAU WORKFLOW FILE]\[FILENAME].tfl" pause
Run your batch files individually to verify they are in good working order before proceeding to the last step.
Finish It Up
Now that you’ve installed the necessary software and configured the demo files accordingly, the last step is to create your tasks from Windows Task Scheduler. This part is very straight forward and shouldn’t require administrator privileges. Simply follow the task scheduler wizard as it walks you through scheduling your batch files to run at the time of day and cadence desired.