There are many tools in the market of data preparation, but two common examples are Tableau Prep and Excel. These are useful to demonstrate some data prep techniques.
Preparing data in Excel
It’s an old favourite; tried and tested, affordable, powerful. There are also thousands of how-to guides online. It’s a great tool to start you off.
The data tools found in the Excel Data menu can be beneficial but are best for smaller datasets. Complex nested formulas used to prepare data can make sense to use initially but consider how fast they become unwieldy, complicated, prone to crashing Excel and hard to explain to non-technical colleagues.
Here are some useful data tools within Excel:
Excel contains a wealth of formulas. The staple formulas like SUM, SUMIF, IF, VLOOKUP are great for calculations and referencing. Press Shift F3 in Excel to launch the function browser the other formulas and functions.
Text to columns
Great for turning delimited text into columnar data. Highlight data, click ‘text to columns’ in the Data menu.
This makes easy work of deduping data.
Type some first names in column A, some surnames in column B and start typing email addresses in column C and you’ll see Excel flash fill figures out what you’re doing and suggests a flash fill to complete the process for you.
Static vs dynamic fill down
For a static fill, select the cell with data, including this cell highlight cells below, and press Ctrl D. For a dynamic fill, highlight two consecutive cells, hover the cursor over the bottom right corner of the bottom cell until a cross appears, click hold and drag down.
Find & Replace
Makes easy work of finding data and replacing it. For example, updating formula in a monthly report: find June2018, replace with July2018. Ctrl F then ALT P is a useful shortcut for find and replace.
For example, LEFT, RIGHT, MID, SEARCH, SUBSTITUTE all help you extract text from a cell. Here’s a great article to get you started.
The more advanced Excel data preparation tools are:
- Get Data (Data menu)
- Queries & Connections (Data menu)
It’s worth exploring these features if you are importing data from external sources and want to build queries to manipulate the datasets. It may sound complicated, but the Excel query editor makes it relatively easy to create queries by doing the coding for you.
DEEP DIVE LIVE: HOW DO YOU MAKE ‘BEST-IN-CLASS’ REPORTING A REALITY?
If you are looking to improve your reporting output and capability, this session is for you.
Oct 22, 2020 02:00 PM, London
Preparing data in a BI tool
Business intelligence (BI) tools like Tableau Desktop or PowerBI work best with tidy data; stored in columns and rows. Each row should be one item of data, and each column should be one attribute.
Tableau Prep is a tool to get your data in shape, ready for a BI tool. A dedicated data prep tool will empower you to do more. It will give you features beyond what Excel was designed to do and creates a reliable workflow to get your untidy data from disparate sources into one place, combined and ready for analysis.
Before moving one, take a couple of minutes to reflect on the data prep you are currently doing in Excel. Are you or your team doing a large amount of manual data prep work? If you could automate that in a data prep tool, would that time saved be useful to your team?
As an example, let’s say you are working with transaction data from 7 individual sheets which need to be unioned (combined).
You might have a manual Excel workflow where you combine multiple sheets in your weekly or monthly reporting process, maybe using macros to do the job. In Tableau prep, the union is done by drag and drop and takes seconds and doesn’t require macro or Visual Basic skills.
The next step in the workflow might be data preparation or cleaning up the data. In Tableau Prep, you can add a step to split out the data into columns using the delimiters. It’s straightforward to do and doesn’t require coding skills.
There are lots of data cleaning options, and each time you add something, Tableau Prep is building that into the workflow. The advantages are its simplicity, clarity of workflow, and you can apply the steps to the other workbooks.
Authored by Sam Ellis.
Sam Ellis is a data analytics expert and company accountant at Interworks.