Data Analysis using Spreadsheets
Microsoft Excel provides several means and ways to analyze and interpret data. The data can be from various sources. The data can be converted and formatted in several ways. It can be analyzed with the relevant Excel commands, functions and tools – encompassing Conditional Formatting, Ranges, Tables, Text functions, Date functions, Time functions, Financial functions, Subtotals, Quick Analysis, Formula Auditing, Inquire Tool, What-if Analysis, Solvers, Data Model, PowerPivot, PowerView, PowerMap, etc.
You will be learning these data analysis techniques with Excel as part of two parts −
- Data Analysis with Excel and
- Advanced Data Analysis with Excel
Data Analysis is a process of collecting, transforming, cleaning, and modeling data with the goal of discovering the required information. The results so obtained are communicated, suggesting conclusions, and supporting decision-making. Data visualization is at times used to portray the data for the ease of discovering the useful patterns in the data. The terms Data Modeling and Data Analysis mean the same.
Data Analysis Process consists of the following phases that are iterative in nature −
- Data Requirements Specification
- Data Collection
- Data Processing
- Data Cleaning
- Data Analysis
Data Requirements Specification
The data required for analysis is based on a question or an experiment. Based on the requirements of those directing the analysis, the data necessary as inputs to the analysis is identified (e.g., Population of people). Specific variables regarding a population (e.g., Age and Income) may be specified and obtained. Data may be numerical or categorical.
Data Collection is the process of gathering information on targeted variables identified as data requirements. The emphasis is on ensuring accurate and honest collection of data. Data Collection ensures that data gathered is accurate such that the related decisions are valid. Data Collection provides both a baseline to measure and a target to improve.
Data is collected from various sources ranging from organizational databases to the information in web pages. The data thus obtained, may not be structured and may contain irrelevant information. Hence, the collected data is required to be subjected to Data Processing and Data Cleaning.
The data that is collected must be processed or organized for analysis. This includes structuring the data as required for the relevant Analysis Tools. For example, the data might have to be placed into rows and columns in a table within a Spreadsheet or Statistical Application. A Data Model might have to be created.
The processed and organized data may be incomplete, contain duplicates, or contain errors. Data Cleaning is the process of preventing and correcting these errors. There are several types of Data Cleaning that depend on the type of data. For example, while cleaning the financial data, certain totals might be compared against reliable published numbers or defined thresholds. Likewise, quantitative data methods can be used for outlier detection that would be subsequently excluded in analysis.
Data that is processed, organized and cleaned would be ready for the analysis. Various data analysis techniques are available to understand, interpret, and derive conclusions based on the requirements. Data Visualization may also be used to examine the data in graphical format, to obtain additional insight regarding the messages within the data.
Statistical Data Models such as Correlation, Regression Analysis can be used to identify the relations among the data variables. These models that are descriptive of the data are helpful in simplifying analysis and communicate results.
The process might require additional Data Cleaning or additional Data Collection, and hence these activities are iterative in nature.
The results of the data analysis are to be reported in a format as required by the users to support their decisions and further action. The feedback from the users might result in additional analysis.
The data analysts can choose data visualization techniques, such as tables and charts, which help in communicating the message clearly and efficiently to the users. The analysis tools provide facility to highlight the required information with color codes and formatting in tables and charts.
Excel Data Analysis – Overview
Excel provide commands, functions and tools that make your data analysis tasks easy. You can avoid many time consuming and/or complex calculations using Excel. In this tutorial, you will get a head start on how you can perform data analysis with Excel. You will understand with relevant examples, step by step usage of Excel commands and screen shots at every step.
Ranges and Tables
The data that you have can be in a range or in a table. Certain operations on data can be performed whether the data is in a range or in a table.
However, there are certain operations that are more effective when data is in tables rather than in ranges. There are also operations that are exclusively for tables.
You will understand the ways of analyzing data in ranges and tables as well. You will understand how to name ranges, use the names and manage the names. The same would apply for names in the tables.
Data Cleaning – Text Functions, Dates and Times
You need to clean the data obtained from various sources and structure it before proceeding to data analysis. You will learn how you can clean the data.
- With Text Functions
- Containing Date Values
- Containing Time Values
Excel provides you conditional formatting commands that allow you to color the cells or font, have symbols next to values in the cells based on predefined criteria. This helps one in visualizing the prominent values. You will understand the various commands for conditionally formatting the cells.
Sorting and Filtering
During the preparation of data analysis and/or to display certain important data, you might have to sort and/or filter your data. You can do the same with the easy to use sorting and filtering options that you have in Excel.
Subtotals with Ranges
As you are aware, PivotTable is normally used to summarize data. However, Subtotals with Ranges is another feature provided by Excel that will allow you to group / ungroup data and summarize the data present in ranges with easy steps.
With Quick Analysis tool in Excel, you can quickly perform various data analysis tasks and make quick visualizations of the results.
Understanding Lookup Functions
Excel Lookup Functions enable you to find the data values that match a defined criteria from a huge amount of data.
With PivotTables you can summarize the data, prepare reports dynamically by changing the contents of the PivotTable.
You will learn several Data Visualization techniques using Excel Charts. You will also learn how to create Band Chart, Thermometer Chart, Gantt chart, Waterfall Chart, Sparklines and PivotCharts.
It might be required that only valid values be entered into certain cells. Otherwise, they may lead to incorrect calculations. With data validation commands, you can easily set up data validation values for a cell, an input message prompting the user on what is expected to be entered in the cell, validate the values entered with the defined criteria and display an error message in case of incorrect entries.
Excel provides you several financial functions. However, for commonly occurring problems that require financial analysis, you can learn how to use a combination of these functions.
Working with Multiple Worksheets
You might have to perform several identical calculations in more than one worksheet. Instead of repeating these calculations in each worksheet, you can do it one worksheet and have it appear in the other selected worksheets as well. You can also summarize the data from the various worksheets into a report worksheet.
When you use formulas, you might want to check whether the formulas are working as expected. In Excel, Formula Auditing commands help you in tracing the precedent and dependent values and error checking.
Excel also provides Inquire add-in that enables you compare two workbooks to identify changes, create interactive reports, and view the relationships among workbooks, worksheets, and cells. You can also clean the excessive formatting in a worksheet that makes Excel slow or makes the file size huge.