How to Create a Pareto Chart in R Programming

Pareto charts strengthen decision-making and are a terrific way to discover which variables are influential

Pierre DeBois


The Pareto principle, which holds that roughly 80% of a result comes from 20% of the contributing causes, has often been cited in business practice over the years. It is popular in root cause analysis, when you face several variables and need to know which ones have the most influence. Even the late Colin Powell had a variation of that rule for weighing the information you have when making decisions.

But what tools are available for identifying that principle within data? A library in R, qcc, can provide answers. It uses the data in a data object to form a Pareto chart. This post will explain how to create and read a Pareto chart. Understanding how to insert data into a diagram like the Pareto chart is a great way to understand the data structure within R, and plan a data model accordingly.
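As a first taste, here is a minimal sketch of handing data to qcc. The complaint categories and counts are made up for illustration, and the example assumes the data object is a named numeric vector, one name per category. The call is wrapped in a check so the script still runs if qcc is not installed.

```r
# Hypothetical complaint counts per category (made-up numbers for illustration).
complaints <- c(Shipping = 42, Pricing = 27, Packaging = 13,
                Website = 7, Returns = 6, Other = 5)

# pareto.chart() comes from the qcc package; only call it if qcc is available.
if (requireNamespace("qcc", quietly = TRUE)) {
  # Draws the chart and returns a summary table of frequencies and percentages.
  result <- qcc::pareto.chart(complaints)
  print(result)
} else {
  message("Install qcc first: install.packages(\"qcc\")")
}
```

The names on the vector become the bar labels, so it pays to shape your data model around one count per named category.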

Creating a Pareto chart with R programming

A Pareto chart, or diagram, is a blend of bar and line graphs that displays the given data according to an order of influence. The chart is meant to indicate which data is driving the majority of the results.

Pareto charts are useful for directing the audience's attention to the influential variables and then displaying the degree of significance among those variables. This makes them useful for choosing which categories of data deserve further work, or for analyzing the categories most closely related to a process. A Pareto chart can be made in a spreadsheet or within a program.

The Pareto chart displays variables as bars of a bar chart, combined with a line graph representing the cumulative percentage of the variables. A dual Y-axis ties the bar and line chart together.
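To see how the two Y-axes fit together, here is a rough sketch that builds the same kind of chart by hand with base R graphics. It is not how qcc draws its chart internally; the counts are made-up illustration data and the margin and label choices are arbitrary.

```r
# Hypothetical counts per category (made-up numbers for illustration).
counts <- c(Shipping = 42, Pricing = 27, Packaging = 13,
            Website = 7, Returns = 6, Other = 5)
counts <- sort(counts, decreasing = TRUE)
total <- sum(counts)

# Leave room on the right for the second Y-axis.
old_par <- par(mar = c(6, 4, 3, 4))

# Bars use the left axis; the axis runs up to the total so 100% sits at the top.
mids <- barplot(counts, ylim = c(0, total), ylab = "Frequency", las = 2)

# Cumulative line, drawn on the count scale over the bar midpoints.
lines(mids, cumsum(counts), type = "b", pch = 19)

# Right axis: the same positions relabeled as cumulative percentages.
ticks <- seq(0, 100, by = 25)
axis(side = 4, at = ticks / 100 * total, labels = paste0(ticks, "%"))
mtext("Cumulative percentage", side = 4, line = 2.5)

par(old_par)
```

The trick is that both axes share one scale: the left axis counts up to the total, and the right axis labels those same heights as a share of the total.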

The bars are arranged in descending order, with the largest (i.e. most influential) bars on the left side of the chart and the smallest on the right. The line chart climbs toward 100% as the variables are read from left to right, and it flattens along the way because each successive variable adds a smaller share of the total.
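The numbers behind that ordering can be worked out in a few lines of base R. This sketch uses made-up counts, and the 80% cutoff is simply the convention borrowed from the Pareto principle, not something the chart enforces.

```r
# Hypothetical counts per category, deliberately unsorted (made-up numbers).
counts <- c(Website = 7, Shipping = 42, Other = 5,
            Pricing = 27, Returns = 6, Packaging = 13)

# Largest first, as the bars appear from left to right.
ordered <- sort(counts, decreasing = TRUE)

pareto_table <- data.frame(
  category    = names(ordered),
  count       = as.numeric(ordered),
  percent     = 100 * as.numeric(ordered) / sum(ordered),
  cum_percent = 100 * cumsum(as.numeric(ordered)) / sum(ordered)
)
print(pareto_table)

# Categories needed to reach at least 80% of the total (assumed cutoff).
cutoff <- which(pareto_table$cum_percent >= 80)[1]
vital_few <- pareto_table$category[seq_len(cutoff)]
print(vital_few)
```

With these sample counts, the first three categories account for 82% of the total, which is the kind of reading the cumulative line is meant to make obvious.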

