# Charts

Charts represent data graphically. In statistics they are used mainly to get an overview of the collected data and to present information in an easily understandable way.

The most commonly used charts in statistics are bar charts, histograms, scatter plots, line charts, box plots, and pie charts.

## Bar charts

Bar charts are probably the most commonly used charts in statistics. They usually show the frequency of different categories, but they can also visualize numerical data such as sales figures or population statistics.

In a bar chart, the length of each bar is proportional to the value it represents. The bars are usually arranged horizontally or vertically.
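The proportionality between bar length and value can be illustrated with a minimal text-based sketch. The survey answers and the helper `text_bar_chart` are hypothetical and only serve as an illustration. A plotting library would normally draw the bars.

```python
from collections import Counter

# Hypothetical survey answers (illustrative data, not from the text).
answers = ["bus", "car", "bike", "car", "bus", "car", "walk", "car", "bike"]

# Frequency of each category, i.e. the value each bar represents.
frequencies = Counter(answers)


def text_bar_chart(freqs, symbol="#"):
    """Return one line per category; the bar length equals the frequency."""
    width = max(len(label) for label in freqs)
    return [
        f"{label:<{width}} | {symbol * count} ({count})"
        for label, count in freqs.most_common()
    ]


for line in text_bar_chart(frequencies):
    print(line)
```

Each category is already a group of its own, so no further preparation of the data is needed before drawing the bars.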

## Histograms

A histogram is a graphical representation of the frequency distribution of a metric variable. To display a distribution of data in a histogram, the data must first be divided into classes, also called bins. These classes or bins are then represented by rectangles that lie directly next to each other.

This is also the main difference from bar charts: in a bar chart the data are already grouped into categories and do not have to be divided into classes first, as in a histogram. The difference is also visible graphically, because in a bar chart there are gaps between the bars.

### Bar chart vs. Histogram

A bar chart and a histogram are both types of graphical representations of data, but they are used to display different types of information.

A bar chart is used to represent discrete data, where the data is divided into separate categories. The height of each bar represents the frequency or quantity of the data that falls into that category.

A histogram, on the other hand, is used to represent continuous data, where the data is divided into a set of bins or intervals. The height of each bar represents the frequency or quantity of the data that falls into that bin or interval. The bars in a histogram are usually adjacent and there is no space between them.

In summary, the main difference between a bar chart and a histogram is the type of data they represent and the way the data is divided and displayed.

Accordingly, histograms are used for metric variables such as salary or age, and bar charts for ordinal or nominal variables such as gender or school grade.

## Example Histogram

We would like to display the frequency distribution of the results of a statistics exam graphically. For this we have the respective scores of 150 students.
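Since the actual exam scores are not given here, the following sketch generates synthetic scores for 150 students and divides them into classes. The 0 to 100 scale, the class width of 10 points, and the helper `bin_counts` are assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the synthetic data is reproducible

# Synthetic exam scores for 150 students on an assumed 0-100 scale.
scores = [min(100, max(0, round(random.gauss(65, 15)))) for _ in range(150)]


def bin_counts(values, bin_width, lower, upper):
    """Count values per class [lower, lower + width), ...

    The last class also includes the upper bound itself.
    """
    n_bins = (upper - lower) // bin_width
    counts = [0] * n_bins
    for value in values:
        index = min((value - lower) // bin_width, n_bins - 1)
        counts[index] += 1
    return counts


counts = bin_counts(scores, bin_width=10, lower=0, upper=100)

# Print a simple text histogram: one row per class, one '#' per student.
for i, count in enumerate(counts):
    start = i * 10
    print(f"{start:3d}-{start + 10:3d}: {'#' * count}")
```

The choice of class width affects how the distribution looks: wider classes smooth out detail, narrower classes show more noise.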

## Scatter Plots

Scatter plots are used in statistics to visualize correlations in data. A scatter plot always shows two variables: each pair of values of a case is represented as a point in a coordinate system. If, for example, 10 persons are asked for their weight and height, the scatter plot shows 10 points.

A scatter plot gives a first indication of the correlation between the two visualized variables. If high values of one variable are associated with high values of the other variable, there is a positive correlation. If high values of one variable are associated with low values of the other variable, there is a negative correlation. If the points are randomly distributed, there is no correlation.
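The visual impression can be backed up with a number. The sketch below computes the Pearson correlation coefficient for 10 hypothetical height and weight pairs. The values and the helper `pearson_r` are assumptions for illustration, not data from a real survey.

```python
from math import sqrt

# Hypothetical measurements for 10 persons (illustrative values only).
heights_cm = [158, 163, 167, 170, 172, 175, 178, 181, 185, 190]
weights_kg = [52, 58, 61, 66, 64, 72, 75, 79, 84, 90]

# Each pair of values is one point in the scatter plot.
points = list(zip(heights_cm, weights_kg))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(heights_cm, weights_kg)
print(f"{len(points)} points, r = {r:.2f}")
```

A value of `r` close to +1 corresponds to a positive correlation, a value close to -1 to a negative correlation, and a value near 0 to no linear correlation.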

Furthermore, there can also be a non-linear relationship. In this case there is a pattern in the distribution of the points, but the pattern cannot be described by a straight line.
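A small constructed example shows why looking at the plot matters in this case. The points below lie exactly on a parabola, yet the Pearson correlation coefficient, which only measures linear association, is zero. The data and the helper `pearson_r` are hypothetical and chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Constructed data: y depends perfectly on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

Because the x values are symmetric around zero, the positive and negative contributions to the covariance cancel out, so the coefficient alone would wrongly suggest that the two variables are unrelated.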

## Line charts

A line chart is a graph consisting of a series of data points connected by a line. It is used, for example, to show a continuous change of data over time. In a line chart, time or another continuous variable is plotted on the horizontal axis, while the values of the data to be illustrated are plotted on the vertical axis.

Line charts are particularly useful for visualizing trends and changes over time, and they are often used to represent economic and financial data, weather data, or scientific data.

## Boxplots

Boxplots are charts used to represent distributions of data. They provide a visual summary of the data by presenting important statistical measures such as median, quartiles, and outliers in a single graph.
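The measures shown in a boxplot can be computed directly, as in the following sketch. The sample data and the helper `boxplot_summary` are hypothetical. The sketch assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the quartiles as outliers, and it uses the "inclusive" quartile method of Python's `statistics.quantiles`. Other software may use slightly different quartile definitions.

```python
from statistics import quantiles

# Hypothetical sample with one unusually large value (illustrative data only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]


def boxplot_summary(values, whisker_factor=1.5):
    """Return quartiles, interquartile range, and outliers of a sample."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond these fences count as outliers.
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


summary = boxplot_summary(data)
print(summary)
```

In the resulting boxplot, the box would span from the first to the third quartile, the line inside the box would mark the median, and the value 50 would be drawn as a separate outlier point.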

Which diagram you should create depends on the information you want to convey and the scale level of your data.

In statistics it is advisable to first examine the data by means of the created diagrams. This already gives an indication of whether, for example, there are differences between the individual groups. Subsequently, the visual results can be verified with hypothesis tests.

