A histogram looks similar to a bar chart but has a completely different meaning. It is a statistical chart: the data is first divided into groups (bins), and then the number of data elements in each group is counted. In a Cartesian coordinate system, the horizontal axis marks the endpoints of each group, the vertical axis represents frequency, and the height of each rectangle represents the frequency of its group; this is called a frequency distribution histogram. In a standard frequency distribution histogram, the count for each group has to be calculated from frequency times class width. Because the class width is fixed within a single histogram, the vertical axis can instead represent counts directly, with each rectangle's height giving the number of data elements in its group. This preserves the shape of the distribution while showing the count for each group at a glance. All examples in this document use this non-standard form, with the vertical axis representing counts.
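The grouping and counting steps described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `binCounts`, `sampleValues` and `sampleBins` are hypothetical names, the sample values are made up, and the equal-width rule shown here is not claimed to match how G2 chooses its bins.

```javascript
// Hypothetical helper: split numeric values into equal-width bins and count each bin.
// Assumes the values are not all identical (otherwise the class width would be 0).
function binCounts(values, binCount) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const width = (max - min) / binCount; // fixed class width
  const bins = Array.from({ length: binCount }, (_, i) => ({
    x0: min + i * width, // lower endpoint of the bin
    x1: min + (i + 1) * width, // upper endpoint of the bin
    count: 0,
  }));
  for (const v of values) {
    // The maximum value sits on the last endpoint, so clamp it into the last bin.
    const index = Math.min(Math.floor((v - min) / width), binCount - 1);
    bins[index].count += 1;
  }
  // Relative frequency: the share of all values that fall in each bin.
  return bins.map((b) => ({ ...b, frequency: b.count / values.length }));
}

// Made-up sample values, chosen so the bin endpoints are whole numbers.
const sampleValues = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9];
const sampleBins = binCounts(sampleValues, 4);
console.log(sampleBins.map((b) => b.count)); // → [3, 5, 1, 1]
```

Plotting `count` against the bin endpoints gives the non-standard histogram used in this document; plotting `frequency` instead keeps the same shape because every bin has the same width.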
Related Concepts:
Functions of Histograms:
Histograms also let you observe and estimate where the data is most concentrated and where abnormal or isolated values lie.
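As a rough illustration of reading concentration and isolated values off a histogram, the sketch below inspects a list of per-bin counts. The helper `describeBins`, the sample counts, and the rule used for "isolated" (a non-empty bin with empty neighbours on both sides) are assumptions made for this example, not definitions from this document or from G2.

```javascript
// Hypothetical helper: summarize a list of per-bin counts.
function describeBins(counts) {
  // Index of the bin with the highest count (the first one on ties).
  const peak = counts.indexOf(Math.max(...counts));
  // Non-empty bins whose neighbours on both sides are empty or missing.
  const isolated = [];
  counts.forEach((count, i) => {
    const left = counts[i - 1] || 0;
    const right = counts[i + 1] || 0;
    if (count > 0 && left === 0 && right === 0) isolated.push(i);
  });
  return { peak, isolated };
}

// Made-up counts: most values sit in bins 1 to 3, one value sits far away in bin 7.
const exampleCounts = [2, 9, 14, 6, 1, 0, 0, 1];
console.log(describeBins(exampleCounts)); // → { peak: 2, isolated: [7] }
```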
Other Names: Frequency Distribution Chart
Chart Type | Frequency Distribution Histogram |
---|---|
Suitable Data | List: one continuous data field, one categorical field (optional) |
Function | Show data distribution across different intervals |
Data-to-Visual Mapping | Grouped data field (statistical result) mapped to horizontal axis position; frequency field (statistical result) mapped to rectangle height; categorical data can use color to enhance category distinction |
Suitable Data Volume | At least 50 data points |
Chart Type | Non-standard Histogram |
---|---|
Suitable Data | List: one continuous data field, one categorical field (optional) |
Function | Show data distribution across different intervals |
Data-to-Visual Mapping | Grouped data field (statistical result) mapped to horizontal axis position; count field (statistical result) mapped to rectangle height; categorical data can use color to enhance category distinction |
Suitable Data Volume | At least 50 data points |
Example 1: Statistical Analysis of Data Distribution
The following chart shows a histogram of diamond weight distribution, displaying how diamond weights are distributed across different intervals.
import { Chart } from '@antv/g2';

const chart = new Chart({
  container: 'container',
  theme: 'classic',
  autoFit: true,
});

chart
  .interval()
  .data({
    type: 'fetch',
    value: 'https://gw.alipayobjects.com/os/antvdemo/assets/data/diamond.json',
  })
  .encode('x', 'carat')
  .encode('y', 'count')
  // Bin the carat values along x and count the records in each bin.
  .transform({ type: 'binX', y: 'count' })
  .scale({ y: { nice: true } })
  .axis({
    x: { title: 'Diamond Weight (Carat)' },
    y: { title: 'Frequency' },
  })
  .style({ fill: '#1890FF', fillOpacity: 0.9, stroke: '#FFF' });

chart.render();
Notes:
- The `carat` field is mapped to the horizontal axis, representing the range of diamond weights
- Uses the `interval()` geometry with the `binX` transform to automatically calculate frequency in different intervals
Example 2: Using Different Binning Methods
The key to histograms is how to divide data intervals (i.e., "binning"). Different binning methods affect the understanding of data distribution. The chart below uses a custom number of bins.
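To make the effect of the bin count concrete before turning to the chart, here is a minimal sketch in plain JavaScript. `countInBins` is a hypothetical helper using simple equal-width bins and the values are made up; it is not a description of how `binX` works internally.

```javascript
// Hypothetical helper: count values in `binCount` equal-width bins between min and max.
function countInBins(values, binCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / binCount;
  const counts = new Array(binCount).fill(0);
  for (const v of values) {
    // Clamp the maximum value into the last bin.
    const index = Math.min(Math.floor((v - min) / width), binCount - 1);
    counts[index] += 1;
  }
  return counts;
}

// Made-up values forming two clusters, one around 2 and one around 8.
const clustered = [1, 2, 2, 3, 7, 8, 8, 9];

const coarse = countInBins(clustered, 2);
const finer = countInBins(clustered, 4);
console.log(coarse); // → [4, 4]
console.log(finer); // → [3, 1, 0, 4]
```

With two bins the data looks evenly spread; with four bins the empty interval between the two clusters becomes visible. The same data can therefore suggest different distributions depending on the binning.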
import { Chart } from '@antv/g2';

const chart = new Chart({
  container: 'container',
  theme: 'classic',
  autoFit: true,
});

chart
  .interval()
  .data({
    type: 'fetch',
    value: 'https://gw.alipayobjects.com/os/antvdemo/assets/data/diamond.json',
  })
  .encode('x', 'carat')
  .encode('y', 'count')
  .transform({
    type: 'binX',
    y: 'count',
    thresholds: 30, // Specify number of bins
  })
  .scale({ y: { nice: true } })
  .axis({
    x: { title: 'Diamond Weight (Carat)' },
    y: { title: 'Frequency' },
  })
  .style({ fill: '#1890FF', fillOpacity: 0.9, stroke: '#FFF' });

chart.render();
Notes:
- Uses `transform: { type: 'binX', thresholds: 30 }` to specify 30 bins
Example 3: Probability Distribution Analysis with Density Histogram
Density histograms normalize frequency counts, making them more suitable for comparing distributions of datasets of different sizes.
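The idea of normalizing counts can be sketched independently of any chart library. The example below divides each bin count by the total, which is one common normalization; `toProportions` and the sample counts are hypothetical, and this is not a claim about the basis that `normalizeY` uses by default.

```javascript
// Hypothetical helper: turn per-bin counts into proportions of the total.
function toProportions(counts) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  return counts.map((c) => c / total);
}

// Made-up bin counts for two datasets with the same shape but different sizes.
const largeCounts = [10, 30, 60]; // 100 values in total
const smallCounts = [1, 3, 6]; // 10 values in total

console.log(toProportions(largeCounts)); // → [0.1, 0.3, 0.6]
console.log(toProportions(smallCounts)); // → [0.1, 0.3, 0.6]
```

The raw counts differ by a factor of ten, but the normalized heights are identical, which is what makes the two distributions directly comparable.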
import { Chart } from '@antv/g2';

const chart = new Chart({
  container: 'container',
  theme: 'classic',
  autoFit: true,
});

chart
  .interval()
  .data({
    type: 'fetch',
    value: 'https://gw.alipayobjects.com/os/antvdemo/assets/data/diamond.json',
  })
  .encode('x', 'carat')
  .encode('y', 'density')
  .transform({ type: 'binX', y: 'count', thresholds: 20 })
  // Chained as a second call so the normalization is applied after binning.
  .transform({ type: 'normalizeY' })
  .axis({
    x: { title: 'Diamond Weight (Carat)' },
    y: { title: 'Density', labelFormatter: '.0%' },
  })
  .style({ fill: '#2FC25B', fillOpacity: 0.85, stroke: '#FFF' });

chart.render();
Notes:
- Combines the `binX` and `normalizeY` transforms to convert frequency to density
Example 1: Not Suitable for Comparing Categorical Data
Histograms are designed for continuous numerical data distribution and are not suitable for comparing non-numerical categorical data. For counting statistics of categorical data, regular bar charts should be used instead.
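For categorical data there are no intervals to compute; each distinct value is simply counted. A minimal sketch of that counting step follows, where `countByCategory` is a hypothetical helper and the records are made up (only the `cut` field name comes from the diamond dataset used elsewhere in this document).

```javascript
// Hypothetical helper: count how many records fall in each category.
function countByCategory(records, field) {
  const counts = new Map();
  for (const record of records) {
    const key = record[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // One row per category, in first-seen order, ready to feed a bar chart.
  return [...counts].map(([category, count]) => ({ category, count }));
}

// Made-up records; `cut` is categorical, so there is nothing to bin.
const diamonds = [
  { cut: 'Ideal' }, { cut: 'Premium' }, { cut: 'Ideal' },
  { cut: 'Good' }, { cut: 'Ideal' }, { cut: 'Premium' },
];
const cutCounts = countByCategory(diamonds, 'cut');
console.log(cutCounts);
```

Each resulting row maps to one bar, and the order of the bars carries no numerical meaning, unlike the ordered intervals of a histogram.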
Example 2: Not Suitable for Showing Time Series Trends
Histograms focus on showing data distribution characteristics rather than trends over time. If you need to display how data changes over time, line charts or area charts should be used instead.
A multi-distribution histogram can display the distribution of multiple datasets in the same coordinate system, facilitating comparison of distribution characteristics between different datasets.
import { Chart } from '@antv/g2';

const chart = new Chart({
  container: 'container',
  theme: 'classic',
  autoFit: true,
});

chart
  .interval()
  .data({
    type: 'fetch',
    value: 'https://gw.alipayobjects.com/os/antvdemo/assets/data/diamond.json',
    transform: [
      {
        type: 'map',
        // Collapse the cut grades into two groups: 'Ideal' and everything else.
        callback: (d) => ({
          ...d,
          group: d.cut === 'Ideal' ? 'Ideal' : 'Others',
        }),
      },
    ],
  })
  .encode('x', 'price')
  .encode('y', 'count')
  .encode('color', 'group')
  .transform({ type: 'binX', y: 'count', thresholds: 30, groupBy: ['group'] })
  .scale({ y: { nice: true }, color: { range: ['#1890FF', '#FF6B3B'] } })
  .axis({ x: { title: 'Price (USD)' }, y: { title: 'Frequency' } })
  .style({ fillOpacity: 0.7, stroke: '#FFF', lineWidth: 1 })
  .legend(true);

chart.render();
Notes:
- Uses `encode('color', 'group')` and `groupBy: ['group']` to achieve multi-distribution comparison
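The grouping idea can also be sketched without a chart library: every group is counted against the same bin edges so that bars at the same position are comparable. `binByGroup`, the fixed `[0, 2000]` range and the sample records are assumptions made for this illustration, not part of G2.

```javascript
// Hypothetical helper: count records per group using one shared set of bins.
// Assumes every price lies within [min, max].
function binByGroup(records, { min, max, binCount }) {
  const width = (max - min) / binCount;
  const result = {};
  for (const { price, group } of records) {
    // All groups share the same edges; the maximum is clamped into the last bin.
    const index = Math.min(Math.floor((price - min) / width), binCount - 1);
    if (!result[group]) result[group] = new Array(binCount).fill(0);
    result[group][index] += 1;
  }
  return result;
}

// Made-up records with the same `price` and `group` fields as the chart above.
const priced = [
  { price: 300, group: 'Ideal' }, { price: 450, group: 'Ideal' },
  { price: 900, group: 'Ideal' }, { price: 1600, group: 'Ideal' },
  { price: 200, group: 'Others' }, { price: 700, group: 'Others' },
  { price: 800, group: 'Others' }, { price: 1200, group: 'Others' },
  { price: 1900, group: 'Others' }, { price: 2000, group: 'Others' },
];
const grouped = binByGroup(priced, { min: 0, max: 2000, binCount: 4 });
console.log(grouped); // → { Ideal: [2, 1, 0, 1], Others: [1, 2, 1, 2] }
```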