Exploring Data Visualization In Altair: An Interesting Alternative To Seaborn (updated November 2023)
This article was published as a part of the Data Science Blogathon
Data Visualization is important to uncover the hidden trends and patterns in the data by converting them to visuals. For visualizing any form of data, we all might have used pivot tables and charts like bar charts, histograms, pie charts, scatter plots, line charts, map-based charts, etc., at some point in time. These are easy to understand and help us convey the exact information. Based on a detailed data analysis, we can decide how to best make use of the data at hand. This helps us to make informed decisions.
Now, if you are a Data Science or Machine Learning beginner, you surely must have tried Matplotlib and Seaborn for your data visualizations. Undoubtedly these are the two most commonly used powerful open-source Python data visualization libraries for Data Analysis.
Seaborn is based on Matplotlib and provides a high-level interface for building informative statistical visualizations. However, there is an alternative to Seaborn. This library is called ‘Altair’, an open-source Python library built for statistical data visualization. According to the official documentation, it is based on the Vega and Vega-lite language. Using Altair we can create interactive data visualizations through bar chart, histogram, scatter plot and bubble chart, grid plot and error chart, etc. similar to the Seaborn plots.
While the Matplotlib library is imperative in syntax, Altair is declarative: the user only specifies what needs to be done, and the machine decides the how part of it. This gives the user freedom to focus on interpreting the data rather than being caught up in writing the correct syntax. The only downside of this declarative approach could be that the user has less control over customizing the visualization, which is acceptable for most users who are unfamiliar with coding.
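To make the declarative idea concrete, the sketch below writes out by hand the kind of Vega-Lite style specification that a declarative chart boils down to. It is only an illustration: the three data rows are made up, and the dictionary is a simplified assumption of what such a specification looks like, not Altair's exact output.

```python
import json

# Made-up rows standing in for a real dataset (illustrative values only).
rows = [
    {"horsepower": 130, "mpg": 18.0, "origin": "usa"},
    {"horsepower": 95, "mpg": 24.0, "origin": "japan"},
    {"horsepower": 113, "mpg": 26.0, "origin": "europe"},
]

# A declarative spec states WHAT to draw: the data, the mark, and which
# column maps to which visual channel. It contains no drawing loop.
spec = {
    "data": {"values": rows},
    "mark": "point",
    "encoding": {
        "x": {"field": "horsepower", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"},
        "color": {"field": "origin", "type": "nominal"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The renderer, not the user, decides how axes, scales, and legends are drawn from such a description.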
Installing Seaborn and Altair
To install these libraries from PyPI, use the following commands:
pip install altair
pip install seaborn
Importing Basic libraries and dataset
As always, we import the Pandas and NumPy libraries to handle the dataset, and Matplotlib and Seaborn along with the newly installed Altair library for building the visualizations.
#importing required libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import altair as alt
We will use the ‘mpg’ or the ‘miles per gallon’ dataset from the Seaborn dataset library to generate these different plots. This famous dataset contains 398 samples and 9 attributes for automotive models of various brands. Let us explore the dataset more.
#importing dataset
df = sns.load_dataset('mpg')
df.shape
#dataset column names
df.keys()
Output
Index(['mpg', 'cylinders', 'displacement', 'horsepower', 'weight',
       'acceleration', 'model_year', 'origin', 'name'],
      dtype='object')
#checking datatypes
df.dtypes
#checking dataset
df.head()
This dataset is simple and has a nice blend of both categorical and numerical features. We can now plot our charts for comparison.
Scatter & Bubble plots in Seaborn and Altair
We will start with simple scatter and bubble plots. We will use the ‘mpg’ and ‘horsepower’ variables for these.
For a Seaborn scatter plot, we can either use the relplot command and pass ‘scatter’ as the kind of plot
sns.relplot(y='mpg', x='horsepower', data=df, kind='scatter', size='displacement', hue='origin', aspect=1.2);
or we can directly use the scatterplot command.
sns.scatterplot(data=df, x="horsepower", y="mpg", size="displacement", hue='origin', legend=True)
whereas for Altair, we use the following syntax
alt.Chart(df).mark_point().encode(
    alt.Y('mpg'),
    alt.X('horsepower'),
    alt.Color('origin'),
    alt.OpacityValue(0.7),
    size='displacement'
)
Here, we color the points using another attribute, ‘origin’, and control the size of the points using an additional variable, ‘displacement’, for both libraries. In Seaborn, we can control the aspect ratio of the plot using the ‘aspect’ setting. In Altair, we can also control the opacity of the points by passing a value between 0 and 1 (1 being perfectly opaque). To convert a scatter plot in Seaborn to a bubble plot, simply pass a value for ‘sizes’, which denotes the smallest and biggest size of the bubbles in the chart. For Altair, we simply pass (filled=True) to generate the bubble plot.
sns.scatterplot(data=df, x="horsepower", y="mpg", size="displacement", hue='origin', legend=True, sizes=(10, 500))
alt.Chart(df).mark_point(filled=True).encode(
    x='horsepower',
    y='mpg',
    size='displacement',
    color='origin'
)
With the above scatter plots, we can understand the relationship between the ‘horsepower’ and ‘mpg’ variables, i.e., vehicles with lower ‘horsepower’ seem to have a higher ‘mpg’. The syntax for both plots is similar and can be customized to display the values.
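As a rough illustration of what a ‘sizes’ range does, the sketch below linearly rescales a numeric column so that its smallest value gets the smallest bubble and its largest value the biggest one. The helper name scale_sizes and the sample displacement values are hypothetical, and Seaborn's internal implementation may differ in detail.

```python
def scale_sizes(values, smallest, largest):
    """Linearly map values onto the range [smallest, largest]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: give every point the midpoint size.
        return [(smallest + largest) / 2 for _ in values]
    return [
        smallest + (v - lo) / (hi - lo) * (largest - smallest)
        for v in values
    ]

# Made-up displacement values (illustrative only).
displacement = [100, 200, 300, 400]
bubble_sizes = scale_sizes(displacement, 10, 500)
print(bubble_sizes)
```

A wider range, such as (10, 500), makes differences in the size variable easier to see than a narrow one.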
Line plots in Seaborn and Altair
Now, we plot line charts for the ‘acceleration’ vs ‘horsepower’ attributes. The syntax for the line plots is quite simple for both. We pass the DataFrame as data, the above two variables as x and y, and ‘origin’ as the legend color.
Seaborn-
sns.lineplot(data=df, x='horsepower', y='acceleration', hue='origin')
Altair-
alt.Chart(df).mark_line().encode(
    alt.X('horsepower'),
    alt.Y('acceleration'),
    alt.Color('origin')
)
Here we can see that ‘usa’ vehicles have a wider range of ‘horsepower’, whereas the other two, ‘japan’ and ‘europe’, have a narrower range of ‘horsepower’. Again, both graphs provide the same information nicely and look equally good. Let us move to the next one.
Bar plots & Count plots in Seaborn and Altair
In the next set of visualizations, we will plot a basic bar plot and count plot. This time, we will add a chart title as well. We will use the ‘cylinders’ and ‘mpg’ attributes as x and y for the plot.
For the Seaborn plot, we pass the above two features along with the Dataframe. To customize the color, we choose a palette=’magma_r’ from Seaborn’s predefined color palette.
sns.catplot(x='cylinders', y='mpg', hue="origin", kind="bar", data=df, palette='magma_r')
In the Altair bar plot, we pass df, x and y and specify the color based on the ‘origin’ feature. Here we can customize the size of the bars by passing a value in the ‘mark_bar’ command as shown below.
plot = alt.Chart(df).mark_bar(size=40).encode(
    alt.X('cylinders'),
    alt.Y('mpg'),
    alt.Color('origin')
)
plot.properties(title='cylinders vs mpg')
From the above bar plots, we can see that vehicles with 4 cylinders seem to be the most efficient for ‘mpg’ values.
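Seaborn's bar plot draws an aggregate of y for each category (the mean by default), so the bar heights can be cross-checked by hand. The sketch below does this in plain Python on made-up (cylinders, mpg) pairs; the numbers are illustrative and not taken from the ‘mpg’ dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up (cylinders, mpg) pairs, for illustration only.
samples = [(4, 30.0), (4, 28.0), (6, 20.0), (6, 18.0), (8, 14.0), (8, 16.0)]

# Group the mpg values by cylinder count.
groups = defaultdict(list)
for cylinders, mpg in samples:
    groups[cylinders].append(mpg)

# One bar per category: the mean mpg of that group.
mean_mpg = {cyl: mean(vals) for cyl, vals in sorted(groups.items())}
print(mean_mpg)
```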
Here is the syntax for count plots,
Seaborn- We use the FacetGrid command to display multiple plots on a grid based on the variable ‘origin’.
g = sns.FacetGrid(df, col="cylinders", height=4, aspect=.5, hue='origin', palette='magma_r')
g.map(sns.countplot, "origin", order=df['origin'].value_counts().index)
Altair- We use the ‘mark_bar’ command again but pass ‘count()’ as y, with one column of charts per ‘cylinders’ value, to generate the count plot.
alt.Chart(df).mark_bar().encode(
    x='origin',
    y='count()',
    column='cylinders:Q',
    color=alt.Color('origin')
).properties(
    width=100,
    height=100
)
From these two count plots, we can easily see that ‘japan’ has (3,4,6) cylinder vehicles, ‘europe’ has (4,5,6) cylinder vehicles and ‘usa’ has (4,6,8) cylinder vehicles. From a syntax point of view, both libraries require inputs for the data source, x, and y to plot. The output looks equally pleasing for both libraries. Let us try a couple more plots and compare them.
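Both count plots reduce to the same aggregation: counting rows per (cylinders, origin) pair. A minimal sketch of that step with the standard library, on made-up rows rather than the real dataset:

```python
from collections import Counter

# Made-up (origin, cylinders) rows, for illustration only.
rows = [
    ("usa", 8), ("usa", 8), ("usa", 6), ("usa", 4),
    ("japan", 4), ("japan", 4), ("japan", 3),
    ("europe", 4), ("europe", 5),
]

# Count how many rows fall into each (cylinders, origin) cell;
# each cell corresponds to one bar in the faceted count plot.
counts = Counter((cyl, origin) for origin, cyl in rows)

for (cyl, origin), n in sorted(counts.items()):
    print(f"cylinders={cyl} origin={origin} count={n}")
```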
Histogram
In this set of visualizations, we will plot basic histograms. In Seaborn, we use the displot command and pass the name of the DataFrame and the name of the column to be plotted. We can also adjust the height and width of the plot using the ‘aspect’ setting, which is the ratio of width to height.
Seaborn
sns.displot(df, x='model_year', aspect=1.2)
Altair
alt.Chart(df).mark_bar().encode(
    alt.X("model_year:Q", bin=True),
    y='count()',
).configure_mark(
    opacity=0.7,
    color='cyan'
)
In this set of visualizations, the default bins selected by the two libraries are different, and hence the plots look slightly different. We can get the same plot in Seaborn by adjusting the bin sizes.
sns.displot(df, x='model_year', bins=[70,72,74,76,78,80,82], aspect=1.2)
Now the plots look similar. In both plots we can see that the maximum number of vehicles came after ’76, most prominently in the year ’82. Additionally, we used a configure command to modify the color and opacity of the bars, which acts somewhat like a theme in the case of the Altair plot.
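To see why default bins can make two histograms of the same column look different, it helps to look at what binning does. The sketch below counts values into the explicit edges used above, treating each bin as half-open [left, right) except the last, which also includes its right edge (the convention NumPy's histogram function uses). The helper bin_counts and the sample years are hypothetical.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Count values per bin; bins are [left, right), last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the binned range
        # Index of the bin whose left edge is the last one <= v.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

# Made-up model years (illustrative only).
years = [70, 71, 72, 75, 76, 76, 79, 80, 82, 82]
edges = [70, 72, 74, 76, 78, 80, 82]
print(bin_counts(years, edges))
```

Changing the edges changes which values share a bar, which is why fixing the bins in both libraries makes their histograms comparable.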
Strip plots using both Libraries
The next set of visualizations are the strip plots.
For Seaborn, we will use the stripplot command and pass the entire DataFrame and variables ‘cylinders’, ‘horsepower’ to x and y respectively.
ax = sns.stripplot(data=df, y='horsepower', x='cylinders')
For the Altair plot, we use the mark_tick command to generate the strip plot with the same variables.
alt.Chart(df).mark_tick(filled=True).encode(
    x='horsepower:Q',
    y='cylinders:O',
    color='origin'
)
From the above plots, we can clearly see the spread of ‘horsepower’ across the categorical variable ‘cylinders’ for the different ‘origin’ values. Both charts seem to be equally effective in conveying the relationship with the number of cylinders. For the Altair plot, you will find that the x and y columns have been interchanged in the syntax to avoid a taller and narrower-looking plot.
Interactive plots
We now come to the final set of visualizations in this comparison. These are the interactive plots. Altair scores when it comes to interactive plots. The syntax is simpler compared to the Bokeh, Plotly, and Dash libraries. Seaborn, on the other hand, does not provide interactivity for any charts. This might be a letdown if you want to filter data inside the plot itself and focus on a region or area of interest in the plot.
To set up an interactive chart in Altair, we define a selection of the ‘interval’ kind, i.e., between two values on the chart. Then we define the active points for the columns using the earlier defined selection. Next, we specify the type of chart to be shown for the selection (plotted below the main chart) and pass ‘select’ as the filter for the displayed values.
# Altair 4 syntax; Altair 5 uses alt.selection_interval() and .add_params()
select = alt.selection(type='interval')
values = alt.Chart(df).mark_point().encode(
    x='horsepower:Q',
    y='mpg:Q',
    color=alt.condition(select, 'origin:N', alt.value('lightgray'))
).add_selection(
    select
)
bars = alt.Chart(df).mark_bar().encode(
    y='origin:N',
    color='origin:N',
    x='count(origin):Q'
).transform_filter(
    select
)
values & bars
With the interactive plot, we can easily visualize the count of samples for the selected area. This is useful when there are too many samples or points in one area of the chart and we want to visualize their details to understand the underlying data better.
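Conceptually, the interval selection acts as a rectangular filter: points inside the brushed region are kept and the bar chart is recomputed from them. The sketch below mimics that logic in plain Python; the function count_in_brush, the brush limits, and the data points are all hypothetical and only illustrate the idea, not Altair's internals.

```python
from collections import Counter

def count_in_brush(points, x_range, y_range):
    """Count points per origin inside a rectangular brush."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    selected = [
        p for p in points
        if x_min <= p["horsepower"] <= x_max and y_min <= p["mpg"] <= y_max
    ]
    return Counter(p["origin"] for p in selected)

# Made-up points (illustrative only).
points = [
    {"horsepower": 65, "mpg": 34.0, "origin": "japan"},
    {"horsepower": 70, "mpg": 31.0, "origin": "europe"},
    {"horsepower": 88, "mpg": 27.0, "origin": "usa"},
    {"horsepower": 150, "mpg": 15.0, "origin": "usa"},
    {"horsepower": 75, "mpg": 32.0, "origin": "japan"},
]

# Brush over low-horsepower, high-mpg vehicles.
brushed = count_in_brush(points, x_range=(60, 90), y_range=(25, 40))
print(brushed)
```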
Additional points to consider while using Altair
Pie Chart & Donut Chart
At the time this comparison was written, Altair did not support pie charts; newer releases (from version 4.2) add a ‘mark_arc’ command for pie and donut charts. With Seaborn, you can fall back on the Matplotlib functionality to generate a pie chart.
Plotting grids, themes, and customizing plot sizes
Both libraries also allow customizing the plots by generating multiple plots and manipulating the aspect ratio or the size of the figure, and both support themes for colors and backgrounds to modify the look and feel of the charts.
Advanced plots
Conclusion
I hope you enjoyed reading this comparison. If you have not tried Altair before, do give it a try for building some beautiful plots in your next data visualization project!
Author Bio
Devashree has a degree in Information Technology from Germany and a Data Science background. As an engineer, she enjoys working with numbers and uncovering hidden insights in diverse datasets from different sectors, building beautiful visualizations to try to solve interesting real-world machine learning problems.
In her spare time, she loves to cook, read & write, discover new Python-Machine Learning libraries or participate in coding competitions.
You can follow her on LinkedIn, GitHub, Kaggle, Medium, Twitter.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
A Comprehensive Guide On Data Visualization In Python
This article was published as a part of the Data Science Blogathon
Data visualization is the process of exploring, interpreting, and comparing data so that complex ideas can be communicated more clearly and logical patterns become easier to identify during analysis.
Data visualization is important for many analytical tasks, including data summaries, exploratory data analysis, and model output analysis. One of the easiest ways to communicate findings to other people is through a good visualization.
Fortunately, Python has many libraries that provide useful tools for extracting insights from data. The most popular of these are Matplotlib, Seaborn, Bokeh, Altair, etc.
Introduction
The ways we organize and visualize information are changing quickly and becoming more complex with each passing day. Due to the proliferation of social media, the availability of mobile devices, and the spread of digital services, data is generated by almost any human activity that uses technology. This information is very valuable: it enables us to analyze trends and patterns and to use big data to draw connections between events. Data visualization can therefore be an effective way to present the end user with comprehensible information in real time.
Data visualization can be important for strategic communication: it helps us interpret available data; identify patterns, trends, and inconsistencies; make decisions; and analyze existing processes. All told, it can have a profound effect on the business world. Every company has data, whether it comes from contact with customers and senior management or from running the organization itself. Only through research and interpretation can this data be given meaning and converted into knowledge. This article seeks to guide readers through a series of basic concepts to help them understand data visualization and its components, and to equip them with the tools and platforms they need to create interactive views and analyze data. It also provides basic terminology and a crash course on the design principles that govern data visualization, so that readers can create and analyze market research reports.
Table of Contents
What is Data Visualization?
Importance of data visualization
Data Visualization Process
Basic principles for data visualization
Data visualization formats
Data Visualization in Python
Color Schemes for Visualization of Data in Python
Other tools for data visualization
Conclusion
End Notes
What is Data Visualization?
Data visualization is the practice of translating data into a visual context, such as a map or graph, to make it easier for the human brain to understand and draw insights from. The main goal of data visualization is to make it easier to identify patterns, trends, and outliers in large data sets. The term is often used interchangeably with others, including information graphics, information visualization, and statistical graphics.
Data visualization is one of the steps in the data science process: once data has been collected, processed, and modeled, it must be visualized so that conclusions can be drawn. Data visualization is also an element of the broader data presentation architecture (DPA) discipline, which aims to identify, locate, manipulate, format, and deliver data in the most efficient way possible.
Data visualization is important for almost every career. It can be used by teachers to display student test results, by computer scientists exploring advances in artificial intelligence (AI), or by executives looking to share information with stakeholders. It also plays an important role in big data projects. As businesses accumulated large collections of data during the early years of big data, they needed a way to quickly and easily get an overview of all of it. Visualization tools were a natural fit.
Importance of Data Visualization
We live in a time of visual information, and visual content plays an important role in every moment of our lives. Research conducted by SHIFT Disruptive Learning has shown that we usually process images 60,000 times faster than a table or text and that our brains do a better job of remembering them later. The study found that after three days, the participants retained between 10% and 20% of written or spoken information, compared to 65% of visual information.
The human brain can perceive imagery in just 13 milliseconds and store information, as long as it is associated with the concept. Our eyes can capture 36,000 visual messages per hour.
40% of nerve fibers are connected to the retina.
All of this shows that people are better at processing visual information, which is embedded in our long-term memory. As a result, in reports and statements, a visual representation using images is a more effective way of communicating information than text or tables, and it takes up very little space. This means that data visualizations are more attractive, easier to interact with, and easier to remember.
Data Visualization Process
Several different fields are involved in the data visualization process, with the aim of simplifying or revealing existing relationships, or discovering something new in a dataset.
1. Filtering and processing.
Cleaning and refining data transforms it into information through analysis, interpretation, summarization, comparison, and research.
2. Translation & visual representation.
Shaping the visual representation by defining graphic resources, language, context, and the way it is introduced, all with the recipient in mind.
3. Visualization and interpretation.
Finally, a visualization is effective if it has a cognitive impact on knowledge construction.
Basic principles for data visualization
The purpose of visualizing data is to help us understand the thing it represents. It is a way of telling stories and presenting research results, as well as a platform for analyzing and exploring data. A good understanding of how to create data visualizations will therefore help us create meaningful and easy-to-remember reports, infographics, and dashboards. Creating the right visualization helps us solve problems and analyze the subject matter in detail. The first step in representing information is trying to understand how that data will be perceived.
1. Overview: This gives viewers a general understanding of the data as their starting point for exploration. It means providing a visual summary of the different types of data while describing their relationships at the same time. This strategy helps us visualize the data, at all its different levels, simultaneously.
2. Zoom in and filter: The second step builds on the first so that viewers can explore the underlying structure of the data. Zooming in and out lets us select subsets of the data that meet certain criteria while keeping a sense of position and context.
Data visualization formats
1. Bar Charts
Bar charts are one of the most popular ways to visualize data because they present a data set in a quickly understandable format that lets viewers see highs and lows at a glance.
They are very versatile and are often used for comparing different categories, analyzing changes over time, or comparing parts of a whole. The three variations of the bar chart are:
Vertical column: Used for chronological data, which should be arranged from left to right.
Horizontal column: Used to visualize categories.
Full stacked column: Used to visualize categories that together add up to 100%.
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
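The idea behind a full stacked column can be shown with a little arithmetic: each category is expressed as a share of the total so that the parts add up to 100%. A minimal sketch with made-up counts:

```python
# Made-up category counts (illustrative only).
counts = {"A": 30, "B": 50, "C": 20}

total = sum(counts.values())

# Convert each count to its percentage share of the whole.
shares = {name: 100 * value / total for name, value in counts.items()}
print(shares)
```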
2. Histograms
Histograms represent a variable in the form of bars, where the area of each bar is proportional to the number of values it represents. They offer an overview of how a population or sample is distributed with respect to a particular aspect. The two variations of the histogram are:
Vertical columns
Horizontal columns
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
3. Pie charts
A pie chart consists of a circle divided into sectors, each representing a portion of the whole. It should be divided into no more than five data groups. Pie charts can be useful for comparing discrete or continuous data.
The two variations of the pie chart are:
Standard: Used to show relationships between components.
Donut: A stylistic variation that makes it possible to include a total value or design element in the center.
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
4. Scatter Plot
Scatter plots use points spread over the Cartesian coordinate plane to show the relationship between two variables. They also help us determine whether different groups of data are correlated or not.
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
5. Heat Maps
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
6. Line Plot
This is used to display changes or trends in data over time. Line plots are especially useful for showing relationships, acceleration, deceleration, and volatility in a data set.
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
Color Schemes for Data Visualization in Python
Color is one of the most powerful resources in data visualization, and it is essential if we are to understand the information correctly. Color can be used to separate elements, to balance or represent values, and to interact with the cultural symbols associated with a particular color. It shapes our understanding, and to analyze it we must first understand its three properties:
Hue: This is what we usually think of when we picture a color. Hues have no inherent order; they can only be distinguished by their characteristics (blue, red, yellow, etc.).
Brightness: This is a relative measure that describes the amount of light reflected by one object compared with another. Brightness is measured on a scale, and we can talk about lighter and darker values of a single hue.
Saturation: This refers to the intensity of a given color. It varies according to brightness. Dark colors are less saturated, and the less saturated a color is, the closer it gets to gray. In other words, it approaches a neutral (empty) color. The following diagram provides a summary of the color application.
Source: Netquest- A Comprehensive Guide to Data Visualization (Melisa Matias)
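The three color properties can be explored with Python's standard colorsys module. The sketch below is only an illustration: it uses the HLS model, whose ‘lightness’ component is treated here as a stand-in for the brightness described above, and shows how lowering saturation moves a color toward gray.

```python
import colorsys

# Pure red in RGB, with each channel in the range 0..1.
red = (1.0, 0.0, 0.0)

# colorsys expresses a color as hue, lightness and saturation (HLS).
hue, lightness, saturation = colorsys.rgb_to_hls(*red)
print(hue, lightness, saturation)

# Keeping hue and lightness fixed while lowering saturation
# moves the color toward a neutral gray.
half_saturated = colorsys.hls_to_rgb(hue, lightness, 0.5)
gray = colorsys.hls_to_rgb(hue, lightness, 0.0)
print(half_saturated, gray)
```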
Data Visualization in Python
We’ll start with basic ways of looking at the data, then move on to plotting charts, and finally we’ll create interactive charts.
We will work with two datasets that suit the visualizations shown in this article; the data sets can be downloaded here
They describe the popularity of Internet searches for three terms related to artificial intelligence (data science, machine learning, and deep learning). They were extracted from a popular search engine.
There are two data files. The first one, which we will use in most of the examples, includes data on the popularity of the three terms over time (from 2004 to now, 2023). In addition, I have added a categorical variable (ones and zeros) to demonstrate charts that vary by category.
The second file contains popularity data broken down by country. We will use it in the final section of the article when working with maps.
Before we move on to the more sophisticated methods, let’s start with the most basic way of visualizing data. We will simply use pandas to look at the data and get an idea of how it is distributed.
The first thing we have to do is look at a few examples to see which columns exist, what information they contain, and how the values are written.
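Since the data files themselves are not reproduced here, the sketch below builds a small stand-in DataFrame and inspects its first rows with head(). The column names follow the plotting code used later in this article ('Mes', 'data science', 'machine learning', 'deep learning', 'categorical'); the values are made up for illustration and are not the real search data.

```python
import pandas as pd

# Stand-in data: made-up popularity scores, NOT the article's real dataset.
df = pd.DataFrame({
    "Mes": pd.to_datetime(["2004-01-01", "2004-02-01", "2004-03-01"]),
    "data science": [12, 11, 9],
    "machine learning": [18, 21, 21],
    "deep learning": [4, 2, 1],
    "categorical": [1, 1, 0],
})

# Show the first rows to check column names and how values are written.
print(df.head())
```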
With the describe command, we will see how the data is distributed: the count, minimum, mean, and so on.
df.describe()
With the info command, we will see what type of data each column contains. We may find a column that looks numeric when viewed with the head command, but if later values are formatted as strings, the whole column will be stored as text.
df.info()
Data Visualization in Python using Matplotlib
Matplotlib is the most basic library for visualizing data graphically. It includes as many graphs as we can think of. Just because it is basic does not mean that it is weak; many of the other visualization libraries we will be talking about are based on it.
Matplotlib charts are made up of two main elements: the axes (the lines that delimit the chart area) and the figure (where we draw the axes, titles, and anything outside the axes area). Now let’s build the simplest graph:
import matplotlib.pyplot as plt
plt.plot(df['Mes'], df['data science'], label='data science')
We can plot several variables on the same graph and compare them.
plt.plot(df['Mes'], df['data science'], label='data science')
plt.plot(df['Mes'], df['machine learning'], label='machine learning')
plt.plot(df['Mes'], df['deep learning'], label='deep learning')
plt.xlabel('Date')
plt.ylabel('Popularity')
plt.title('Popularity of AI terms by date')
plt.grid(True)
plt.legend()
If you are working with Python from a terminal or a script, call plt.show() after defining the graph with the functions listed above. If you are working in a Jupyter notebook, add %matplotlib inline at the beginning of the file and run it before creating a chart.
We can draw several graphs in one figure. This is useful for comparing charts or for sharing information from several types of charts in a single image.
fig, axes = plt.subplots(2, 2)
axes[0, 0].hist(df['data science'])
axes[0, 1].scatter(df['Mes'], df['data science'])
axes[1, 0].plot(df['Mes'], df['machine learning'])
axes[1, 1].plot(df['Mes'], df['deep learning'])
We can draw a graph with a different style for the points of each series:
plt.plot(df['Mes'], df['data science'], 'r-')
plt.plot(df['Mes'], df['data science']*2, 'bs')
plt.plot(df['Mes'], df['data science']*3, 'g^')
Now let’s look at a few examples of different graphs we can make with Matplotlib. We start with the scatter plot:
plt.scatter(df['data science'], df['machine learning'])
With Bar chart:
plt.bar(df['Mes'], df['machine learning'], width=20)
With Histogram:
plt.hist(df['deep learning'], bins=15)
Data Visualization in Python using Seaborn
Seaborn is a library based on Matplotlib. Basically, what it offers us are nicer-looking graphics and functions to create complex types of graphs with just one line of code.
We import the library and initialize the style with sns.set(); without this command, the graphics would still have the same style as Matplotlib. We start with one of the simplest graphics, a scatter plot.
import seaborn as sns
sns.set()
# pass x and y as keywords; recent Seaborn versions no longer accept them positionally
sns.scatterplot(x=df['Mes'], y=df['data science'])
We can add information from more than two variables to the same graph. In this case, we use colors and sizes. We also create a separate graph for each value of the category column:
sns.relplot(x='Mes', y='deep learning', hue='data science', size='machine learning', col='categorical', data=df)
One of the most popular graphs provided by Seaborn is the heatmap. It is very commonly used to show all the correlations between the variables in a dataset:
# numeric_only=True skips non-numeric columns such as the date column
sns.heatmap(df.corr(numeric_only=True), annot=True, fmt='.2f')
Another favorite is the pair plot, which shows the relationships between all the variables. Be careful with this function if you have a large dataset, as it has to draw every data point once per pair of columns, meaning that processing time grows considerably with the size of the data.
sns.pairplot(df)
Now let’s make a pair plot showing charts segmented by the values of the categorical variable:
sns.pairplot(df, hue='categorical')
A very informative graph is the joint plot, which allows us to see a scatter plot together with the histograms of the two variables and see how they are distributed:
sns.jointplot(x='data science', y='machine learning', data=df)
Another interesting graph is the violin plot:
sns.catplot(x='categorical', y='data science', kind='violin', data=df)
Data Visualization in Python using Bokeh
Bokeh is a library that allows you to produce interactive graphics. We can export them to an HTML document that we can share with anyone who has a web browser.
It is a very useful library when we want to explore things in graphs and be able to zoom in and move around the graph, or when we want to share it and let someone else explore the data.
We start by importing the library and defining the file in which to save the graph:
from bokeh.plotting import figure, output_file, save
output_file('data_science_popularity.html')
We draw what we want and save it to a file:
p = figure(title='data science', x_axis_label='Mes', y_axis_label='data science')
# legend_label replaces the older legend keyword in current Bokeh versions
p.line(df['Mes'], df['data science'], legend_label='popularity', line_width=2)
save(p)
Other Tools for Data Visualization
Some data visualization tools help in visualizing data effectively and faster than the traditional Python coding approach. These are some examples:
Databox
Databox is a data visualization tool used by more than 15,000 businesses and marketing agencies. Databox pulls your data into one place to track real-time performance with attractive displays.
Databox is ideal for marketing teams that want to get set up quickly with dashboards. With 70+ one-click integrations and no need to code, it is a very easy tool to use.
Zoho Analytics
Zoho Analytics is probably one of the most popular BI tools on this list. One thing you can be sure of is that with Zoho Analytics, you can upload your data securely. Additionally, you can use a variety of charts, tables, and components to present your data concisely.
Tableau
If you want to easily see and visualize data, then Tableau is the tool for you. It helps you create charts, maps, and all other kinds of professional graphics. To improve your visual presentations, you can also get a desktop app.
Additionally, if you run into problems installing third-party applications, it provides a “lock server” solution to help visualize online and mobile messaging applications.
You can check out my article on Analytics Vidhya for more information on trending Data Visualization Tools. Top 10 Data Visualization Tools.
Conclusion
With all these different libraries, you may be wondering which library is right for your project. The quick answer is the library that lets you easily create the graph you want.
In the initial stages of the project, with pandas and pandas profiling we will make a quick visualization to understand the data. If we need to visualize more details we can use simple graphs that we can find in the plots such as scatterplots or histograms.
End Notes
In this article, we discussed data visualization, some basic formats of data visualization, and some practical implementations using Python libraries. Finally, we concluded with some tools that can perform data visualization effectively.
Thanks For Reading!
About Me:
Hey, I am Sharvari Raut. I love to write!
Exploring Data with Power View Multiples
Multiples, also called Trellis Charts, are a series of charts with identical X and Y axes. You can arrange Multiples side by side to compare many different values easily at the same time.
You can have Line charts, Pie charts, Bar charts and Column charts as Multiples.
You can arrange the Multiples horizontally or vertically.
Line Charts as Multiples
You might want to display the medal count by year for each Region. Firstly, you need to have the field Year. To get this field, you need to have a calculated column as follows −
Type =YEAR ([Edition]) in the formula bar and press Enter.
A new column with header CalculatedColumn1 is created with values corresponding to the Year values in Edition column.
Close the PowerPivot window. The Data Model gets updated. The new field – ∑ Year appears in the Power View Fields list.
Create a Table in Power View with fields NOC_CountryRegion, Count of Year and Medal Count, by dragging the fields.
Convert Table into a Line chart in Power View.
Remove the field NOC_CountryRegion. A Line chart appears with Medal Count by Year.
As you can observe, Year is in AXIS area and Medal Count is in ∑ VALUES area in Power View Fields list. In the Line chart, Year values are on X-axis and Medal count on Y-axis.
Now, you can create Multiples visualization with Line charts, as follows −
Drag the field NOC_CountryRegion to VERTICAL MULTIPLES area in the Power View Fields list.
You will get the Multiples Visualization with Line charts arranged as a grid, with each Line chart representing a country (NOC_CountryRegion).
Vertical Multiples
As you are aware, you have placed the NOC_CountryRegion field in the VERTICAL MULTIPLES area. Hence, the visualization that you have got is the Vertical Multiples visualization. You can observe the following in the chart given above.
One Line chart per category that is placed in VERTICAL MULTIPLES area, in this case – the country.
The grid height and grid width that you have chosen determine the number of rows and number of columns for the Multiples.
A common x-axis for all the multiples.
A similar y-axis for each row of the multiples.
A vertical scroll bar on the right side that can be used to drag the rows of Line charts up and down, so as to make the other Line charts visible.
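Outside Power View, the same grid idea can be sketched in Python with Matplotlib. This is only a loose analogue, not how Power View works internally. The region names and medal counts are made up, and grid_height and grid_width are hypothetical variable names standing in for the Grid Height and Grid Width settings.

```python
import matplotlib.pyplot as plt

# Hypothetical medal counts by year for four regions (made-up numbers)
years = [2000, 2004, 2008, 2012]
medals = {
    "Region A": [10, 12, 9, 15],
    "Region B": [4, 6, 8, 7],
    "Region C": [20, 18, 22, 25],
    "Region D": [1, 3, 2, 5],
}

# Rows and columns of the grid, playing the role of Grid Height and Grid Width
grid_height, grid_width = 2, 2

# sharex=True gives every panel the same x-axis;
# sharey="row" aligns the y-axis within each row of panels
fig, axes = plt.subplots(grid_height, grid_width, sharex=True, sharey="row")

# One line chart per region
for ax, (region, counts) in zip(axes.flat, medals.items()):
    ax.plot(years, counts)
    ax.set_title(region)

fig.tight_layout()
```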
Horizontal Multiples
You can have the Multiples Visualization with Horizontal Multiples also as follows −
Drag the field NOC_CountryRegion to HORIZONTAL MULTIPLES area.
Select the values for Grid Height and Grid Width in the Multiples group.
You will get the Horizontal Multiples visualization as follows −
You can observe the following in the above chart −
One Line chart per category that is placed in HORIZONTAL MULTIPLES area, in this case – the country.
The grid height that you have chosen determines the height of the Line charts, rather than the number of rows of Line charts as in the case of VERTICAL MULTIPLES. In other words, there is a single row of Line charts with the height determined by the Grid Height that is chosen.
The grid width that you have chosen determines the number of columns of Line charts in the row.
A common x-axis for all the multiples.
A common y-axis for all the multiples.
A horizontal scroll bar at the bottom, below the x-axis, that can be used to drag the row of Line charts to the left and the right, so as to make the other Line charts visible.
Pie Charts as Multiples
If you want to explore / visualize more than one category in Multiples, Pie charts are an option. Suppose you want to explore the medal count by medal type for each of the countries. Proceed as follows −
Select Pie from the dropdown under Other Chart.
Drag Medal to the area SLICES.
You will get the Horizontal Multiples visualization with Pie charts, as you have the field NOC_CountryRegion in the area HORIZONTAL MULTIPLES.
As you can observe the medal-count for each country is displayed as a Pie chart with the slices representing the medal types with the color as given in the Legend.
Suppose you want to highlight the count of gold medals for all the countries. You can do it in a single step as follows −
As you can observe, this gives a fast way of exploring and comparing the count of gold medals across the countries.
You might want to display more Pie charts in a visualization. You can do it by simply switching over to Vertical Multiples Visualization and choosing the right values for Grid Height and Grid Width for a proper display.
Bar Charts as Multiples
You can choose Bar charts also for Multiples visualization.
Switch over to Stacked Bar visualization.
Adjust the Grid Height and Grid Width to get a proper display of the Bar charts.
With Grid Height of 6 and Grid Width of 2, you will get the following −
You can have Clustered Bar charts also for this visualization.
Column Charts as Multiples
You can choose Column charts also for Multiples visualization.
Switch over to Stacked Column visualization.
Adjust the Grid Height and Grid Width to get a proper display of the Column charts.
With Grid Height of 2 and Grid Width of 6, you will get the following −
You can have Clustered Column charts also for this visualization.
Wrap-up
The fields you choose depend on what you want to explore, analyze and present. For example, in all the visualizations above, we have chosen Medal for Slices that helped to analyze medal count by medal type. You might want to explore, analyze and present the data gender-wise. In such a case, choose the field Gender for Slices.
Once again, the visualization that is suitable also depends on the data you are displaying. If you are not sure about the suitability, you can just play around to choose the right one as switching across the visualizations is quick and simple in Power View. Moreover, you can also do it in the presentation view, in order to answer any queries that can arise during a presentation.
Exploring Alternative Structures For Better Integrating Digital Marketing Activity Into Your Business
Do We Need a Digital Department At All?
When I began working on the Smart Insights Digital Transformation guide, I believed that the days of the digital department were numbered. After all, if digital integration was a true goal of a business, shouldn’t this department simply be merged into marketing and other ‘non-digital’ departments? I felt that we’d only created digital departments as a bolt on reaction to the changing landscape, and that over time different skills would simply be ‘absorbed’ into the rest of the business.
While this sentiment may ring true amongst some readers, I soon found that this ideal has seldom been reached, and may never occur in many verticals. After all, it often seems that there will always be requirements for specific skills that need to sit within a specialist team. Rather than saying whether we ‘should’ or ‘should not’ have a digital department, there are varying ‘phases’ of digital integration.
Structuring Digital Marketing Activities
A common model for structuring digital marketing is based upon The Altimeter Group’s The Evolution of Social Business, which outlines five stages of social media integration.
The same phased approach can be seen where Neil Perkin writes for Econsultancy about these alternative digital marketing structures as explained below:
Dispersed – an early stage reaction to digital staffing, whereby skills are dispersed throughout an organisation.
Centre of Excellence – digital marketing personnel sit within one bespoke team, usually reporting to one Head of Digital.
Hub and Spoke – a combination of a digital ‘centre of excellence’ (hub) and ‘spokes’ that sit within separate departments.
Multiple Hub and Spoke – there are a number of separate digital hubs within departments, each with their own spokes in further business units.
Holistic – digital knowledge is at a strong level throughout the organisation.
No respondents within Econsultancy’s report, and only 2.4% in Altimeter’s study, answered that a ‘holistic’ level of integration had been reached. This obviously casts doubt on my initial suppositions: digital departments are likely to stay for the foreseeable future.
Having a Digital Centre is Standard Practice
In many companies, the digital department exists as a separate entity and is not wholly integrated into other departments. Indeed, Altimeter’s study would suggest some 85% of companies are somewhere between stages 2 and 4, all of which demand a digital centre.
Why the Need for a Centre?
Establishing a digital centre can be a reaction against a decentralized (largely ungoverned) structure. With the appointment of a head of department, there is greater emphasis on establishing process and a move towards a formal structure. Of course, by bringing this centrally, there can be a number of inherent weaknesses, the clearest being:
Potential barrier to effective multichannel marketing.
Lack of shared learning in the wider organisation.
Lack of focus on smaller business units.
So once a formal central structure is established, the next phase is to better integrate digital through the creation of ‘spokes’ – that is, digital skilled people sitting directly within particular teams. As demand progresses this model, these spokes may become larger, eventually with the ideal of the holistic stage being reached.
A Digital Centre May Be Wholly Necessary
Since digital is such a different and complex arena compared with more established channels, it appears there will still be a requirement for groups of specialists to sit together and work almost as an agency for the rest of the company. Thus it may not be possible for some businesses to completely move away from having a digital centre.
Some digital skills are distinct specialisms, and do not always require many hires for the business to operate well in these areas. For instance, analytics and SEO are often deemed to be the realm of specialists (although you might now argue SEO has become more of a ‘generalist’ role). Additionally, some companies simply may not be able to afford the fixed costs and headcount necessary to evolve to a hub and spoke approach.
It is also possible that the centre of excellence functions as an ‘innovation hub’ while the more integrated spokes work on digital execution. For instance, the central hub researches and tests new approaches and technology, while the spokes are responsible for digital change management.
It is quite clear that full or holistic digital integration may not be possible in large companies. But conversely, maintaining a separated ‘digital center of excellence’ presents its own pitfalls, particularly in widening company understanding of digital marketing. It’s not time we said goodbye to the digital department, and for many, it doesn’t look like it will happen any time soon either. How do you see it?
Seng: An Auxo Alternative For iOS 8 Jailbreakers
I know I’ve caught a lot of flak in the past for comparing other tweaks to Auxo. Some people think that I’m biased when it comes to Auxo and its creators, but as I’ve said in the past, I just like to call a spade a spade.
If you get offended by the mere mention of Auxo, then you may want to jump right into the article, and skip the next sentence.
Seng is a tweak that’s heavily inspired by Auxo. There I said it. There’s simply no denying that fact.
But that doesn’t mean it’s not a great tweak. Auxo 3 isn’t even updated for iOS 8.4, so your options are limited if you enjoy its functionality. Developer Charlie Hewitt saw this as an opportunity, and has stepped in to offer Seng—a tweak that pays homage to the Auxo series, but finds enough solid footing to stand on its own two feet. Watch our in-depth video walkthrough for all of the details.
How to try Seng
First and foremost, Seng is still in beta, so you’re not going to find it on the stock Cydia repos. That said, it’s still extremely easy to get a hold of, as there’s a free beta for the tweak available right now.
Simply add the following beta repo to your list of Cydia sources:
Once you do, download the Seng beta, and you’ll be good to go.
Overview
After installation, head over to the Settings app and find Seng’s preference panel. Seng’s preferences are composed of three main parts: Top View Sections, Bottom View Sections, and Other Options.
The basic premise of Seng is that it merges the App Switcher with Control Center. Therefore, you have everything that you need at your fingertips by simply invoking the App Switcher. Seng also makes use of gestures so that you can slide up from the bottom of the screen to invoke the App Switcher as well.
The Top and Bottom View Sections allow you to customize the Control Center elements around the App Switcher. The App Switcher cards stay in the middle of the screen, and you can place Control Center elements on bottom, on top, or have a mix of the two. The App Switcher will resize itself automatically depending on how sandwiched-in it is. You can even have redundant elements on top and on bottom if you wish to do so.
Some of the items in the Bottom View Sections and Top View Sections have their own individual customization. For example, with music transport controls, you can choose to only display the controls when music is playing.
The great thing about Seng is that all of this can be customized on the fly with no resprings. It’s super-easy to play around with different setups in order to find the perfect one that works for you.
In addition, users will find Zephyr-inspired app dismiss functionality that can be customized via the Other Options section of Seng’s preferences. This makes it so that you can swipe up from the corner of the screen in order to dismiss an app, instead of invoking the App Switcher.
A worthy alternative?
What’s remarkable about Seng is that, even though it’s in beta, I’ve found it to be extremely solid and stable. Granted, it’s still a beta, so your mileage may vary, but I was nonetheless impressed.
Is it as polished as Auxo? Few tweaks are. But it’s a solid alternative to Auxo on iOS 8.4, and it’s free to try if you’re okay with using a beta.
What are your thoughts on Seng? Have you tried it?
How To Adjust The Number Of Ticks In Seaborn Plots?
Introduction
Ticks are the small marks that Matplotlib draws along the axes of a plot to indicate positions in the data range. They can be positioned to best fit the data range and are used to highlight certain locations on the x and y axes. Ticks are usually labeled to indicate the precise values they stand for. Seaborn is built on Matplotlib, so the ticks of a Seaborn plot are adjusted with Matplotlib's own tools: the pyplot functions xticks() and yticks(), or the Axes methods set_xticks() and set_yticks() used in this article.
Syntax
To adjust the number of ticks in Seaborn plots, we can use the following syntax −
# Set the tick locations and labels for the x-axis
ax.set_xticks([tick1, tick2, ...])
ax.set_xticklabels([label1, label2, ...])
# Set the tick locations and labels for the y-axis
ax.set_yticks([tick1, tick2, ...])
ax.set_yticklabels([label1, label2, ...])
These methods also accept an optional minor parameter that selects whether the major or the minor ticks are set. Here, ax is the Axes object returned by the Seaborn plot function, tick1, tick2, … are the desired tick locations, and label1, label2, … are the corresponding tick labels.
Algorithm
The general step-by-step algorithm to adjust the number of ticks in Seaborn plots is as follows −
Choose the Seaborn plotting function you want to use such as sns.scatterplot().
Create some data or load some of your own.
The sns.set() and sns.set_style() functions can be used to change the Seaborn theme and style.
To plot the data, utilize the chosen Seaborn plotting function.
Make a variable that points to the plot’s axes object.
To set the number of ticks on the x and/or y axes, use the set_xticks() and/or set_yticks() methods. These methods take a list of tick locations as their parameter.
To set the labels for the ticks on the x and/or y axes, use the set_xticklabels() and/or set_yticklabels() methods. These methods take a list of tick labels as their parameter.
Plot it on the window with show() method.
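The steps above can be sketched as follows, assuming sns.scatterplot() as the chosen plotting function. The data and the tick values are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Steps 1-2: choose a plotting function (scatterplot) and create some made-up data
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 50)
y = 2 * x + rng.normal(0, 2, 50)

# Step 3: set the Seaborn theme and style
sns.set()
sns.set_style("whitegrid")

# Steps 4-5: draw the plot and keep a reference to the Axes object
fig, ax = plt.subplots()
sns.scatterplot(x=x, y=y, ax=ax)

# Step 6: five evenly spaced tick locations on the x-axis
ax.set_xticks([0, 2.5, 5, 7.5, 10])

# Step 7: one label per tick location
ax.set_xticklabels(["0", "2.5", "5", "7.5", "10"])

# Step 8: display the plot
# plt.show()  # uncomment when running interactively
```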
Example
Follow along the example below to make your own Seaborn boxplot with custom tick locations and labels on the x-axis.
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# Generate some random data
data = np.random.randn(20)
# Set up the Seaborn plot
sns.set()
sns.set_style("whitegrid")
ax = sns.boxplot(x=data)
# Set the tick locations and labels, can also use np array here
ax.set_xticks([0, 1])
ax.set_xticklabels(["A", "B"])
# Show the plot
plt.show()
Using the np.random.randn function in NumPy, we first create some random data. The sns.set and sns.set_style functions are then used to set the visual style for the Seaborn plot.
By calling the boxplot function on the data and saving the returned Axes object in the variable ax, we build a boxplot. The set_xticks and set_xticklabels methods of ax are then used to set the tick locations and labels for the x-axis.
In this instance, we place ticks at positions 0 and 1 and label them “A” and “B”, respectively. Lastly, we use the show function of matplotlib’s pyplot module to display the plot.
Because we are charting just 20 randomly selected data points with only two ticks on the x-axis, the resulting plot might not look very interesting. To produce more illuminating graphs, you can change the code to use your own data and adjust the tick placements and labels.
Example
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# Generate some random data (not used by the line plot below)
data = np.random.randn(20)
# Set up the Seaborn line plot
sns.set()
sns.set_style("whitegrid")
ax = sns.lineplot(x=[0, 1, 2], y=[1, 2, 3])
# Set the ytick locations and labels, can also use np array here
ax.set_yticks([0, 1, 2, 3, 4])
ax.set_yticklabels(["A", "B", "C", "D", "E"])
# Show the plot
plt.show()
Here, we are generating a line plot using the Seaborn library in Python. The plot has 5 y-ticks with labels “A”, “B”, “C”, “D”, and “E”.
Firstly, the Seaborn library is imported along with the Matplotlib library. Then, a NumPy array of random data is generated using the np.random.randn() method, although the line plot itself uses fixed x-values and y-values rather than this array.
Next, the plot is set up using Seaborn with a whitegrid style. The line plot is generated using the sns.lineplot() method with the x-values and y-values specified.
To adjust the y-ticks, the ax.set_yticks() method is called with a list of values for the y-tick locations. The ax.set_yticklabels() method is then called with a list of labels for the y-ticks.
Finally, the plot is shown using the plt.show() method.