Effective display of hierarchical data sets remains a major challenge in the field of information visualization, and numerous distinct methods exist for it. One such technique is the treemap, which displays a tree using a 2-D space-filling algorithm as an alternative to the traditional node/edge representation. In this paper, we discuss and demonstrate TreeMap '97, our Windows 95/NT application that builds treemaps from hierarchical data sets, and the features we incorporated in it.
Hierarchical data is data on which a tree structure may be imposed. The directory of a computer drive is a good example of such a hierarchy. Other such hierarchies may include the makeup of large organizations and even the World Wide Web.
Problem
When the amount of data in such a hierarchy grows large, viewing the entire tree and retaining the information sought can be a difficult task for the user. For one, a tree can grow very quickly as its depth increases, so that it becomes impossible to view the tree in its entirety. A common remedy is to use scroll bars to reach sections of the tree that extend outside the viewing window. This method, however, creates a bottleneck for the user's visual system, restricting the information available on screen to far less than the user is capable of processing visually. It also puts more strain on the user's ability to recall information and its location in the tree, which can prove bothersome for reasonably large trees.
The treemap, developed by Dr. Ben Shneiderman, displays a tree using a 2-D space-filling algorithm as an alternative to the traditional node/edge representation. Once the layout is complete, the entire tree structure is viewable on the screen. In addition, the color and size of nodes can be manipulated to give the user even more information about the hierarchical data set.
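As a rough illustration of the space-filling idea, the sketch below lays out a small tree with the slice-and-dice scheme: the split direction alternates at each level, and each child receives a share of its parent's rectangle proportional to its size. This is only a minimal sketch in Python, not the TreeMap '97 implementation; the `Node` class, the `layout` function, and the sample sizes are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    size: float = 0.0                      # meaningful for leaves only
    children: list = field(default_factory=list)

    def total(self) -> float:
        """Size of a leaf, or the summed size of all descendants."""
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Give each leaf a rectangle (x, y, width, height) inside the given area."""
    if out is None:
        out = {}
    if not node.children:
        out[node.name] = (x, y, w, h)
        return out
    total = node.total()
    offset = 0.0                           # fraction of the area already used
    for child in node.children:
        share = child.total() / total if total else 0.0
        if depth % 2 == 0:                 # even depth: slice left to right
            layout(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                              # odd depth: slice top to bottom
            layout(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical tree: one leaf of size 2 and a group holding two leaves of size 1.
tree = Node("root", children=[
    Node("a", size=2),
    Node("group", children=[Node("b", size=1), Node("c", size=1)]),
])
rects = layout(tree, 0, 0, 100, 100)
```

Here leaf `a` takes the left half of the 100 x 100 area, while `b` and `c` split the right half into an upper and a lower square, so the leaf rectangles together fill the whole area.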
Data Source Implementation
TreeMap accepts any kind of hierarchical data set, as long as the data file is in a
specific delimited format. The format we adopted is the one used by the treemap
applications at the Human-Computer Interaction Laboratory at the University of Maryland,
College Park. The data is stored in plain text files, with a semicolon separating each
item in the file: the header fields, the attribute names, and each of the nodes. The first
item in the file is an integer that is not used by TreeMap '97. The second item is another
integer, which gives the number of attributes that each data point has. Next come the
names of all of the attributes.
Attached to the front of each attribute name is an integer that indicates the type of
data the attribute holds: a zero means the data is an integer, a one means it is a real
number, and a two means it is a string. After the header information, the data values are
stored. The first open bracket marks the beginning of the data, and each open bracket
marks the start of a node. Inside the bracket are all of the data values for that node. If
the values are followed by another open bracket, the current node is a parent with
children. If a bracket is closed without another open bracket in between, the node is a
leaf. The leaf nodes are the only nodes with meaningful data; intermediate nodes hold
dummy values in their fields, since no data is associated with them. For TreeMap, we used
two data sets. The first is a statistical data set from the National Basketball
Association (NBA). The second is a statistical data set from the New York Stock Exchange.
Both are described in greater depth in the following paragraphs.
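The file layout described above can be sketched as a small recursive parser. The text does not show a sample file, so several details here are assumptions made only for illustration: that nodes are delimited by round brackets, that the type code is a single digit glued to the front of the attribute name (for example `0Points`), and that every node carries one value per attribute. The function name `parse` and the sample string are hypothetical; only the semicolon separator, the two leading integers, and the 0/1/2 type codes come from the description.

```python
import re

TYPE_CODES = {"0": int, "1": float, "2": str}   # type codes as described in the text


def parse(text):
    """Parse the header and the nested nodes; returns (attribute names, root node)."""
    # ASSUMPTION: round brackets delimit nodes; semicolons separate items.
    tokens = [t.strip() for t in re.findall(r"[()]|[^;()]+", text) if t.strip()]
    count = int(tokens[1])                      # tokens[0] is the unused integer
    header = tokens[2:2 + count]
    # ASSUMPTION: a single-digit type code is glued to each attribute name.
    names = [h[1:] for h in header]
    casts = [TYPE_CODES[h[0]] for h in header]
    pos = 2 + count

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected the start of a node"
        pos += 1
        values = [cast(tok) for cast, tok in zip(casts, tokens[pos:pos + count])]
        pos += count
        children = []
        while tokens[pos] == "(":               # another open bracket: a child node
            children.append(node())
        pos += 1                                # consume the closing bracket
        return {"values": dict(zip(names, values)), "children": children}

    return names, node()


# Hypothetical sample: a root, one team with dummy values, and two leaf players.
sample = "7;2;2Name;0Points;(root;0;(Bulls;0;(Jordan;30)(Pippen;21)))"
names, root = parse(sample)
```

A node whose `children` list is empty is a leaf; in this sample the root and the team carry the dummy value 0, while the two players carry the meaningful data.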
The first data set is a file of NBA statistics from the end of the 1991-1992 season. It
covers every player in the NBA and his statistics for that season, including total games
played, scoring average, assist average, rebounding average, and field goal percentage.
The file holds the most meaningful statistics about the NBA, 48 attributes in total, and
it includes all twenty-eight teams as well as all of the players.
The other data set is made up of information from the New York Stock Exchange (NYSE) at
the close of business on December 5, 1997. The file lists all of the companies whose
stocks were being bought and sold. The attributes range from volume, value, and highs and
lows for the Dow Jones to the change in percentage, index, and performance.
TreeMap has worked better than expected, but some pitfalls hindered the development of
the application. One problem was the choice of color scheme: which attribute to color on,
and which values to display with colors. The current version of TreeMap '97 offers the
user two ways to color. In the first, each distinct attribute value is assigned its own
color. Because there can be more distinct values than colors to choose from, we had to
find a way to assign values to colors, and we decided to assign them on a first-come,
first-served basis. The program assigns colors to the first 7 distinct attribute values,
and every value thereafter shares the same color. The second way uses a single color and
changes its intensity as the value of the attribute changes. This method is more effective
than the first for attributes that have many different values. Another problem was
deciding how and where to show the values of a selected data item. To address this, we
installed a group box on the right side of the display. The group box displays the
attributes of the selected item in a list box. It also holds the color legend, which shows
each color and the value associated with it. The last items in the group box are two
scroll bars, which bring us to another problem: how does the user select the attribute to
color by or size on? Controls such as radio buttons, check boxes, and popup menus would be
inadequate, because TreeMap deals with data sets whose number of attributes ranges from 5
to 50. Using 50 check boxes or radio buttons would force the user to scroll up and down
the screen, which may become confusing. A scroll bar lets the user step through every
attribute before selecting the one he or she wants; one scroll bar selects the size
attribute and the other selects the color attribute. A major problem was handling strings
from the file. In some cases, the value of an attribute is a string instead of a number,
but TreeMap assumes that all attribute values are numerical and reads each value into a
data structure that accepts only numbers. To compensate, we converted the string values to
real values.
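The first-come, first-served color assignment described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Delphi code of TreeMap '97: the seven color names in `PALETTE`, the shared `OVERFLOW` color, the function name `assign_colors`, and the sample team codes are all hypothetical.

```python
# Seven hypothetical colors, matching the "first 7 distinct values" rule in the text.
PALETTE = ["red", "green", "blue", "yellow", "cyan", "magenta", "orange"]
OVERFLOW = "gray"   # hypothetical color shared by every later value


def assign_colors(values, palette=PALETTE, overflow=OVERFLOW):
    """Map each distinct value to a color in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            index = len(mapping)            # how many distinct values came before
            mapping[value] = palette[index] if index < len(palette) else overflow
    return mapping


# Hypothetical attribute column with nine distinct values.
teams = ["CHI", "ATL", "BOS", "NYK", "LAL", "UTA", "PHX", "DET", "CHI", "SEA"]
colors = assign_colors(teams)
```

With this input the first seven distinct values each receive their own color, while the eighth and ninth (`DET` and `SEA`) both fall back to the shared color, regardless of how often they occur in the data.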
Using TreeMap '97
Here are some sample applications of TreeMap '97 at work. The NBA and stock portfolio data sets were made available to us courtesy of the Human-Computer Interaction Laboratory at the University of Maryland:
1991-1992 NBA Season Dataset
Here, color coding is done on Field Goals per Game, while size is determined by Points. Among the Chicago Bulls, Michael Jordan's wide, dark rectangle is matched by none and immediately demonstrates his dominance in the NBA during the 1991-92 season.
Here, color of increasing intensity encodes Field Goal Attempts per Game, while size is determined by Three-Point Field Goals per Game. Dominique Wilkins of the Atlanta Hawks appears to have the best combination of the two attributes in question.
In the stock data set, sizing is done by stock price high while coloring is done by stock price low. At a glance, Group 3 seems to have the better combination of promising stocks based on these two indices (i.e., as a whole, the group appears bigger and darker than the others), yet Nucor Corp. of Group 2 seems to have everyone else beat.
TreeMap started with the goal of representing treemaps in an application developed in
Delphi. TreeMap dynamically assigns values to the colors that represent them, and it
allows the user to select the attribute on which color is based, so the user can see the
values as they change. A depth feature was added to let the user limit the depth of the
hierarchy that is displayed. For example, if the depth is set to two, the user sees only
two levels of the tree, a parent and its children. This is useful because it does not
overwhelm the user with all of the data rectangles at once. TreeMap allows the user to
traverse from the highest level of the tree down to any leaf node. Another achievement is
the ability of TreeMap to display any kind of hierarchical data that is in the specified
delimited format. TreeMap also uses size to show differences in data values, and the user
can dynamically change the attribute on which size is based. A further accomplishment is
the ability to map the values of an attribute to the intensity of a color. This allows
TreeMap to give a different shade to each distinct value, since altering the RGB value of
a color changes the color only slightly. This gives TreeMap '97 the whole spectrum of any
one color on which to map values. The default color for this feature is blue.
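One way to picture the intensity mapping is a linear scale from the smallest to the largest attribute value onto the blue channel of an RGB color. The sketch below is an assumption-laden illustration in Python, not the method used in TreeMap '97: the function name `intensity_color`, the channel limits 255 and 55, and the choice that larger values produce darker shades are all hypothetical.

```python
def intensity_color(value, lo, hi, brightest=255, darkest=55):
    """Return an (r, g, b) shade of blue; larger values give darker shades.

    ASSUMPTION: a linear scale between lo and hi on the blue channel only.
    """
    if hi == lo:
        t = 0.0                                  # all values equal: use one shade
    else:
        t = (value - lo) / (hi - lo)
        t = min(1.0, max(0.0, t))                # clamp values outside the range
    blue = round(brightest - t * (brightest - darkest))
    return (0, 0, blue)


# Hypothetical scoring averages for three players.
points = [5.0, 17.5, 30.0]
shades = [intensity_color(p, min(points), max(points)) for p in points]
```

Because the scale is continuous, an attribute with many distinct values still gets a distinct shade for each one, which is the advantage the text describes over a fixed set of colors.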
Although TreeMap achieved all of its initial goals, several things would still improve
the application. One is the text written in the rectangles. As the number of nodes
increases, the rectangle for each node gets smaller, but the text inside it does not
shrink, so the text is truncated. The font size should change in proportion to the size of
the rectangle. Another shortcoming is the assignment of colors to the values of an
attribute. When there are more than 10 distinct values, TreeMap assigns colors to the
first 10 distinct values of the attribute. This is a poor method because it might not give
an accurate representation of the data. For example, if colors are assigned to values that
appear only once, and values that appear often all share the one remaining color, most of
the rectangles take on that remaining color. This makes it impossible for the user to tell
apart the values that share it. A better solution would be to assign colors to the values
that appear most often, which would help in discovering patterns or trends. Another
problem with TreeMap is its handling of string values for attributes. TreeMap assumes that
attribute values are numeric; it should be able to handle attribute values regardless of
data type.
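The frequency-based alternative suggested above can be sketched with a simple count of how often each value occurs. This is a hypothetical Python illustration of the proposed improvement, not existing TreeMap code; the function name `assign_by_frequency`, the two-color palette, the shared `gray` color, and the sample positions are assumptions made for the example.

```python
from collections import Counter


def assign_by_frequency(values, palette, overflow="gray"):
    """Give the palette colors to the most frequent values; the rest share one color."""
    counts = Counter(values)
    most_common = [value for value, _ in counts.most_common(len(palette))]
    mapping = {value: overflow for value in counts}   # start with the shared color
    for color, value in zip(palette, most_common):
        mapping[value] = color                        # most frequent value first
    return mapping


# Hypothetical attribute column: F appears 4 times, G 3 times, C once.
positions = ["C", "G", "G", "F", "F", "F", "G", "F"]
colors = assign_by_frequency(positions, palette=["red", "green"])
```

Although `C` appears first in the data, it is the rarest value, so it is the one that falls back to the shared color; under a first-come assignment it would have taken one of the two palette colors instead.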
TreeMap currently has some very nice features, but changes planned for future versions
will make it better. Future versions will change the size of the text in a rectangle as
the size of the rectangle changes, which will eliminate the truncation of text when a
rectangle becomes smaller. Another modification concerns the assignment of colors to
values. Right now, TreeMap assigns colors to the first 10 distinct values; the new version
will assign colors to the values that appear most often. This will help users determine
trends, popularity, or patterns in an attribute. Future versions will also be able to read
any type of data as the value of an attribute, so attributes can hold string values as
well as numeric values. TreeMap will also store in each parent the averages of all of that
parent's children. The basketball data shows the purpose: the leaf nodes are the players,
and their parents are the teams. If the average of all the players' data is stored in the
team, it gives the team averages, which is useful for seeing how the team is doing as a
whole.
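The planned parent averages could be computed with one bottom-up pass over the tree. The sketch below is a hypothetical Python illustration of that idea, not a description of how a future TreeMap version would do it; the dictionary layout of a node, the function name `fill_averages`, and the sample figures are all assumptions.

```python
def fill_averages(node):
    """Store in each parent the mean of its children's values, working bottom-up."""
    children = node.get("children", [])
    if not children:
        return node["values"]                  # a leaf keeps its own data
    child_values = [fill_averages(child) for child in children]
    keys = child_values[0].keys()              # ASSUMPTION: all children share keys
    node["values"] = {
        key: sum(values[key] for values in child_values) / len(child_values)
        for key in keys
    }
    return node["values"]


# Hypothetical team with two players and made-up per-game figures.
team = {"name": "Bulls", "values": {}, "children": [
    {"name": "Jordan", "values": {"points": 30.0, "assists": 6.0}},
    {"name": "Pippen", "values": {"points": 21.0, "assists": 7.0}},
]}
fill_averages(team)
```

Note that in a deeper tree this sketch averages the averages of the direct children rather than weighting by the number of leaves; whether that is the desired behavior is a design choice the text leaves open.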
There are a variety of ways to represent hierarchical data, each of which has strong
points and weak points. Treemaps are one such way; they try to take advantage of the
entire screen so that the most information can be conveyed. TreeMap is one application in
a long line of treemap applications. It offers the user an assortment of features,
including coloring by intensity, controllable depth of tree representation, and grouping
of attributes by size and color. There are places where TreeMap can improve: truncation of
text, coloring on the first ten distinct values, and the handling of strings all present
problems for future programmers of TreeMap. TreeMap is a very successful start on an
interesting problem. With thoughtful modifications, creative additions, and determination,
it has a chance to be as useful as some of the existing treemap applications.