Explain the concept of an entity and attribute. What are five attributes for the entity “student” for a university system tracking information on students? Describe whether the entity is a person, place, or thing for the scenario. Refer to Chapter 6 (uploaded) in your text to find additional information on entities and attributes. You must also explain why these attributes should be included in the entity.
6.1 WHAT ARE THE PROBLEMS OF MANAGING DATA
RESOURCES IN A TRADITIONAL FILE ENVIRONMENT?
An effective information system provides users with accurate, timely, and relevant information. Accurate information is free of errors. Information is timely when it is available to decision makers when it
is needed. Information is relevant when it is useful and appropriate
for the types of work and decisions that require it.
You might be surprised to learn that many businesses don’t have timely,
accurate, or relevant information because the data in their information systems
have been poorly organized and maintained. That’s why data management is
so essential. To understand the problem, let’s look at how information systems
arrange data in computer files and traditional methods of file management.
FILE ORGANIZATION TERMS AND CONCEPTS
A computer system organizes data in a hierarchy that starts with bits and
bytes and progresses to fields, records, files, and databases (see Figure 6.1).
A bit represents the smallest unit of data a computer can handle. A group
of bits, called a byte, represents a single character, which can be a letter, a
number, or another symbol. A grouping of characters into a word, a group
of words, or a complete number (such as a person’s name or age) is called a
field. A group of related fields, such as the student’s name, the course taken,
the date, and the grade, comprises a record; a group of records of the same
type is called a file.
For example, the records in Figure 6.1 could constitute a student course file.
A group of related files makes up a database. The student course file illustrated
in Figure 6.1 could be grouped with files on students’ personal histories and
financial backgrounds to create a student database.
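The hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the student numbers, course code, date, and grades are made-up values, and the dictionary and list structures are assumptions chosen for readability, not the way a real file system stores data.

```python
# Illustrative sketch of the data hierarchy; every value below is hypothetical.

# Bit -> byte: one character stored as a single byte (eight bits) in ASCII.
char = "A"
byte_value = char.encode("ascii")[0]   # the byte as an integer (65)
bits = format(byte_value, "08b")       # the eight bits that make up that byte

# Field: a grouping of characters into a name or a complete number.
student_name_field = "Jane Doe"

# Record: a group of related fields describing one course enrollment.
record = {
    "Student_ID": 1001,
    "Course": "IS 101",
    "Date": "2024-09-01",
    "Grade": "B+",
}

# File: a group of records of the same type.
student_course_file = [
    record,
    {"Student_ID": 1002, "Course": "IS 101", "Date": "2024-09-01", "Grade": "A"},
]

# Database: a group of related files.
student_database = {
    "course": student_course_file,
    "personal_history": [],   # placeholder for a related file
    "financial": [],          # placeholder for a related file
}

print(bits)
print(len(student_course_file), "records in the student course file")
```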
A record describes an entity. An entity is a person, place, thing, or event on
which we store and maintain information. Each characteristic or quality describing a particular entity is called an attribute. For example, Student_ID, Course,
Date, and Grade are attributes of the entity COURSE. The specific values that
these attributes can have are found in the fields of the record describing the
entity COURSE.
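One way to see the difference between an attribute and its value is to model the entity COURSE as a small Python data class. The attribute names come from the text; the class name CourseRecord and the sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass, fields, asdict


@dataclass
class CourseRecord:
    """One record describing the entity COURSE (attribute names from the text)."""
    Student_ID: int
    Course: str
    Date: str
    Grade: str


# A single record: each field holds the specific value of one attribute.
row = CourseRecord(Student_ID=1001, Course="IS 101", Date="2024-09-01", Grade="B+")

# The attributes belong to the entity; the values belong to one record.
attribute_names = [f.name for f in fields(CourseRecord)]
attribute_values = asdict(row)

print(attribute_names)
print(attribute_values["Grade"])
```

The same pattern would apply to any other entity: the class lists the attributes once, and every record supplies its own values for them.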
PROBLEMS WITH THE TRADITIONAL FILE ENVIRONMENT
In most organizations, systems tended to grow independently without a
company-wide plan. Accounting, finance, manufacturing, human resources,
and sales and marketing all developed their own systems and data files.
Figure 6.2 illustrates the traditional approach to information processing.
Each application, of course, required its own files and its own computer
program to operate. For example, the human resources functional area might
have a personnel master file, a payroll file, a medical insurance file, a pension
file, a mailing list file, and so forth until tens, perhaps hundreds, of files and
programs existed. In the company as a whole, this process led to multiple
master files created, maintained, and operated by separate divisions or departments. As this process goes on for 5 or 10 years, the organization is saddled
with hundreds of programs and applications that are very difficult to maintain
218 Part Two Information Technology Infrastructure
and manage. The resulting problems are data redundancy and inconsistency,
program-data dependence, inflexibility, poor data security, and an inability to
share data among applications.
Data Redundancy and Inconsistency
Data redundancy is the presence of duplicate data in multiple data files so
that the same data are stored in more than one place or location. Data redundancy occurs when different groups in an organization independently collect
the same piece of data and store it independently of each other. Data redundancy wastes storage resources and also leads to data inconsistency, where
the same attribute may have different values. For example, in instances of
the entity COURSE illustrated in Figure 6.1, the Date may be updated in
some systems but not in others. The same attribute, Student_ID, may also
have different names in different systems throughout the organization. Some
systems might use Student_ID and others might use ID, for example.
FIGURE 6.1 THE DATA HIERARCHY
A computer system organizes data in a hierarchy that starts with the bit, which represents either a 0
or a 1. Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes can be
grouped to form a field, and related fields can be grouped to form a record. Related records can be
collected to form a file, and related files can be organized into a database.
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 219
Additional confusion might result from using different coding systems
to represent values for an attribute. For instance, the sales, inventory, and
manufacturing systems of a clothing retailer might use different codes to
represent clothing size. One system might represent clothing size as “extra
large,” whereas another might use the code “XL” for the same purpose. The
resulting confusion would make it difficult for companies to create customer
relationship management, supply chain management, or enterprise systems
that integrate data from different sources.
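The following sketch shows how redundancy turns into inconsistency when two departments keep the same facts in separate files. The two file names, the sample rows, and the helper functions are hypothetical; only the attribute names Student_ID and ID and the size codes "extra large" and "XL" come from the text.

```python
# Two hypothetical departmental files that store the same facts independently.
registrar_file = [
    {"Student_ID": 1001, "Course": "IS 101", "Date": "2024-09-01"},
]
billing_file = [
    # Same enrollment, but the key is named ID and the date was never updated.
    {"ID": 1001, "Course": "IS 101", "Date": "2024-08-25"},
]


def find_date_conflicts(registrar, billing):
    """Return (student, course, registrar date, billing date) for mismatched rows."""
    billing_dates = {(r["ID"], r["Course"]): r["Date"] for r in billing}
    conflicts = []
    for r in registrar:
        key = (r["Student_ID"], r["Course"])
        other = billing_dates.get(key)
        if other is not None and other != r["Date"]:
            conflicts.append((key[0], key[1], r["Date"], other))
    return conflicts


# Different coding schemes for the same attribute value.
SIZE_CODES = {"extra large": "XL", "xl": "XL"}


def normalize_size(value):
    """Map a size label to one shared code; unknown labels are returned unchanged."""
    return SIZE_CODES.get(value.strip().lower(), value)


print(find_date_conflicts(registrar_file, billing_file))
print(normalize_size("extra large"), normalize_size("XL"))
```

Every program that combines the two files needs this kind of reconciliation logic, which is part of what makes integration across separate files costly.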
Program-Data Dependence
Program-data dependence refers to the coupling of data stored in files and the
specific programs required to update and maintain those files such that changes
in programs require changes to the data. Every traditional computer program
has to describe the location and nature of the data with which it works. In a
traditional file environment, any change in a software program could require a
change in the data accessed by that program. One program might be modified
from a five-digit to a nine-digit zip code. If the original data file were changed
from five-digit to nine-digit zip codes, then other programs that required the
five-digit zip code would no longer work properly. Such changes could cost
millions of dollars to implement properly.
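The zip code example can be made concrete with a fixed-width record. In this sketch the record layout (a 10-character name, a 5-character zip code, a 2-character state), the function name, and the sample data are all assumptions; the point is only that the program has the layout of the file hard-coded into it.

```python
# Hypothetical fixed-width layout hard-coded in a program:
# name (10 characters), zip code (5 characters), state (2 characters).
def read_customer_v1(line):
    """Parse one record using column positions built into the program."""
    return {
        "name": line[0:10].strip(),
        "zip": line[10:15],
        "state": line[15:17],
    }


old_line = "Ann Lee   " + "10001" + "NY"        # file written with five-digit zip codes
new_line = "Ann Lee   " + "100011234" + "NY"    # file changed to nine-digit zip codes

ok = read_customer_v1(old_line)
broken = read_customer_v1(new_line)

print(ok)       # state is read correctly from the old layout
print(broken)   # state now contains digits from the longer zip code
```

The program itself did not change, yet it now returns wrong data, because its description of the file no longer matches the file.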
Lack of Flexibility
A traditional file system can deliver routine scheduled reports after extensive
programming efforts, but it cannot deliver ad hoc reports or respond to unanticipated information requirements in a timely fashion. The information required
by ad hoc requests is somewhere in the system but may be too expensive to
retrieve. Several programmers might have to work for weeks to put together the
required data items in a new file.
FIGURE 6.2 TRADITIONAL FILE PROCESSING
The use of a traditional approach to file processing encourages each functional area in a corporation
to develop specialized applications. Each application requires a unique data file that is likely to be a
subset of the master file. These subsets of the master file lead to data redundancy and inconsistency,
processing inflexibility, and wasted storage resources.
Poor Security
Because there is little control or management of data, access to and dissemination of information may be out of control. Management may have no way of
knowing who is accessing or even making changes to the organization’s data.
Lack of Data Sharing and Availability
Because pieces of information in different files and different parts of the
organization cannot be related to one another, it is virtually impossible for
information to be shared or accessed in a timely manner. Information cannot
flow freely across different functional areas or different parts of the organization. If users find different values of the same piece of information in two
different systems, they may not want to use these systems because they cannot
trust the accuracy of their data.
6.2 WHAT ARE THE MAJOR CAPABILITIES OF
DATABASE MANAGEMENT SYSTEMS (DBMS) AND
WHY IS A RELATIONAL DBMS SO POWERFUL?
Database technology cuts through many of the problems of traditional file
organization. A more rigorous definition of a database is a collection of data
organized to serve many applications efficiently by centralizing the data and
controlling redundant data. Rather than storing data in separate files for each
application, data appears to users as being stored in only one location. A single
database services multiple applications. For example, instead of a corporation
storing employee data in separate information systems and separate files for
personnel, payroll, and benefits, the corporation could create a single common
human resources database.
DATABASE MANAGEMENT SYSTEMS
A database management system (DBMS) is software that permits an
organization to centralize data, manage them efficiently, and provide access
to the stored data by application programs. The DBMS acts as an interface
between application programs and the physical data files. When the application program calls for a data item, such as gross pay, the DBMS finds this item
in the database and presents it to the application program. Using traditional
data files, the programmer would have to specify the size and format of each
data element used in the program and then tell the computer where they
are located.
The DBMS relieves the programmer or end user from the task of understanding where and how the data are actually stored by separating the logical
and physical views of the data. The logical view presents data as they would be
perceived by end users or business specialists, whereas the physical view shows
how data are actually organized and structured on physical storage media.
The database management software makes the physical database available
for different logical views required by users. For example, for the human
resources database illustrated in Figure 6.3, a benefits specialist might require
a view consisting of the employee’s name, social security number, and health
insurance coverage. A payroll department member might need data such as the
employee’s name, social security number, gross pay, and net pay. The data for
all these views are stored in a single database, where they can be more easily
managed by the organization.
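The two logical views described above can be sketched with Python's built-in sqlite3 module. The table name, column names, view names, and the sample employee are assumptions made for illustration; the text specifies only which data items each user needs.

```python
import sqlite3

# One physical table holds all employee data (in-memory database for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT,
    ssn         TEXT,
    health_care TEXT,
    gross_pay   REAL,
    net_pay     REAL
);
INSERT INTO employee VALUES (1, 'Pat Smith', '000-00-0001', 'Plan A', 5000.0, 3900.0);

-- Logical view for the benefits specialist.
CREATE VIEW benefits_view AS
    SELECT name, ssn, health_care FROM employee;

-- Logical view for the payroll department.
CREATE VIEW payroll_view AS
    SELECT name, ssn, gross_pay, net_pay FROM employee;
""")

benefits_rows = conn.execute("SELECT * FROM benefits_view").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll_view").fetchall()

print(benefits_rows)
print(payroll_rows)
```

Neither user needs to know how the employee table is laid out on disk; each sees only the columns relevant to the job, and both views draw on the same stored row.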
How a DBMS Solves the Problems of the Traditional File Environment
A DBMS reduces data redundancy and inconsistency by minimizing isolated files
in which the same data are repeated. The DBMS may not enable the organization
to eliminate data redundancy entirely, but it can help control redundancy. Even
if the organization maintains some redundant data, using a DBMS eliminates
data inconsistency because the DBMS can help the organization ensure that
every occurrence of redundant data has the same values. The DBMS uncouples
programs and data, enabling data to stand on their own. The description of
the data used by the program does not have to be specified in detail each time
a different program is written. Access and availability of information will be
increased and program development and maintenance costs reduced because
users and programmers can perform ad hoc queries of the database for many
simple applications without having to write complicated programs. The DBMS
enables the organization to centrally manage data, their use, and security. Data sharing throughout the organization is easier because the data are presented
to users as being in a single location rather than fragmented in many different
systems and files.
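As a rough illustration of these points, the sketch below stores enrollment data once, changes a date in a single place, and then answers an unplanned question with one query. It uses Python's sqlite3 module; the table name, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment (student_id INTEGER, course TEXT, date TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [
        (1001, "IS 101", "2024-09-01", "B+"),
        (1002, "IS 101", "2024-09-01", "A"),
        (1001, "IS 220", "2025-01-15", "A-"),
    ],
)

# The date is stored once, so a single update keeps every occurrence consistent.
conn.execute("UPDATE enrollment SET date = '2024-09-08' WHERE course = 'IS 101'")
dates = conn.execute(
    "SELECT DISTINCT date FROM enrollment WHERE course = 'IS 101'"
).fetchall()

# An ad hoc question (enrollments per course) answered without a new program.
counts = conn.execute(
    "SELECT course, COUNT(*) FROM enrollment GROUP BY course ORDER BY course"
).fetchall()

print(dates)
print(counts)
```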
Contemporary DBMS use different database models to keep track of entities,
attributes, and relationships. The most popular type of DBMS today for PCs
as well as for larger computers and mainframes is the relational DBMS.
Relational databases represent data as two-dimensional tables (called relations).
Tables may be referred to as files. Each table contains data on an entity and its
attributes. Microsoft Access is a relational DBMS for desktop systems, whereas
DB2, Oracle Database, and Microsoft SQL Server are relational DBMS for large
mainframes and midrange computers. MySQL is a popular open source DBMS.
FIGURE 6.3 HUMAN RESOURCES DATABASE WITH MULTIPLE VIEWS
A single human resources database provides many different views of data, depending on the
information requirements of the user. Illustrated here are two possible views, one of interest to a
benefits specialist and one of interest to a member of the company’s payroll department.
Let’s look at how a relational database organizes data about suppliers and
parts (see Figure 6.4). The database has a separate table for the entity SUPPLIER
and a table for the entity PART. Each table consists of a grid of columns and
rows of data. Each individual element of data for each entity is stored as a
separate field, and each field represents an attribute for that entity. Fields in
a relational database are also called columns. For the entity SUPPLIER, the
supplier identification number, name, street, city, state, and zip code are stored
as separate fields within the SUPPLIER table and each field represents an
attribute for the entity SUPPLIER.
The actual information about a single supplier that resides in a table is called
a row. Rows are commonly referred to as records, or in very technical terms, as
tuples. Data for the entity PART have their own separate table.
The field for Supplier_Number in the SUPPLIER table uniquely identifies
each record so that the record can be retrieved, updated, or sorted. It is called a
key field. Each table in a relational database has one field that is designated as
its primary key. This key field is the unique identifier for all the information
FIGURE 6.4 RELATIONAL DATABASE TABLES
A relational database organizes data in the form of two-dimensional tables. Illustrated here are tables for the entities SUPPLIER and PART
showing how they represent each entity and its attributes. Supplier_Number is a primary key for the SUPPLIER table and a foreign key for
the PART table.
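The SUPPLIER and PART tables can be sketched with Python's sqlite3 module to show what the primary key and foreign key do in practice. Supplier_Number and the supplier attributes follow the text; the other column names (such as Part_Number and Part_Name), the helper try_insert, and all sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE supplier (
    Supplier_Number INTEGER PRIMARY KEY,   -- key field: unique for every row
    Supplier_Name   TEXT,
    Supplier_Street TEXT,
    Supplier_City   TEXT,
    Supplier_State  TEXT,
    Supplier_Zip    TEXT
);
CREATE TABLE part (
    Part_Number     INTEGER PRIMARY KEY,
    Part_Name       TEXT,
    Supplier_Number INTEGER REFERENCES supplier (Supplier_Number)  -- foreign key
);
""")


def try_insert(sql, params):
    """Run one INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


supplier_sql = "INSERT INTO supplier VALUES (?, ?, ?, ?, ?, ?)"
part_sql = "INSERT INTO part VALUES (?, ?, ?)"

first = try_insert(supplier_sql, (8259, "Acme Supply", "1 Main St", "Dayton", "OH", "45220"))
duplicate = try_insert(supplier_sql, (8259, "Other Co", "2 Oak St", "Dayton", "OH", "45220"))
good_part = try_insert(part_sql, (137, "Hinge", 8259))
orphan_part = try_insert(part_sql, (138, "Bracket", 9999))   # no such supplier

# The shared Supplier_Number lets rows from the two tables be combined.
joined = conn.execute(
    "SELECT Part_Name, Supplier_Name FROM part JOIN supplier USING (Supplier_Number)"
).fetchall()

print(first, duplicate, good_part, orphan_part, sep="\n")
print(joined)
```

The second supplier row is rejected because it repeats a primary key value, and the last part row is rejected because its Supplier_Number does not match any row in the SUPPLIER table.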