Role of the Database Administrator

Typically there are three types of users for a DBMS. They are :

*The End User who uses the application. Ultimately, this is the user who actually puts the data in the system into use in business. This user need not know anything about the organization of data in the physical level. She also need not be aware of the complete data in the system. She needs to have access and knowledge of only the data she is using.

*The Application Programmer who develops the application programs. She has more knowledge about the data and its structure since she has manipulate the data using her programs. She also need not have access and knowledge of the complete data in the system.

*The Database Administrator (DBA) who is like the super-user of the system. The role of the DBA is very important and is defined by the following functions.

Defining the Schema

*The DBA defines the schema which contains the structure of the data in the application. The DBA determines what data needs to be present in the system ad how this data has to be represented and organized.

*Liaising with Users

The DBA needs to interact continuously with the users to understand the data in the system and its use.

*Defining Security & Integrity Checks

The DBA finds about the access restrictions to be defined and defines security checks accordingly. Data Integrity checks are also defined by the DBA.

*Defining Backup / Recovery Procedures

The DBA also defines procedures for backup and recovery. Defining backup procedures includes specifying what data is to backed up, the periodicity of taking backups and also the medium and storage place for the backup data.

*Monitoring Performance

The DBA has to continuously monitor the performance of the queries and take measures to optimize all the queries in the application.

Approaches to Data Management

File-Based Systems:
Conventionally, before the Database systems evolved, data in software systems was stored in and represented using flat files.

Database Systems:

Database Systems evolved in the late 1960s to address common issues in applications handling large volumes of data which are also data intensive. Some of these issues could be traced back to the following disadvantages of File-based systems.

Drawbacks of File-Based Systems


As shown in the figure, in a file-based system, different programs in the same application may be interacting with different private data files. There is no system enforcing any standardized control on the organization and structure of these data files.

Data Redundancy and Inconsistency

Since data resides in different private data files, there are chances of redundancy and resulting inconsistency. For example, in the above example shown, the same customer can have a savings account as well as a mortgage loan. Here the customer details may be duplicated since the programs for the two functions store their corresponding data in two different data files. This gives rise to redundancy in the customer's data. Since the same data is stored in two files, inconsistency arises if a change made in the data in one file is not reflected in the other.

Unanticipated Queries

In a file-based system, handling sudden/ad-hoc queries can be difficult, since it requires changes in the existing programs.
Data Isolation

Though data used by different programs in the application may be related, they reside in isolated data files.

Concurrent Access Anomalies

In large multi-user systems the same file or record may need to be accessed by multiple users simultaneously. Handling this in a file-based systems is difficult.

Security Problems

In data-intensive applications, security of data is a major concern. Users should be given access only to required data and not the whole database. In a file-based system, this can be handled only by additional programming in each application.

Integrity Problems

In any application, there will be certain data integrity rules which needs to be maintained. These could be in the form of certain conditions/constraints on the elements of the data records. In the savings bank application, one such integrity rule could be “Customer ID, which is the unique identifier for a customer record, should be non-empty”. There can be several such integrity rules. In a file-based system, all these rules need to be explicitly programmed in the application program.

It may be noted that, we are not trying to say that handling the above issues like concurrent access, security, integrity problems, etc., is not possible in a file-based system. The real issue was that, though all these are common issues of concern to any data-intensive application, each application had to handle all these problems on its own. The application programmer needs to bother not only about implementing the application business rules but also about handling these common issues.

Types of Database Systems


Database Systems can be catagorised according to the data structures and operators they present to the user. The oldest systems fall into inverted list, hierarchic and network systems. These are the pre-relational models.

* In the Hierarchical Model, different records are inter-related through hierarchical or tree-like structures. A parent record can have several children, but a child can have only one parent. In the figure, there are two hierarchies shown - the first storing the relations between CUSTOMER, ORDERS, CONTACTS and ORDER_PARTS and the second showing the relation between PARTS, ORDER_PARTS and SALES_HISTORY. The many-to-many relationship is implemented through the ORDER_PARTS segment which occurs in both the hierarchies. In practice, only one tree stores the ORDER_PARTS segment, while the other has a logical pointer to this segment. IMS (Information Management System) of IBM is an example of a Hierarchical DBMS.






*In the Network Model, a parent can have several children and a child can also have many parent records. Records are physically linked through linked-lists. IDMS from Computer Associates International Inc. is an example of a Network DBMS.






* In the Relational Model, unlike the Hierarchical and Network models, there are no physical links. All data is maintained in the form of tables consisting of rows and columns. Data in two tables is related through common columns and not physical links or pointers. Operators are provided for operating on rows in tables. Unlike the other two type of DBMS, there is no need to traverse pointers in the Relational DBMS. This makes querying much more easier in a Relational DBMS than in the the Hierarchical or Network DBMS. This, in fact, is a major reason for the relational model to become more programmer friendly and much more dominant and popular in both industrial and academic scenarios. Oracle, Sybase, DB2, Ingres, Informix, MS-SQL Server are few of the popular Relational DBMSs.


The recent developments in the area have shown up in the form of certain object and object/relational DBMS products. Examples of such systems are GemStone and Versant ODBMS. Research has also proceeded on to a variety of other schemes including the multi-dimensional approach and the logic-based approach.


*The External Level represents the collection of views available to different end-users.
*The Conceptual level is the representation of the entre information content of the database.
*The Internal level is the physical level which shows how the data data is stored, what are the representation of the fields etc.

Storing Images and Surfaces within Spatial DBMS

Manifold provides two ways of storing images and surfaces within a DBMS:

Oracle GeoRaster - Connecting via OCI (Oracle's native Oracle Call Interface) to an Oracle database that has GeoRaster capability (Oracle Spatial or Oracle Enterprise) allows us to store images within Oracle's native GeoRaster data type. See the Example: Storing an Image in Oracle topic.

Generic image storage: Connecting via ODBC or OLE DB to almost any DBMS that allows binary storage allows us to store images and surfaces (which are stored as images with an additional Height channel). Images are stored as tiles, with the export process allowing choice of a tile size. Exporting an image to such a data source creates a table with a record for each image tile. Image tiles are stored in compressed or uncompressed BGRA format. Importing or linking images from such a data source automatically reassembles tiles. Surfaces cannot be linked, but can be imported by importing the Height channel of the image that is stored for them into the database. See the Example: Storing an Image in Manifold Spatial DBMS and Example: Storing a Surface in Manifold Spatial DBMS topics.

Manifold uses Database Console to connect to a database storing images to import or link to a stored image. Database console will recognize tables containing image tiles and will display them as images, allowing us to import or link to the image by using the database console's Import or Link buttons. We can also import individual channels of such images as surfaces.

Storing Drawings within Spatial DBMS

Manifold features integrated read / write / edit, multiuser connections to a variety of spatial DBMS packages. In general, operation is fully automatic: when a Manifold drawing is uploaded to a spatial DBMS the Manifold geometry will automatically be converted into the native DBMS geometry type. When a drawing is linked or imported from a spatial DBMS source the native DBMS geometry will be automatically cast into Manifold geometry.

Manifold provides two ways of storing drawings within spatial DBMS:

Native storage within a spatial DBMS::- Examples are Oracle Spatial, IBM DB2, Microsoft SQL Server 2008 (CTP5 or greater) or PostgreSQL using the native geometry type specified by the DBMS. Connecting to one of these spatial DBMS packages using the "native" connection, such as OCI for Oracle, tells Manifold to automatically convert local Manifold geometry for drawings into whatever the target spatial DBMS users as its own native geometry type.

Non-native storage within almost any DBMS::

Manifold can confer spatial DBMS capability to almost any DBMS. When using Manifold-managed spatial storage, we have the choice of what geometry type we would like to use. Examples are storing drawings as tables within, say, MySQL or SQL Server using any of the binary geometry types supported by Manifold, such as GeomWKB (OGC WKB), GeomSHP (legacy ESRI geodatabases), or GEOMETRY (Manifold's own high performance geometry type). Normally the choice will be made between Manifold Geometry type or OGC WKB type. In addition, Manifold can establish a generic spatial index. The spatial index is used together with the binary data to allow "ordinary" DBMS packages to function as a spatial DBMS. A special case of Manifold-managed storage is the use of the Manifold Spatial Extender for SQL Server 2005 to provide especially fast spatial DBMS capability within SQL Server 2005. See the Example: Storing a Drawing in Manifold Spatial DBMS topic.

Some New things about DBMS

They are used to share data on a network. DBMS standards for distributed databases:
SQL - Structured Query Language is a database access language. It is used by most client/server database applications.
ODBC - Open Database Connectivity (ODBC) from Microsoft lets application developers integrate database connections in applications. It is an application programming interface (API). ODBC drivers convert an application's query into SQL and send it to the database engine program.
DRDA - Distributed Relational Database Architecture is from IBM.

When information is processed in a distributed database, it is called a transaction. The two phases of a transaction are:
Write or Update - The data is temporarily updated. An abort can cancel what this phase did by removing the changed data from a temporary storage area.
Commit - The changed data is made permanent in the database.

Databases store multiple copies of the data which is called replication. They must be sure the various copies of the database on various servers is accurate with identical data. Data is also partitioned into smaller blocks of data.

Database Management Systems (DBMS)

They are used to share data on a network. DBMS standards for distributed databases:
SQL - Structured Query Language is a database access language. It is used by most client/server database applications.
ODBC - Open Database Connectivity (ODBC) from Microsoft lets application developers integrate database connections in applications. It is an application programming interface (API). ODBC drivers convert an application's query into SQL and send it to the database engine program.
DRDA - Distributed Relational Database Architecture is from IBM.

When information is processed in a distributed database, it is called a transaction. The two phases of a transaction are:
Write or Update - The data is temporarily updated. An abort can cancel what this phase did by removing the changed data from a temporary storage area.
Commit - The changed data is made permanent in the database.

Databases store multiple copies of the data which is called replication. They must be sure the various copies of the database on various servers is accurate with identical data. Data is also partitioned into smaller blocks of data.

Hack Technology

Hack has several meanings in the technology and computer science fields: a clever or quick fix to a computer program problem; a clumsy or inelegant solution to a problem; or a modification of a program or device to give the user access to features that were otherwise unavailable to them.

Technorati Profile