Suprtool

SAPdb - Enterprise-Class, Open Source Database

By Aaron Holmes, Robelle Junior Programmer

In 2001 the source code for SAPdb was released to the general public, completing SAP's roadmap for releasing their database software under a GNU/GPL user license. By doing this SAP has brought a fully-featured, enterprise-class database system to the open source community. SAP continues to update and support SAPdb through their website (www.sapdb.org) and an active mailing list (www.sapdb.org/sap_db_contact.htm).

Enough with the history lesson for now - my real goal is to dive into SAPdb and see what it offers in real world terms. By the end of this paper the following will be addressed:

  1. Brief look at some of the features found in SAPdb
  2. Install and configure SAPdb
  3. Create a new database and table set with test data
  4. Access SAPdb with the online web interface package
  5. Compare SAPdb with other database engines (mySQL, PostgreSQL, etc.)

Features Found in SAPdb

SAPdb is designed for the business enterprise, with 24x7 uptime, scalability and high performance in mind. There are no limitations on database sizes or on the number of users. SAPdb is ACID compliant (fully supports ISO-SQL 92 Standards) and includes all RDBMS and enterprise features expected in an open DBMS such as Views, triggers, foreign keys, constraints of various kinds, stored procedures, versioning, hot backups, etc. This allows SQL applications written in other databases to be easily portable to SAPdb. SAPdb also includes a C/C++ precompiler and interfaces for Perl, Python, and PHP script languages. For COBOL and Fortran, you will need to consult your compiler source for an SQL pre-compiler. ODBC and JDBC drivers are supported for connectivity to Windows based and Java based applications respectively.

SAP also distributes a web-based interface for administering databases and performing SQL duties.

Installation and Configuration

SAPdb is available for download from the SAPdb website (www.sapdb.org). It's a free download and various platforms are supported (Linux, Solaris, WindowsNT/2000, HP-UX, and a few others). Also available from the webpage are the web interface package tools and the ODBC/JDBC drivers. There are also a few more tools available including SQL Studio for Windows users.

My initial attempts at installing SAPdb were on a test box running FreeBSD 4.5 Stable. However I could not get SAPdb to compile properly, even under the FreeBSD Linux emulation support. After searching around for a bit I found some information regarding a FreeBSD port that was underway. Unfortunately I wasn't able to find much more information than that. The status of the project/availability still remains somewhat of a mystery to me.

Instead of focusing too much time on the FreeBSD port, I used my laptop as the test computer, running Mandrake 8.1 LINUX on a Pentium II 233 cpu, with 160mb of ram. After downloading the Linux(Intel) SAPdb server source files and web interface files, I followed the instructions included in the online help files (linked off sapdb.org - in the menu on the right).

I will make a note here while I am mentioning the documentation - the manuals for SAPdb are a bit messy, and sometimes hard to follow. I personally found the online documentation much easier to use than downloading all of the PDF's from sapdb.org. I found myself often searching for a particular answer and having to dig through the various manuals one by one to find it.

The installation steps were fairly simple. After extracting the tar file I used the included SDBINST program to install SAPdb with default configuration and path settings. After this was finished I installed the web tools interface package, again with the aid of the documentation.

I left everything as default, except I changed the port setting to 9999 in WebAgent73.ini (which is found in the /web/config directory). By leaving everything as default I had chosen to run the web tools software under the included SAPdb web server rather than my own web server (Apache, etc.). To run the web tools web server I executed wahttp (found in /web/pgm). This starts the server and allows access to the database from the web (remember to do an 'export LD_LIBRARY_PATH=/web/lib' command before running wahttp, otherwise wahttp will complain about missing library files).

It is also possible to integrate web tools with an existing Apache setup. This requires a few changes in the httpd.conf file that are explained in the installation manual for the SAPdb Web Tools.

With this finished, I was ready to test out the new database software. What I needed now was a new database. As an aid, SAP includes a test script which automatically makes a test database with users for you.

Creating the New Database and Tables

The easiest way to create a test database is to run the included script named create.demo_db.sh (which is in the /depend/misc directory). Running this script creates a new database named TST, plus new users. The first username is DBM, with a password of DBM, is the database administration (DBM has privileges that allow him to use the SAPdb dbmcli tool). Remember, DMB does not perform SQL queries and functions on the database. A separate username TEST, with a password of TEST, is created for SQL purposes.

I connected to the new database via the web SQL interface by pointing my browser to http://localhost:9999/websql (this is the URL for the SQL tool - /webdbm gives you the database administration tool). Using the database name of TST, username TEST, and password TEST, I logged into the SQL front end for the database. The interface is fairly clean and easy to use. One complaint I have is that the web tools uses JavaScript and CSS, which can cause some issues in older browsers.

With the browser running the SQL interface, I used the 'show table' SQL statement to see the tables in the TST database. SAPdb returned a listing of all the tables associated with the database, including the system and configuration tables. There's quite a few of them and if you look under the owner column you can see who they belong to. DBA and DOMAIN own the tables for database administration and configuration, and SYS owns the tables for system settings.

Since none of these tables belong to user TEST, we shouldn't be mucking around with them. Creating a new table will give us something to play with. Using the standard 'create table {…}' SQL syntax, I was able to create a new table called alumni. I won't go into detail about each of the SQL commands that I used. If you do need help with them, you can refer to the "SQL statement: overview" chapter in the SAPdb reference manual. Next, I put in a small amount of test data using 'INSERT'. This also went smoothly, so I decided to try loading bulk data from a comma-delimited file.

Here things became a little tricky… I had previously used the 'load data' statement in mySQL and the 'copy' command in PostgreSQL, but I wasn't sure where to look for the SAPdb implementation. The difficulty was in finding the proper section of the documentation for importing data from external file sources. I finally found a tool named repmcli, which is included with SAPdb in the /depend/bin folder. The documentation states that it can load and unload data from and into external files.

With a little patience I managed to load the test data into the table, although I had to edit my comma delimited file to put double quotes around each data field. Double quotes are a SAPdb default, and can be changed to a different character if necessary. If you are using Robelle's STExport tool to export the data, use the 'quote double' command.

The repmcli command includes a rich set of features; it can accomplish some fairly complicated things by executing a batch file. The batch file includes specific commands and SQL statements which tell SAPdb how to load the data. In my case I used a simple batch file, which I created in VI and named command.dat with the following code:

	FASTLOAD TABLE alumni
		Id 1
		Birthdate 2
		Title 3
	INFILE 'testdata.txt'

In this batch file, alumni is the test table, and testdata.txt is our comma delimited data file with double quotes surrounding each field entry.

SAPdb Evaluation/Comparison with Other Databases

SAPdb is the number one open source database in respect to features, and is definitely geared towards the enterprise server market. The result is a very high learning curve for SAPdb. I often found myself spending far too much time just looking for a certain command that I knew must exist, but could not easily find. The documentation is there, but is hard to navigate and I could not find any third-party books written about the database engine of SAP.

I find it difficult to compare SAPdb to mySQL. The two offer separate solutions to different problems. The learning curve with mySQL is far smaller than that of SAPdb. However mySQL does not have the features of SAPdb. Ease of use and availability is the reason mySQL is such a popular choice for backend database operations. For most web applications it does its job perfectly and with mySQL max and mySQL v4.x on the way and support for transactions and innodb tables, mySQL will offer even more bang for the buck. Performance-wise mySQL is very fast with small sites, in fact one of the fastest databases available. However as soon as you increase the number of concurrent users, mySQL starts to slow down exponentially and gets blown away by both SAPdb and PostgreSQL. If your site is small and the need for advanced SQL functions is not needed, mySQL will more than fit the bill. When you weigh in the fact that the average programmer could easily pick it up in a day or two, mySQL begins to look especially attractive.

Comparing SAPdb with PostgreSQL is a little bit easier. Both databases have many of the advanced features found in enterprise level database solutions, and focus on features/performance. PostgreSQL has a slightly higher learning curve than mySQL, but is nowhere near as hard to learn as SAPdb. It also comes with online documentation and many third-party publishers have books relating directly to PostgreSQL. For my needs I found that I could do everything I had to in PostgreSQL, and in half the time or less than it took me in SAPdb. For a user discussion of SAPdb vs PostgreSQL, read this Slashdot thread.

The real target audience for SAPdb is in the enterprise market, where behemoths like Oracle have dominated the market for years. SAPdb strives to offer all of the features available in Oracle with high performance, but for free. This sounds a bit too good to be true, and at this current moment I think SAPdb still has to make some progress if they want to convince Oracle administrators and users that their product is not only superior, but also free.

While SAPdb supports many of the features found in Oracle, I would be a little hesitant before replacing any existing Oracle backends with SAPdb. The reason I come to this conclusion is that Oracle is built on mature code, which has been used in large scale production sites for many moons. Oracle has a larger group of users who have been demanding top notch stability and performance from their databases since day one. SAPdb on the other hand is a bit of a newcomer. Although it has been around for some time (originally Adabas D), SAPdb really hasn't been used for much other than the SAP R/3 framework. Many of the bugs still haven't been found and are only being exposed now as SAPdb grows in popularity and maturity. Your best bet by far is to try out SAPdb on a test server, with test data and see if it offers what you really need.


SAPdb Datatypes

If you are thinking of using SAPdb as your migration target, you need to understand the datatypes available to you. They are similar to those in Oracle, SQL Server and other SQL databases.

Character Datatypes:

Varchar[n]: Allows for variable data length. You can set the maximum field size for the column with n. n can be between 0 and 8000, or left off altogether (defaults to 1).

Char[n]: Data is stored with a fixed length of n. n can take on any value between 0 and 8000, and if not supplied it is assumed to be 1.

If you are migrating a TurboIMAGE X or U field, you select either Varchar or Char. When you return the SQL data into your COBOL program, you must specify a fixed-size field, so either datatype will work. However, many people use Varchar by preference because it allows the database the option of optimizing the space for the values.

Numeric Datatypes:

Int[]: This data type is the same as FIXED(10.0). Its permitted values are between -2147483648 and 2147483647.

SmallInt[]:This data type is the same as FIXED(5.0). Its permitted values are between -32768 and 32767.

Float[p]: A column with a floating point number with precision p (0 Fixed [p,s]: A column is created with a fixed point number with precision p and with s number of decimal places. If s is not supplied it is assumed there are 0 decimal places. P can take values from 0 to 38 and s must be less than or equal to p.

When converting a TurboIMAGE I1 field, select Smallint. For I2 select Int. Notice that there is no 8-byte integer type, so I4 (PIC S9(10) COMP) must be converted to Fixed in the database and COMP-3 (packed decimal) in COBOL. For J and K fields, use the same types as for I.

TurboIMAGE E types convert to Float, while R types are deprecated (this is the old floating point of the Classis HP 3000). N and P convert to Fixed.

Date and Time Datatypes:

For a more in-depth look at the following datatypes see the section Date and Time Formats later on.

Date: Allows you to store special character strings which describe a date value.

Time: Allows you to store special character strings which describe a time value.

Timestamp: A timestamp consists of a date and time value and a microsecond specification.

Boolean Datatype:

Boolean: Column accepts values of TRUE, FALSE, and NULL only.

Other Datatypes:

In addition to the data types defined above, the following data types are permitted in a column definition and are mapped as follows to the data types below:

Data Type Is Mapped To
DEC[IMAL](p,s) FIXED(p,s)
DEC[IMAL](p) FIXED(p)
DEC[IMAL] FIXED(5)
BINARY(p) FIXED(p)
FLOAT FLOAT(16)
FLOAT(39..64) FLOAT(38)
DOUBLE PRECISION FLOAT(38)
REAL(p) FLOAT(p)
REAL FLOAT(16)
LONG VARCHAR LONG
SERIAL FIXED(10) DEFAULT SERIAL
SERIAL(p) FIXED(10) DEFAULT SERIAL(p)

Adding UNICODE to the column declaration:

You can add the UNICODE clause to your character datatypes (available for CHAR, VARCHAR and LONG[VARCHAR]) to allow SAPdb to map various presentation codes to the Unicode format. This comes with the drawback of requiring approximately twice as much memory for the column and decreases the maximum value for n to 4000. However, it allows you to store European and Asian characters in a standard way.

Adding LONG to the column declaration:

In special cases it may be beneficial to use a LONG datatype. A LONG column is a data type that contains a sequence of characters (LONG varchar/char) or bytes (LONG byte) of any length to which no functions can be applied. LONG columns cannot be compared to one another. The contents of LONG columns cannot be compared to character strings or other data types. A LONG column can be useful when wanting to store JPG/GIF images (by using a LONG BYTE specified column) or very large portions of data (ie. LONG varchar columns can hold up to 2gb of data) and resembles the BLOB datatype which is found in other SQL database systems.

Memory Requirements of Columns based on Datatypes:

Data Type Memory Requirements of a Column Value in Bytes for This Data Type
FIXED(p,s) (p+1) DIV 2 + 2
FLOAT(p) (p+1) DIV 2 + 2
BOOLEAN 2
DATE 9
TIME 9
TIMESTAMP 21
LONG 9
CHAR(n); n<=30 n+1
CHAR(n); 30<n<=254; key column n+1
CHAR(n); 30<n<=254; not key column n+2
CHAR(n); 254<n n+3
CHAR(n) UNICODE; n<=15 2*n+1
CHAR(n) UNICODE; 15<n<=127; key column 2*n+1
CHAR(n) UNICODE; 15<n<=127; not key column 2*n+2
CHAR(n) UNICODE; 127<n 2*n+3
VARCHAR(n); 30<n<=254; key column n+1
VARCHAR(n); 30<n<=254; not key column n+2
VARCHAR(n); 254<n n+3
VARCHAR(n) UNICODE; 15<n<=127; key column 2*n+1
VARCHAR(n) UNICODE; 15<n<=127; not key column; 2*n+2
VARCHAR(n) UNICODE; 127<n 2*n+3

Date and Time Formats

The date and time format (datetimeformat) specifies the representation of date values, time values, and timestamp values. It determines the format in which the date, time, and timestamp values may be represented in SQL statements and the way in which results are to be displayed. The date and time format is determined when the database system is installed.

Users can change the date and time format for the current session by setting the relevant parameters in the database tools or by specifying the corresponding parameters when using programs.

There are five formats:

   <datetimeformat> ::= EUR | INTERNAL | ISO | JIS | USA
The ISO date and time format is used by ODBC and JDBC applications and cannot be replaced with a different date and time format.

Date value

'YYYY' Four-digit year format
'MM' Two-digit month format (01-12)
'DD' Two-digit day format (01-31)

Date Value Examples:

Format General Format Example
EUR 'DD.MM.YYYY' '23.01.1999'
INTERNAL 'YYYYMMDD' '19990123'
ISO/JIS 'YYYY-MM-DD' '1999-01-23'
USA 'MM/DD/YYYY' '01/23/1999'

In all formats, with the exception of INTERNAL, leading zeros may be omitted in the identifiers for the month and day.

If you are transforming dates from a TurboIMAGE database to load a SAPdb table, you may have to adjust the date format to make it match the format you selected for SAPdb. For example, if your dates are stored in TurboIMAGE as J2 (binary) with a format of YYYYMMDD and your SAPdb format is USA, then you could do the conversion with these Suprtool and STExport commands:

Suprtool Commands:

Get salesdetail
Item salesdate,date,yyyymmdd
Extract customer,salesdate,item,quantity
Output tempsales,temp,link
Xeq

STExport Commands:

input tempsales
quote double
date  mmddyyyy "/"
output sapdbsales,temp
xeq

Time value

'HHHH' Four-digit hour format
'HH' Two-digit hour format
'MM' Two-digit minute format (00-59)
'SS' Two-digit second format (00-59)

Time Value Examples:

Format General Format Example
EUR 'HH.MM.SS' '14.30.08'
INTERNAL 'HHHHMMSS' '00143008'
ISO/JIS 'HH:MM:SS' '14:30:08'
USA 'HH:MM AM (PM)' '2:30 PM'

In all time formats, the identifier of the hour must consist of at least one digit. In the USA time format, the minute identifier can be omitted completely. In all the other formats, with the exception of INTERNAL, the minute and second identifiers must comprise at least one digit.

Timestamp value

'YYYY' Four-digit year format
'MM' Two-digit month format (01-12)
'DD' Two-digit day format (01-31)
'HH' Two-digit hour format (0-24)
'MM' Two-digit minute format (00-59)
'SS' Two-digit second format (00-59)
'MMMMMM' Six-digit microsecond format

Timestamp value Examples:

Format General Format Example
EUR/JIS/USA 'YYYY-MM-DD-HH.MM.SS.MMMMMM' '1999-01-23-14.30.08.456234'
ISO 'YYYY-MM-DD HH:MM:SS.MMMMMM' '1999-01-23 14:30:08.456234'
INTERNAL 'YYYYMMDDHHMMSSMMMMMM' '19990123143008456234'

The microsecond identifier can be omitted in all timestamp formats. In all formats, with the exception of INTERNAL, the month and day identifiers must consist of at least one digit. For INTERNAL the month and day identifiers must have exactly two digits.

SAPdb Resources:

SAPdb Online Documentation: http://www.sapdb.org/sap_db_documentation.htm

Additional reading on the Robelle web site: