flibs/odbc - Interface to ODBC
The ODBC module provides a Fortran interface to the Open Database Connectivity system or ODBC. The interface has been implemented in such a way, that you can use a high-level interface for common tasks, such as inserting data into a database and querying the contents, as well as lower-level functionality, accessible via SQL statements, for instance.
To this end the module defines a set of routines and functions as well as several derived types to hide the low-level details.
In its current form, it does not provide a full Fortran API to all the functionality offered by SQLite, but it should be quite useable.
Note: This interface has been modelled after the Fortran SQLite interface in this same project. Because ODBC is not a database management system in its own right, but instead an common interface to various database systems, several additional routines are available, such as odbc_get_driver, that have no equivalent within the context of SQLite.
Note: While ODBC is intended to provide a generic interface to database management systems, there are still a number of issues that you should be aware that depend on the operating system and the specific database management system.
These issues are documented in PLATFORM ISSUES.
The following derived types are defined:
Variables of this type are used to hold the connection to the database or databases. They are created by the subroutine odbc_open
The contents are valid until the database is closed (via odbc_close).
Variables of this type hold prepared statements, the common method for database management systems to efficiently execute SQL statements.
To provide easy communication with the database, ODBC_COLUMN can hold values of different types. This means you can use a single routine and variable to transfer strings, integers or reals to and from the database.
The first two derived types are "opaque", that is they are used only to communicate between the application and the database library and there is information of interest to be gotten from them.
The third type is rather crucial to the working of the implementation: By setting the properties of an ODBC_COLUMN variable you put data into the database or you can retrieve data from the database. See the example below for how this works.
There are a number of routines that are meant to make this easier:
Set the properties of a column
The variable that holds the information on the column
Name of the column in the table to which it belongs or will belong
Type of the column: one of ODBC_INT, ODBC_REAL, ODBC_DOUBLE, ODBC_CHAR or ODBC_BINARY (see PLATFORM ISSUES).
Length of a character-valued column (defaults to 20 characters) or a BLOB-type column.
Set the properties of a column when constructing a SELECT query. The "function" argument, if present, is a string representing an SQL function like count or max.
The variable that holds the information on the column
Name of the column in the table to which it belongs or will belong
Type of the column: one of ODBC_INT, ODBC_REAL, ODBC_DOUBLE, ODBC_CHAR or ODBC_BINARY.
Length of a character-valued column (defaults to 20 characters)
Name of the SQL function to perform on the values.
Set the value of a column
The variable that holds the information on the column
The new value for the column. The type of the value that is passed can be integer, real, double precision real or character string.
Note: currently there is no conversion from the type of value that is stored to the type of the actual variable that is passed to the routine. If you ask for an integer and the column holds a real, then the result is undefined. Check the type with the value of the flag "type_set". (This is one of the things that should be improved)
Get the value of a column
The variable that holds the information on the column
The value stored in the column. The type of the value that is passed can be integer, real, double precision real or character string.
The odbc module currently provides the following functions:
Open a database by data set name or by file name and driver and store the connection for later use.
The name of the data set (DSN, as known to the ODBC system) or the database file to be opened.
The name of the driver, currently one of the parameters ODBC_MSACCESS, ODBC_MSEXCEL, ODBC_SQLITE or ODBC_POSTGRES (see PLATFORM ISSUES). If left out, the name is supposed to be a data set name (DSN).
Variable to identify the database connection
Open a connection to the database of choice via a full connection string. This routine is useful if you want specific connection parameters or if the driver is not directly supported.
The connection string to be used. It must contain all information required (see the documentation on the particular ODBC driver).
Variable to identify the database connection
Close the database connection. Simply an interface to the corresponding C function.
Variable identifying the database connection
Retrieve whether the previous command resulted in an error or not. Returns true if so, otherwise false.
Variable identifying the database connection
Set the type of support for BLOBs (see PLATFORM ISSUES). Use this if you intend to use BLOBs (binary large objects).
Variable identifying the database connection.
Identify how the dabase management system supports BLOBs:
ODBC_PLAIN_BLOB - (default) the database system uses the keyword "BLOB" to indicate binary large objects and the ODBC driver simply returns a set of bytes.
ODBC_POSTGRES_HEX - the database system (notably PostgreSQL) uses the keyword "BYTEA" to indicate binary large objects and the ODBC driver returns a hexdecimally encoded string instead of a set of bytes.
Retrieve the last error message as a string of at most 80 characters.
Variable identifying the database connection or the statement that produced the error.
Print the last error messages to the screen or to a file
Variable identifying the database connection or the statement that produced the error.
LU-number of the file to print the messages to. If not present, the messages are printed to the screen.
Run a single SQL command
Variable identifying the database connection
String holding a complete SQL command
Start a transaction. When the corresponding routine odbc_commit is called, all changes will be made permanent. Use a transaction to gather lots of changes to the database - this is much faster than an automatic commission after each change.
Note: The database driver may or may not support this feature. Consult the documentation.
Variable identifying the database connection
Commit the changes made since the start of a transaction. This makes the changes permanent.
Variable identifying the database connection
Undo the changes made since the start a transaction. The database will be restored to the state it was in before the transaction was started.
Variable identifying the database connection
Create a new table, based on the properties of the columns. Convenience routine that constructs an SQL statement to do the actual job.
Variable identifying the database connection
Name of the table to be created
An array of the properties of the columns in the tables (name, type, ...)
Name of the column that acts as the primary key (this gets the "unique" constraint)
Delete an existing table by name. Convenience routine that constructs an SQL statement to do the actual job.
Variable identifying the database connection
Name of the table to be deleted
Prepare a SELECT query. Convenience routine that creates the SQL query and "compiles" (prepares) it for later actual execution.
Variable identifying the database connection
Name of the table to be queried
An array of the properties of the columns to be returned
A derived type used as a handle to the prepared statement
A string holding an extra clause, such as "SORT BY" or "GROUP BY"
Prepare a general SQL statement for later actual execution. The statement can be any SQL statement.
Variable identifying the database connection
The SQL statement to be prepared
A derived type used as a handle to the prepared statement
An array of the properties of the columns that will be returned by the statement. The routine returns an allocated array. You must deallocate it yourself, when it is no longer needed.
Run the prepared SQL statement for one step. The code in completion will tell whether it was successful or not. Simply an interface to the equivalent C routine.
A derived type used as a handle to the prepared statement
One of the values ODBC_DONE (success), ODBC_MISUSE or ODBC_ERROR
Reset the prepared statement so that it can be used again.
A derived type used as a handle to the prepared statement
Free all resources associated with the prepared statement.
A derived type used as a handle to the prepared statement
Retrieve the next row of a SELECT query. If the argument "finished" is set to true, the previous row was the last one.
A derived type used as a handle to the prepared statement
Set to true if the last row was retrieved.
Insert a complete new row into the table.
Variable identifying the database connection
Name of the table into which the row must be inserted
An array of values for all columns
Get the result of a query in a single two-dimensional array
NOT IMPLEMENTED YET
Variable identifying the database connection
The SQL command (query) to executed
A two-dimensional array that will be filled with the results of the SQl command. When done, you will have to deallocate it.
If there is an error, then "result" will not be allocated, and "errmsg" will contain the information about the error that occurred.
Query the structure of the table
Variable identifying the database connection
Name of the table to be inspected
An array with the properties of all columns. Deallocate it when you are done.
The following routines are specific to ODBC:
Get the first (next = .false.) or the next (next = .true.) data set name.
Whether to get the first or the next data set name
Name of the data set
Description of the data set (usually includes the driver)
Whether there is a data set name returned or not
Get the first (next = .false.) or the next (next = .true.) registered driver.
Whether to get the first or the next driver
Name of the driver
Description of the driver
Whether there is a driver name returned or not
Get information on the first (next = .false.) or the next (next = .true.) table in a database.
Whether to get the first or the next table name
Name of the table
Description of the table (at least 5 elements). The fourth element is the type of table (SYSTEM_TABLE, TABLE or VIEW).
Whether there is a driver name returned or not
To illustrate the usage of the library, here is a small example:
Store (fictitious) measurements of salinity and temperature from a CSV file in a single table of a new database.
To check that it works, retrieve the average salinity and average temperature per station and print them sorted by station name
The first part of the program simply defines the table:
allocate( column(4) ) call odbc_column_props( column(1), name(1), ODBC_CHAR, 10 ) call odbc_column_props( column(2), name(2), ODBC_CHAR, 10 ) call odbc_column_props( column(3), name(3), ODBC_REAL ) call odbc_column_props( column(4), name(4), ODBC_REAL ) call odbc_create_table( db, 'measurements', column )
The second part reads the data file and stores the data in a new row:
call odbc_begin( db ) do read( lun, *, iostat=ierr ) station, date, salin, temp if ( ierr .ne. 0 ) exit call odbc_set_column( column(1), station ) call odbc_set_column( column(2), date ) call odbc_set_column( column(3), salin ) call odbc_set_column( column(4), temp ) call odbc_insert( db, 'measurements', column ) enddo close( lun ) call odbc_commit( db )
Note that it uses a transaction (via calls to odbc_begin and odbc_commit pair), so that all the inserts can be done in one go. Inserting with autocommit is much slower, as the database file needs to be flushed very time.
The last part retrieves the data by constructing an SQL query that will actually look like:
select station, avg(salinity), avg(temperature) from measurements grouped by station order by station;
The routine odbc_prepare_select takes care of the actual construction of the above SQL query:
deallocate( column ) allocate( column(3) ) call odbc_column_query( column(1), 'station', ODBC_CHAR ) call odbc_column_query( column(2), name(3), ODBC_REAL, function='avg' ) call odbc_column_query( column(3), name(4), ODBC_REAL, function='avg' ) call odbc_prepare_select( db, 'measurements', column, stmt, & 'group by station order by station' ) write( *, '(3a20)' ) 'Station', 'Mean salinity', 'Mean temperature' do call odbc_next_row( stmt, column, finished ) if ( finished ) exit call odbc_get_column( column(1), station ) call odbc_get_column( column(2), salin ) call odbc_get_column( column(3), temp ) write( *, '(a20,2f20.3)' ) station, salin, temp enddo
The full program looks like this (see also the tests/examples directory of the Flibs project):
! csvtable.f90 -- ! Program to read a simple CSV file and put it into a ! SQLite database, just to demonstrate how the Fortran ! interface works. ! ! To keep it simple: ! - The first line contains the names of the four columns ! - All lines after that contain the name of the station ! the date and the two values. ! ! $Id: fodbc.man,v 1.1 2010/01/17 10:13:50 arjenmarkus Exp $ ! program csvtable use odbc implicit none type(ODBC_DATABASE) :: db type(ODBC_STATEMENT) :: stmt type(ODBC_COLUMN), dimension(:), pointer :: column integer :: lun = 10 integer :: i integer :: ierr character(len=40), dimension(4) :: name real :: salin real :: temp character(len=40) :: station character(len=40) :: date logical :: finished ! ! Read the CSV file and feed the data into the database ! open( lun, file = 'somedata.csv' ) read( lun, * ) name call odbc_open( 'somedata.db', db ) allocate( column(4) ) call odbc_column_props( column(1), name(1), ODBC_CHAR, 10 ) call odbc_column_props( column(2), name(2), ODBC_CHAR, 10 ) call odbc_column_props( column(3), name(3), ODBC_REAL ) call odbc_column_props( column(4), name(4), ODBC_REAL ) call odbc_create_table( db, 'measurements', column ) ! ! Insert the values into the table. For better performance, ! make sure (via begin/commit) that the changes are committed ! only once. ! call odbc_begin( db ) do read( lun, *, iostat=ierr ) station, date, salin, temp if ( ierr .ne. 0 ) exit call odbc_set_column( column(1), station ) call odbc_set_column( column(2), date ) call odbc_set_column( column(3), salin ) call odbc_set_column( column(4), temp ) call odbc_insert( db, 'measurements', column ) enddo close( lun ) call odbc_commit( db ) ! ! We want a simple report, the mean of salinity and temperature ! sorted by the station ! deallocate( column ) allocate( column(3) ) call odbc_column_query( column(1), 'station', ODBC_CHAR ) call odbc_column_query( column(2), name(3), ODBC_REAL, function='avg' ) call odbc_column_query( column(3), name(4), ODBC_REAL, function='avg' ) call odbc_prepare_select( db, 'measurements', column, stmt, & 'group by station order by station' ) write( *, '(3a20)' ) 'Station', 'Mean salinity', 'Mean temperature' do call odbc_next_row( stmt, column, finished ) if ( finished ) exit call odbc_get_column( column(1), station ) call odbc_get_column( column(2), salin ) call odbc_get_column( column(3), temp ) write( *, '(a20,2f20.3)' ) station, salin, temp enddo call odbc_close( db ) end program
The module is not complete yet:
There is no support for blobs or for character strings of arbitrary length. In fact the maximum string length is limited to 80 characters.
There is no support for NULL values or for DATE values.
The ODBC API is not completely covered, though the subset should be useful for many applications.
There are no makefiles that can help build the library yet. See the implementation notes below.
While the module is fairly straightforward Fortran 95 code, building a library out of it may not be straightforward due to the intricacies of C-Fortran interfacing.
This section aims to give a few guidelines:
The C code contains all the platform-dependent code, so that the Fortran code could remain clean.
To support more than one platform, the C code contains several macros:
FTNCALL - the calling convention for Fortran routines (important on Windows). It is automatically set to __stdcall when the macro "WIN32" has been defined (by the compiler or by specifying it on the command-line).
INBETWEEN - this macro controls whether the hidden arguments for passing the string length are put inbetween the arguments (if it is defined) or appended to the end (if it is not defined). Under Windows the Compaq Visual Fortran compiler used to use the first method, so this is automatically set. For other platforms the second method is more usual.
The naming convention (additional underscore, all capitals or all lowercase) has been handled in a simple-minded fashion. This should be improved too.
The library has been designed with 64-bits platforms in mind: it should run on these platforms without any difficulties.
The library has been tested on Linux, using the PostgreSQL database system with the psqlODBC driver.
It is unclear what the proper connection string should be, so that the type ODBC_POSTGRES for the routine odbc_open does not work yet. (PostgreSQL has a client/server architecture and can communicate over TCP/IP, so that more information may have to be specified than for file-based systems like SQLite and MS Access. Use the routine odbc_connect directly, so that you can pass a complete connection string.)
As of version 1.1 the library supports so-called binary large objects (column type: ODBC_BINARY). Not all database systems support them and they are actually an extension to the SQL language that underlies the communication to and from the database system. For this reason it may be necessary to use the routine odbc_set_blob_type to identify the database-specific method used for BLOBs. (It does not seem possible to identify this automatically.)
Copyright © 2012 Arjen Markus <arjenmarkus@sourceforge.net>