INFO 491.NET

.NET Overview - ADO.NET


Connecting To Your Data

Without the ability to store data and retrieve data not held locally in memory our programs would be fairly useless. Searching, updating, and processing data is one of the fundamental tasks of software. The contemporary approach is to use a Relational Database Management System such as:

  • dBase
  • DB2
  • Informix
  • Oracle
  • MySQL
  • PostgreSQL
  • MS SQLServer
  • MS Access

In order to store and retrieve data that your program will need. This is not to say that using normal "flat" databases (read: text files) is totally outmoded and altogether appropriate, but using an RDMBS is standard and advised when a project of even moderate size is undertaken. If you use an RDMBS, you have the highest likelihood of your program being portable and scalable.

What a developer needs is a mechanism that allows for programmatic access to data held within an RDBMS. In the .NET Framework, we use an updated DNA solution (ADO) called ADO.NET, which stands for ActiveX Data Objects.

Like everything else in the .NET Framework class library, ADO.NET is nothing more than a group of types, all of which reside in the System.Data namespace. These types are used by client applications that need to work with stored data, and they allow access to that data in various useful ways. The most common need is to work with relational data, in Oracle or SQL Server, and is largely what ADO.NET focuses on.  ADO.NET's also adds support for working with XML documents.

Data Providers

Applications utilizing ADO.NET rely on .NET Data providers which enable access to a particular data source. Programs you write, which implement ADO.NET, act at a the client. The client, written as managed code running on the CLR, can use various .NET data providers to access stored data. Each .NET data provider is also written as managed code, but a data provider might also rely on software that doesn't use the .NET Framework to accomplish its task.

The .NET Framework class library includes two .NET data providers:

  • SQL provider: Written completely as managed code, this .NET data provider can be used to access Microsoft's SQL Server DBMS 7.0 or later. Access to earlier versions of SQL Server requires using another provider option.
     
  • OLE DB provider: Traditional ADO applications rely on OLE DB providers for access to data, so several of these providers exist. The OLE DB .NET data provider implements a wrapper around an OLE DB provider, such as those for Oracle or older versions of SQL Server, that lets it be used by .NET Framework applications.

There is also a third option, called the ODBC provider. The Open Database Connectivity (ODBC) API is another of the many data access interfaces Microsoft has provided for Windows developers, and applications that use it rely on ODBC drivers to get at stored data. ODBC is even older than ADO, so many, many ODBC drivers exist. The ODBC .NET data provider provides a wrapper around an ODBC driver, allowing managed code to use it for accessing data. Also, it's worth pointing out that .NET Framework applications can still use the traditional ADO interface via COM interoperability support.

ADO.NET Types and Functionality

Each .NET data provider is implemented as a group of types (from the FCL). The primary types that implement the SQL provider are located in the System.Data.SqlClient namespace, while those that implement the OLE DB provider are in System.Data.OleDb. Regardless of which provider you choose, each one offers an analogous set of classes. The fundamental objects supported by any .NET data provider are as follows:

  • Connection: Allows establishing and releasing connections and can also be used to begin transactions.
     
  • Command: Allows storing and executing a command, such as a SQL query or stored procedure, and specifying parameters for that command.
     
  • DataReader: Provides direct, sequential, read-only access to data in a database.
     
  • DataAdapter: Built on a DataReader, this class creates and populates instances of the class DataSet. As described later in this chapter, DataSets allow more flexible access to data than is possible using just a DataReader.

Each of these classes are actually generic rather than actual class names defined in the .NET Framework class library (DataSet being an exception). The DataReader for the SQL .NET data provider, for example, is implemented by the SqlDataReader class, while that for the OLE DB .NET data provider is implemented by the OleDbDataReader class.

Using these generic objects, a .NET data provider gives clients two options for accessing data. Both approaches use Connections and Commands to interact with a DBMS, but they differ primarily in how a client can work with the result of a query. A client that needs only straightforward one-row-at-a-time access to a query's results can use a DataReader object to read those results. Clients with more complex requirements, such as accessing a query's result in an arbitrary order, filtering that result, combining the results of multiple queries, or sending those results across a network, can use a DataAdapter object to retrieve data wrapped in a DataSet.

Accessing Data Directly

Make The Connection

All actions within the ADO.NET data access framework share some common steps regardless of what approach is used to access data. An ADO.NET client application first relies on a connection to the DBMS. The application can explicitly open a connection by calling the Open method of a Connection object. To open a connection to SQL Server, for example, the client invokes this method on an instance of the SqlConnection class. To open a connection to some other database using the OLE DB .NET data provider, the client invokes the Open method on an instance of the OleDbConnection class. In either case, the client must first set the class's ConnectionString property, indicating which database it's interested in and other information.

Make The Command

Once a connection exists, a client can issue queries and other commands on it using a Command object. An application can create a Command object by invoking a Connection object's CreateCommand method. Once again, different data providers use different Command object classes: The SQL provider uses SqlCommand, while the OLE DB provider uses OleDbCommand. Whichever is used, both allow specifying a SQL query for a Command object by setting the object's CommandText property.

Execute The Command

Once a Command object exists, a client can choose one of several methods this object provides to execute the command it contains. Those methods are as follows:

  • ExecuteReader: Returns a DataReader that can be used to read the results of the query. A DataReader can access the result of a SQL query one row at a time.
     
  • ExecuteScalar: Returns a single value, such as the result from a SQL SUM function. If the result of the query contains more than one value, this method will return the value in the first column of the first row—everything else will be ignored. The value it returns is of the type System.Object, which means that it can contain a result of any type.
     
  • ExecuteNonQuery: Returns no data, but instead sends back the number of rows affected by the query. This method is used with commands that don't return results, such as SQL UPDATEs, INSERTs, and DELETEs.

Free Resources

When your program is finished, your ADO.NET client must invoke either the Close() or the Dispose() method on the Connection object. This is one case where you can't rely on garbage collection to shut down the open connection. Each connection must be explicitly closed.

Direct Access Using DataReader

Direct access to data relies on DataReader objects. DataReaders are fast, and they don't use much memory, because only one row of data at a time is made accessible (although more may be cached). An application using a DataReader can read a query's results only one row at a time, and can move forward only through those results—no random access is allowed.

To read a row, the application calls a DataReader's Read() method, which makes the next row from the result of the executed query accessible. This method returns FALSE when there are no more rows to be read from the result. Once this has been done, the contents of that row can be accessed in various ways. If you know the types of the columns in the result, the values from the current row can be read by calling the appropriate accessor methods provided by the DataReader. For example, if a SQL query asks for a list of all employee names and ages, each row in that query's result will contain a string and an integer. To read these, a client application could use the DataReader's GetString() and GetInt32() methods, respectively. The SqlDataReader and OleDbDataReader classes provide many more Get methods, each capable of reading a particular type of data in a row. It's also possible to access each column of the current row by the column name or position, but using the Get methods is more efficient.

using System.Data.SqlClient;
class DataReaderExample
{
    public static void Main()
    {
        SqlConnection Cn = new SqlConnection(
            "Data Source=localhost;" +
            "Integrated Security=SSPI;" +
            "Initial Catalog=example");
        SqlCommand Cmd = Cn.CreateCommand();
        Cmd.CommandText =
            "SELECT Name, Age FROM Employees";
        Cn.Open();
        SqlDataReader Rdr = Cmd.ExecuteReader();
        while (Rdr.Read())
        {
            System.Console.WriteLine(
                "Name: {0} , Age: {1} ",
                Rdr.GetString(0),
                Rdr.GetInt32(1));
        }
        Rdr.Close();
        Cn.Close();
    }
}

The example above gives a feel for how this whole process works.

Retrieving Data in Sets

A DataSet instance is an in-memory cache for data. Most commonly, that data is the set of rows resulting from one or more SQL queries. Unlike Connection, Command, and DataReader, each of which has a unique class for each .NET data provider, there is only one DataSet class used by all .NET data providers.

DataSets are very general things, so they're useful in many different situations. For instance, if a client application wishes to examine data in an arbitrary way, such as scrolling back and forth through the rows resulting from a SQL query, a a DataSet is the best tool for the job. DataSets are also useful for combining data from different data sources, such as two separate queries, or different types of data, such as the result of a query on a relational DBMS and the contents of an XML document. DataSets are also useful if data needs to be sent across a network, because DataSets are serializable. A DataSet can also be passed around and used independently—it's just a managed object with state and methods like any other—but it cannot itself maintain an open connection to any underlying database.

Each DataSet can contain zero or more DataTable objects, which are instances of a class defined in System.Data. Each DataTable can contain the result of some query or perhaps something else. A DataSet can also maintain relationships among DataTables using DataRelation objects. For example, a column in one table might contain a foreign key for another table, a relationship that could be modeled by a DataRelation. These relationships can also be used to navigate through the contents of a DataSet. A simple DataSet, one that contains the result of just a single SQL query, might have only one DataTable and no DataRelations. DataSets can be quite complex, however, so an application that needed to maintain the results of several queries in memory could stuff them all into one DataSet, each in its own DataTable.

Data Access with DataSets

DataSet objects can be created using a .NET data provider's DataAdapter object. The SQL .NET data provider uses the class SqlDataAdapter, while the OLE DB .NET data provider uses OleDbDataAdapter. Both classes provide a similar set of properties and methods for creating and working with DataSets. Some important properties of a DataAdapter are:

SelectCommand: Contains a Command object that can be used to populate a DataTable within some DataSet with the results of a query on some database. The Command typically references a SQL SELECT statement.

InsertCommand: Contains a Command object used to insert rows added to a DataTable into an underlying database. The Command typically references a SQL INSERT statement.

UpdateCommand: Contains a Command object used to update a database based on changes made to a DataTable. The Command typically references a SQL UPDATE statement.

DeleteCommand: Contains a Command object used to delete rows in a database based on deletions made to a DataTable. The Command typically references a SQL DELETE statement.

The contents of these properties, which can be explicit SQL statements or stored procedures, are accessed by various methods provided by the DataAdapter. The most important of these methods is Fill(). The Fill() method executes the command in the DataAdapter's SelectCommand property and then places the results in a DataTable object inside whatever DataSet is passed as a parameter on the Fill() call. The connection to the database can be closed once the desired results are returned, since a DataSet object is completely independent of any connection to any data source.

This image generally describes the steps needed to populate a dataset.  This can also be seen in the following code snippet:

using System.Data;
using System.Data.SqlClient;

class DataSetExample
{
    public static void Main()
    {
        SqlConnection Cn = new SqlConnection(
            "Data Source=localhost;" +
            "Integrated Security=SSPI;" +
            "Initial Catalog=example");
        SqlCommand Cmd = Cn.CreateCommand();
        Cmd.CommandText =
            "SELECT Name, Age FROM Employees";
        SqlDataAdapter Da = new SqlDataAdapter();
        Da.SelectCommand = Cmd;
        DataSet Ds = new DataSet();
        Cn.Open();
        Da.Fill(Ds, "NamesAndAges");
        Cn.Close();
    }
}

The DataSet created the preceding fictitious example contains just one DataTable. It is also possible to populate a DataSet from other databases and data sources. To add other DataTables with different contents to this same DataSet, the Fill method could be called again once the SelectCommand property of the DataAdapter had been changed to contain a different query. Alternatively, another DataAdapter could be used on a different data source.

Updating Via DataSet

It's also possible to update/modify information in a DBMS using a DataSet. Once one or more DataTables have been created in the DataSet, the information in those tables can be changed by the client. DataAdapter's Update method will cause the DataAdapter to examine the changes in any DataSet or DataTable passed into this call and then modify the underlying database to reflect those changes. These modifications are made using the commands stored in the DataAdapter's InsertCommand, UpdateCommand, and DeleteCommand properties described earlier. Because these commands can be stored procedures as well as dynamically executed statements, these update operations can be reasonably efficient.

What if the database changes while you are working?

There is no way for the DataSet itself to maintain a lock on the underlying data from which its contents were derived. As a result, it's possible that the data in the database has changed since some of its information was copied into the DataSet. Whether updates succeed if the underlying data has changed depends on what commands are placed into the InsertCommand, UpdateCommand, and DeleteCommand properties of the DataAdapter. If Inserts, Updates and Deletes are done automatically using a CommandBuilder object, then automatically generated update commands will fail if any of the affected data has changed since it was originally read.

Modifying the DataSet

The data within a DataSet is grouped into one or more DataTables such that your program can access and modify this data. Applications need to work with data in diverse ways, so DataTables provide several options for accessing and modifying the data they contain. Two classes are commonly used to work with information in a DataTable:

  • DataRow: Represents a row in a DataTable. Each DataTable has a Rows property that allows access to a collection containing all of that table's DataRows. The data in a DataTable can be accessed by examining its DataRows. In fact, perhaps the most common way to access a DataTable's information is by directly accessing the Rows property as a two-dimensional array.
     
  • DataColumn: Represents a column in a DataTable. Each DataTable has a Collections property that allows access to a collection containing all of that table's DataColumns. Rather than defining the table's data, however, this collection defines what columns the table has and what type each column is. In other words, this collection defines the table's schema.

Sub Select

Each DataTable provides a Select() method which allows for selections within subsets. This method allows your application to describe the data it's interested in and then have that data returned in an array of DataRow objects. Select() takes up to three parameters: a filter, a sort order, and a row state. The filter allows specifying a variety of criteria that selected rows must meet, such as "Name='Smith'" or "Age > 45." The sort order allows specifying which column the results should be sorted by and whether those results should be sorted in ascending or descending order. Finally, the row state parameter allows returning only records in a specific state, such as those that have been added or those that have not been changed. Whatever criteria are used on the call to Select(), the result is an array of DataRows. The application can then access individual elements of this array as needed to work with the data each one contains.

Add New Records

To add data to a DataTable, an application can create a new DataRow object, populate it, and insert it into the table. To create the new DataRow, the application can call a DataTable's NewRow() method. Once this DataRow exists, the application can assign values to the fields in the row, then add the row to the table by calling the Add method provided by the DataTable's Rows collection. It's also possible to modify the contents of a DataTable directly by assigning new values to the contents of its DataRows and to delete rows using the Remove() method of the Rows collection.

Tracking Changes

The state of a DataRow changes as that DataRow is modified. The current state is kept in a property of the DataRow called RowState, and several possibilities are defined. Changes made to a DataRow can be made permanent by calling the DataRow.AcceptChanges() method for that row, which sets the DataRow's RowState property to Unchanged. To accept all changes made to all DataRows in a DataTable at once, an application can call DataTable.AcceptChanges(), while to accept all changes made to all DataRows in all DataTables in a DataSet at once, an application can call DataSet.AcceptChanges(). It's also possible to call the RejectChanges() method at each of these levels to roll back any modifications that have been made.