Programming C# 4.0, Part 8
for as long as it needs to do a particular job—it has to be an illusion because if clients
really took it in turns, scalability would be severely limited. So transactions perform
the neat trick of letting work proceed in parallel except for when that would cause a
problem—as long as all the transactions currently in progress are working on independent data they can all proceed simultaneously, and clients have to wait their turn
only if they’re trying to use data already involved (directly, or indirectly) in some other
transaction in progress.‖
The classic example of the kind of problem transactions are designed to avoid is that
of updating the balance of a bank account. Consider what needs to happen to your
account when you withdraw money from an ATM—the bank will want to make sure
that your account is debited with the amount of money withdrawn. This will involve
subtracting that amount from the current balance, so there will be at least two operations: discovering the current balance, and then updating it to the new value. (Actually
it’ll be a whole lot more complex than that—there will be withdrawal limit checks,
fraud detection, audit trails, and more. But the simplified example is enough to illustrate
how transactions can be useful.) But what happens if some other transaction occurs at
the same time? Maybe you happen to be making a withdrawal at the same time as the
bank processes an electronic transfer of funds.
If that happens, a problem can arise. Suppose the ATM transaction and the electronic
transfer both read the current balance—perhaps they both discover a balance of $1,234.
Next, if the transfer is moving $1,000 from your account to somewhere else, it will write
back a new balance of $234—the original balance minus the amount just deducted.
But then there’s the ATM transaction—suppose you withdraw $200. It will write back a new
balance of $1,034. You just withdrew $200 and paid $1,000 to another account, but
your account only has $200 less in it than before rather than $1,200—that’s great for
you, but your bank will be less happy. (In fact, your bank probably has all sorts of
checks and balances to try to minimize opportunities such as this for money to magically come into existence. So they’d probably notice such an error even if they weren’t
using transactions.) In fact, neither you nor your bank really wants this to happen, not
least because it’s easy enough to imagine similar examples where you lose money.
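The interleaving just described can be sketched in a few lines of C#, with a plain variable standing in for the account balance—no database involved, purely to illustrate the lost update:

```csharp
using System;

class LostUpdateDemo
{
    static void Main()
    {
        // Two "clients" read the same starting balance...
        decimal balance = 1234m;
        decimal transferView = balance;  // electronic transfer sees $1,234
        decimal atmView = balance;       // ATM sees $1,234 as well

        // ...and each writes back its own computed result, the second
        // write silently overwriting the first.
        balance = transferView - 1000m;  // transfer writes back $234
        balance = atmView - 200m;        // ATM writes back $1,034

        // $1,200 left the account, but the balance only dropped by $200.
        Console.WriteLine(balance);      // 1034
    }
}
```

A transaction around each read-modify-write sequence would force the second client either to wait for the first client's update or to see it before computing the new balance.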
This problem of concurrent changes to shared data crops up in all sorts of forms. You
don’t even need to be modifying data to observe a problem: code that only ever reads
can still see weird results. For example, you might want to count your money, in which
case looking at the balances of all your accounts would be necessary—that’s a read-only operation. But what if some other code was in the middle of transferring money
between two of your accounts? Your read-only code could be messed up by other code
modifying the data.
‖ In fact, it gets a good deal cleverer than that. Databases go to some lengths to avoid making clients wait for
one another unless it’s absolutely necessary, and can sometimes manage this even when clients are accessing
the same data, particularly if they’re only reading the common data. Not all databases do this in the same
way, so consult your database documentation for further details.
Object Context | 577
A simple way to avoid this is to do one thing at a time—as long as each task completes
before the next begins, you’ll never see this sort of problem. But that turns out to be
impractical if you’re dealing with a large volume of work. And that’s why we have
transactions—they are designed to make it look like things are happening one task at
a time, but under the covers they allow tasks to proceed concurrently as long as they’re
working on unrelated information. So with transactions, the fact that some other bank
customer is in the process of performing a funds transfer will not stop you from using
an ATM. But if a transfer is taking place on one of your accounts at the same time that
you are trying to withdraw money, transactions would ensure that these two operations
take it in turns.
So code that uses transactions effectively gets exclusive access to whatever data it is
working with right now, without slowing down anything it’s not using. This means
you get the best of both worlds: you can write code as though it’s the only code running
right now, but you get good throughput.
How do we exploit transactions in C#? Example 14-20 shows the simplest approach:
if you create a TransactionScope object, the EF will automatically enlist any database
operations in the same transaction. The TransactionScope class is defined in the
System.Transactions namespace in the System.Transactions DLL (another class library
DLL for which we need to add a reference, as it’s not in the default set).
Example 14-20. TransactionScope
using (var dbContext = new AdventureWorksLT2008Entities())
{
    using (var txScope = new TransactionScope())
    {
        var customersWithOrders = from cust in dbContext.Customers
                                  where cust.SalesOrderHeaders.Count > 0
                                  select cust;

        foreach (var customer in customersWithOrders)
        {
            Console.WriteLine("Customer {0} has {1} orders",
                customer.CustomerID, customer.SalesOrderHeaders.Count);
        }
        txScope.Complete();
    }
}
For as long as the TransactionScope is active (i.e., until it is disposed at the end of the
using block), all the requests to the database this code makes will be part of the same
transaction, and so the results should be consistent—any other database client that
tries to modify the state we’re looking at will be made to wait (or we’ll be made to wait
for them) in order to guarantee consistency. The call to Complete at the end indicates
that we have finished all the work in the transaction, and are happy for it to commit—
without this, the transaction would be aborted at the end of the scope’s using block.
578 | Chapter 14: Databases
For a transaction that modifies data, failure to call Complete will lose any changes. Since
the transaction in Example 14-20 only reads data, this might not cause any visible
problems, but it’s difficult to be certain. If a TransactionScope was already active on
this thread (e.g., a function farther up the call stack started one) our TransactionScope could join in with the same transaction, at which point failure to call Complete
on our scope would end up aborting the whole thing, possibly losing data. The documentation recommends calling Complete for all transactions except those you want to
abort, so it’s a good practice always to call it.
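For contrast with the read-only Example 14-20, here is a sketch of a transaction that modifies data. It assumes the same AdventureWorksLT2008Entities model, and that the Customer entity has a ModifiedDate property (as the AdventureWorksLT schema does):

```csharp
using (var dbContext = new AdventureWorksLT2008Entities())
using (var txScope = new TransactionScope())
{
    var customer = dbContext.Customers.First();
    customer.ModifiedDate = DateTime.Now;

    // The update is provisional until the transaction commits.
    dbContext.SaveChanges();

    // Without this call, disposing the scope rolls the update back.
    txScope.Complete();
}
```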
Transaction Length
When transactions conflict because multiple clients want to use the same data, the
database may have no choice but to make one or more of the clients wait. This means
you should keep your transaction lifetimes as short as you possibly can—slow transactions can bog down the system. And once that starts happening, it becomes a bit of
a pile-up—the more transactions that are stuck waiting for something else to finish,
the more likely it is that new transactions will want to use data that’s already under
contention. The rosy “best of both worlds” picture painted earlier evaporates.
Worse, conflicts are sometimes irreconcilable—a database doesn’t know at the start of
a transaction what information will be used, and sometimes it can find itself in a place
where it cannot proceed without returning results that will look inconsistent, in which
case it’ll just fail with an error. (In other words, the clever tricks databases use to minimize how often transactions block sometimes backfire.) It’s easy enough to contrive
pathological code that does this on purpose, but you hope not to see it in a live system.
The shorter you make your transactions the less likely you are to see troublesome
conflicts.
You should never start a transaction and then wait for user input before finishing the
transaction—users have a habit of going to lunch mid-transaction. Transaction duration should be measured in milliseconds, not minutes.
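One way to put a hard ceiling on transaction duration (a sketch using the standard System.Transactions types) is to pass a TransactionOptions with a Timeout when creating the scope—if the work takes too long, the transaction aborts rather than blocking other clients indefinitely:

```csharp
var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,
    Timeout = TimeSpan.FromSeconds(5)
};

using (var txScope = new TransactionScope(
    TransactionScopeOption.Required, options))
{
    // ...do the database work quickly...
    txScope.Complete();
}
```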
TransactionScope represents an implicit transaction—any data access performed inside
its using block will automatically be enlisted on the transaction. That’s why Example 14-20 never appears to use the TransactionScope it creates—it’s enough for it to
exist. (The transaction system keeps track of which threads have active implicit transactions.) You can also work with transactions explicitly—the object context provides
a Connection property, which in turn offers explicit BeginTransaction and EnlistTransaction methods. You can use these in advanced scenarios where you might need to
control database-specific aspects of the transaction that an implicit transaction cannot
reach.
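A minimal sketch of the explicit style, assuming the same model as Example 14-20 (note that the connection must be opened before BeginTransaction is called):

```csharp
using (var dbContext = new AdventureWorksLT2008Entities())
{
    dbContext.Connection.Open();
    using (var tx = dbContext.Connection.BeginTransaction())
    {
        // ...queries and updates here are enlisted in tx...
        dbContext.SaveChanges();
        tx.Commit();   // to roll back instead, dispose without Commit
    }
}
```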
These transaction models are not specific to the EF. You can use the
same techniques with ADO.NET v1-style data access code.
Besides enabling isolation of multiple concurrent operations, transactions provide another very useful property: atomicity. This means that the operations within a single
transaction succeed or fail as one: all succeed, or none of them succeed—a transaction
is indivisible in that it cannot complete partially. The database stores updates performed within a transaction provisionally until the transaction completes—if it succeeds, the updates are permanently committed, but if it fails, they are rolled back and
it’s as though the updates never occurred. The EF uses transactions automatically when
you call SaveChanges—if you have not supplied a transaction, it will create one just to
write the updates. (If you have supplied one, it’ll just use yours.) This means that
SaveChanges will always either succeed completely, or have no effect at all, whether or
not you provide a transaction.
Transactions are not the only way to solve problems of concurrent access to shared
data. They are bad at handling long-running operations. For example, consider a system
for booking seats on a plane or in a theater. End users want to see what seats are
available, and will then take some time—minutes probably—to decide what to do. It
would be a terrible idea to use a transaction to handle this sort of scenario, because
you’d effectively have to lock out all other users looking to book into the same flight
or show until the current user makes a decision. (It would have this effect because in
order to show available seats, the transaction would have had to inspect the state of
every seat, and could potentially change the state of any one of those seats. So all those
seats are, in effect, owned by that transaction until it’s done.)
Let’s just think that through. What if every person who flies on a particular flight takes
two minutes to make all the necessary decisions to complete his booking? (Hours of
queuing in airports and observing fellow passengers lead us to suspect that this is a
hopelessly optimistic estimate. If you know of an airline whose passengers are that
competent, please let us know—we’d like to spend less time queuing.) The Airbus A380
aircraft has FAA and EASA approval to carry 853 passengers, which suggests that even
with our uncommonly decisive passengers, that’s still a total of more than 28 hours of
decision making for each flight. That sounds like it could be a problem for a daily
flight.# So there’s no practical way of avoiding having to tell the odd passenger that,
sorry, in between showing him the seat map and choosing the seat, someone else got
in there first. In other words, we are going to have to accept that sometimes data will
#And yes, bookings for daily scheduled flights are filled up gradually over the course of a few months, so 28
hours per day is not necessarily a showstopper. Even so, forcing passengers to wait until nobody else is
choosing a seat would be problematic—you’d almost certainly find that your customers didn’t neatly space
out their usage of the system, and so you’d get times where people wanting to book would be unable to.
Airlines would almost certainly lose business the moment they told customers to come back later.
change under our feet, and that we just have to deal with it when it happens. This
requires a slightly different approach than transactions.
Optimistic Concurrency
Optimistic concurrency describes an approach to concurrency where instead of enforcing isolation, which is how transactions usually work, we just make the cheerful
assumption that nothing’s going to go wrong. And then, crucially, we verify that assumption just before making any changes.
In practice, it’s common to use a mixture of optimistic concurrency and
transactions. You might use optimistic approaches to handle long-running logic, while using short-lived transactions to manage each individual step of the process.
For example, an airline booking system that shows a map of available seats in an aircraft
on a web page would make the optimistic assumption that the seat the user selects will
probably not be selected by any other user in between the moment at which the application showed the available seats and the point at which the user picks a seat. The
advantage of making this assumption is that there’s no need for the system to lock
anyone else out—any number of users can all be looking at the seat map at once, and
they can all take as long as they like.
Occasionally, multiple users will pick the same seat at around the same time. Most of
the time this won’t happen, but the occasional clash is inevitable. We just have to make
sure we notice. So when the user gets back to us and says that he wants seat 7K, the
application then has to go back to the database to see if that seat is in fact still free. If
it is, the application’s optimism has been vindicated, and the booking can proceed. If
not, we just have to apologize to the user (or chastise him for his slowness, depending
on the prevailing attitude to customer service in your organization), show him an updated seat map so that he can see which seats have been claimed while he was dithering,
and ask him to make a new choice. This will happen only a small fraction of the time,
and so it turns out to be a reasonable solution to the problem—certainly better than a
system that is incapable of taking enough bookings to fill the plane in the time available.
Sometimes optimistic concurrency is implemented in an application-specific way. The
example just described relies on an understanding of what the various entities involved
mean, and would require us to write code that explicitly performs the check described.
But slightly more general solutions are available—they are typically less efficient, but
they can require less code. The EF offers some of these ignorant-but-effective approaches to optimistic concurrency.
The default EF behavior seems, at a first glance, to be ignorant and broken—not only
does it optimistically assume that nothing will go wrong, but it doesn’t even do anything
to check that assumption. We might call this blind optimism—we don’t even get to
discover when our optimism turned out to be unfounded. While that sounds bad, it’s
actually the right thing to do if you’re using transactions—transactions enforce isolation and so additional checks would be a waste of time. But if you’re not using transactions, this default behavior is not good enough for code that wants to change or add
data—you’ll risk compromising the integrity of your application’s state.
To get the EF to check that updates are likely to be sound, you can tell it to check that
certain entity properties have not changed since the entity was populated from the
database. For example, in the SalesOrderDetail entity, if you select the ModifiedDate
property in the EDM designer, you could go to the Properties panel and set its Concurrency Mode to Fixed (its default being None). This will cause the EF to check, whenever you update the entity, that this particular column’s value is still the same as it was when the entity was fetched. And as long as all the code that modifies this particular table remembers to update the ModifiedDate, you’ll be able to detect when things have changed.
While this example illustrates the concept, it’s not entirely robust. Using
a date and time to track when a row changes has a couple of problems.
First, different computers in the system are likely to have slight differences between their clocks, which can lead to anomalies. And even if
only one computer ever accesses the database, its clock may be adjusted
from time to time. You’d end up wanting to customize the SQL code
used for updates so that everything uses the database server’s clock for
consistency. Such customizations are possible, but they are beyond the
scope of this book. And even that might not be enough—if the row is
updated often, it’s possible that two updates might have the same timestamp due to insufficient precision. A stricter approach based on GUIDs
or sequential row version numbers is more robust. But this is the realm
of database design, rather than Entity Framework usage—ultimately
you’re going to be stuck with whatever your DBA gives you.
If any of the columns with a Concurrency Mode of Fixed change between reading an
entity’s value and attempting to update it, the EF will detect this when you call
SaveChanges and will throw an OptimisticConcurrencyException, instead of completing
the update.
The EF detects changes by making the SQL UPDATE conditional—its
WHERE clause will include checks for all of the Fixed columns. It inspects
the updated row count that comes back from the database to see
whether the update succeeded.
How you deal with an optimistic concurrency failure is up to your application—you
might simply be able to retry the work, or you may have to get the user involved. It will
depend on the nature of the data you’re trying to update.
The object context provides a Refresh method that you can call to bring entities back
into sync with the current state of the rows they represent in the database. You could
call this after catching an OptimisticConcurrencyException as the first step in your code
that recovers from a problem. (You’re not actually required to wait until you get a
concurrency exception—you’re free to call Refresh at any time.) The first argument to
Refresh tells it what you’d like to happen if the database and entity are out of sync.
Passing RefreshMode.StoreWins tells the EF that you want the entity to reflect what’s
currently in the database, even if that means discarding updates previously made in
memory to the entity. Or you can pass RefreshMode.ClientWins, in which case any
changes in the entity remain present in memory. The changes will not be written back
to the database until you next call SaveChanges. So the significance of calling Refresh
in ClientWins mode is that you have, in effect, acknowledged changes to the underlying
database—if changes in the database were previously causing SaveChanges to throw an
OptimisticConcurrencyException, calling SaveChanges again after the Refresh will not
throw again (unless the database changes again in between the call to Refresh and the
second SaveChanges).
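Putting these pieces together, a sketch of recovering from a concurrency failure might look like this (here entity stands for whichever entity you were updating; OptimisticConcurrencyException lives in the System.Data namespace):

```csharp
try
{
    dbContext.SaveChanges();
}
catch (OptimisticConcurrencyException)
{
    // ClientWins: keep our in-memory changes, but acknowledge the
    // database's intervening update so the next save can proceed.
    dbContext.Refresh(RefreshMode.ClientWins, entity);
    dbContext.SaveChanges();

    // RefreshMode.StoreWins would instead discard our changes and
    // reload the entity with the database's current values.
}
```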
Context and Entity Lifetime
If you ask the context object for the same entity twice, it will return you the same object
both times—it remembers the identity of the entities it has returned. Even if you use
different queries, it will not attempt to load fresh data for any entities already loaded
unless you explicitly pass them to the Refresh method.
Executing the same LINQ query multiple times against the same context
will still result in multiple queries being sent to the database. Those
queries will typically return all the current data for the relevant entity.
But the EF will look at primary keys in the query results, and if they
correspond to entities it has already loaded, it just returns those existing
entities and won’t notice if their values in the database have changed.
It looks for changes only when you call either SaveChanges or Refresh.
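The identity-map behavior this note describes is easy to observe. This sketch assumes a customer with a CustomerID of 1 exists in the sample database:

```csharp
using (var dbContext = new AdventureWorksLT2008Entities())
{
    var first = dbContext.Customers
        .Where(c => c.CustomerID == 1).First();
    var again = dbContext.Customers
        .Where(c => c.CustomerID == 1).First();

    // Both queries went to the database, but the context hands back
    // the same object instance for the same primary key.
    Console.WriteLine(object.ReferenceEquals(first, again));  // True
}
```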
This raises the question of how long you should keep an object context around. The
more entities you ask it for, the more objects it’ll hang on to. Even when your code has
finished using a particular entity object, the .NET Framework’s garbage collector won’t
be able to reclaim the memory it uses for as long as the object context remains alive,
because the object context keeps hold of the entity in case it needs to return it again in
a later query.
The way to get the object context to let go of everything is to call
Dispose. This is why all of the examples that show the creation of an
object context do so in a using statement.
There are other lifetime issues to bear in mind. In some situations, an object context
may hold database connections open. And also, if you have a long-lived object context,
you may need to add calls to Refresh to ensure that you have fresh data, which you
wouldn’t have to do with a newly created object context. So all the signs suggest that
you don’t want to keep the object context around for too long.
How long is too long? In a web application, if you create an object context while handling a request (e.g., for a particular page) you would normally want to Dispose it before
the end of that request—keeping an object context alive across multiple requests is
typically a bad idea. In a Windows application (WPF or Windows Forms), it might
make sense to keep an object context alive a little longer, because you might want to
keep entities around while a form for editing the data in them is open. (If you want to
apply updates, you normally use the same object context you used when fetching the
entities in the first place, although it’s possible to detach an entity from one context
and attach it later to a different one.) In general, though, a good rule of thumb is to
keep the object context alive for no longer than is necessary.
WCF Data Services
The last data access feature we’ll look at is slightly different from the rest. So far, we’ve
seen how to write code that uses data in a program that can connect directly to a
database. But WCF Data Services lets you present data over HTTP, making data access
possible from code in some scenarios where direct connections are not possible. It
defines a URI structure for identifying the data you’d like to access, and the data itself
can be represented in either JSON or the XML-based Atom Publishing Protocol
(AtomPub).
As the use of URIs, JSON, and XML suggests, WCF Data Services can be useful in web
applications. Silverlight cannot access databases directly, but it can consume data via
WCF Data Services. And the JSON support means that it’s also relatively straightforward for script-based web user interfaces to use.
WCF Data Services is designed to work in conjunction with the Entity Framework.
You don’t just present an entire database over HTTP—that would be a security liability.
Instead, you define an Entity Data Model, and you can then configure which entity
types should be accessible over HTTP, and whether they are read-only or support other
operations such as updates, inserts, or deletes. And you can add code to implement
further restrictions based on authentication and whatever security policy you require.
(Of course, this still gives you plenty of scope for creating a security liability. You need
to think carefully about exactly what information you want to expose.)
To show WCF Data Services in action, we’ll need a web application, because it’s an
HTTP-based technology. If you create a new project in Visual Studio, you’ll see a Visual
C#→Web category on the left, and the Empty ASP.NET Web Application template will
suit our needs here. We need an Entity Data Model to define what information we’d