Peter Bromberg posts some good rants on his unblog. His recent entry “Is Your Development Process Broken?” was timely.
Several months ago, a friend of a friend called me about a web app with a performance problem. I was going to turn the gig down because the app was written in C++ as an ISAPI extension. I haven’t touched either C++ or ISAPI for years and have no desire to relive those days*.
As it turns out, the performance problem was relatively easy to solve. SQL Profiler revealed a single page that could generate over 500,000 roundtrips to the database with a single button click. The code was issuing database commands inside of a nested loop.
It only took three months to get an update into production.
Three months?
Yeah. Three months.
The application was developed by an “IT Solutions Provider” with no source control. It was difficult to get the application to build. It was difficult to get a build into test that wasn’t broken. Nobody knew what they were testing, or what was changing. Total chaos. Just getting the process into shape was a major ordeal.
It is unthinkable that the company responsible for the mess markets the services of ‘experienced software professionals’. For professionals to bill a customer for this kind of service is nothing short of malpractice.
* It's worth noting that a search for “ISAPI” on Microsoft.com yields more security bulletins than articles on development. C++ is a double-edged knife with a pointy tip.
To prevent two pages from modifying in-process Session variables at the same time, the ASP.NET runtime uses a lock. When a request arrives for a page that reads and writes Session variables, the runtime acquires a writer lock. The writer lock blocks other pages in the same Session that might write to the same Session variables.
It’s easy to see the runtime implications of the lock when using two pages in a frameset. Here is a page that executes quickly:
Here is a page that executes slowly…
We can take these two pages and put them both inside of a frameset so they appear in the same window.
If we click the button on the slow page, then quickly click the button on the quick page, we’ll see the quick page doesn’t finish processing until the slow page ends. The slow page has to release a lock before the quick page can begin processing.
If this behavior causes a problem, one solution is to use the EnableSessionState attribute in the @ Page directive and tell the runtime how the page intends to use Session variables. If the page doesn’t need Session state, set EnableSessionState="false" and avoid locking altogether. If the page only reads Session variables, use EnableSessionState="ReadOnly" (although note that the runtime doesn’t throw an exception if the page actually does write to a Session variable; the write seems to happen just fine).
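The blocking behavior can be approximated outside of ASP.NET. The following is a minimal sketch, not the runtime's actual implementation: it assumes a ReaderWriterLockSlim as a stand-in for the per-session lock, and the names SessionLockDemo and Run are hypothetical. With readOnly set to false both "requests" take the writer lock, as pages with full Session access would; with readOnly set to true both take a reader lock, roughly as pages marked EnableSessionState="ReadOnly" would.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the per-session lock; not the ASP.NET implementation.
public static class SessionLockDemo
{
    // Runs a slow and a quick "request" against one shared session lock
    // and returns the names of the requests in the order they finished.
    public static List<string> Run(bool readOnly)
    {
        var sessionLock = new ReaderWriterLockSlim();
        var finished = new List<string>();
        var slowHasLock = new ManualResetEventSlim(false);

        void Enter() { if (readOnly) sessionLock.EnterReadLock(); else sessionLock.EnterWriteLock(); }
        void Exit() { if (readOnly) sessionLock.ExitReadLock(); else sessionLock.ExitWriteLock(); }

        var slow = new Thread(() =>
        {
            Enter();
            try
            {
                slowHasLock.Set();      // let the quick request start
                Thread.Sleep(500);      // simulated long-running page
                lock (finished) finished.Add("slow");
            }
            finally { Exit(); }
        });

        var quick = new Thread(() =>
        {
            slowHasLock.Wait();         // arrive after the slow request holds the lock
            Enter();
            try { lock (finished) finished.Add("quick"); }
            finally { Exit(); }
        });

        slow.Start();
        quick.Start();
        slow.Join();
        quick.Join();
        return finished;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(readOnly: false)));
        Console.WriteLine(string.Join(", ", Run(readOnly: true)));
    }
}
```

With writer locks the quick request cannot finish until the slow one releases the lock, so the first line reads "slow, quick". With reader locks the two requests overlap, and in practice the quick request finishes first.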
I love the little challenges Ayende Rahien puts on his blog. The last SQL challenge was to convert a table like this:
FromDate | ToDate |
01/01/2000 | 01/31/2000 |
02/01/2000 | 02/28/2000 |
03/05/2000 | 03/31/2000 |
Into this:
FromDate | ToDate |
01/01/2000 | 02/28/2000 |
03/05/2000 | 03/31/2000 |
Where adjacent date ranges collapse to a single record. The first step was to set up some experimental tables.
I have to think about SQL in small chunks, so I first thought about how to get a resultset with a FromDate minus one day. That would give me a column to compare against another record's ToDate:
Typically these kinds of solutions involve a self join, so I added the above into the following query:
Which generates the following resultset:
FromDate | ToDate | FromDate | ToDate |
2000-01-01 | 2000-01-31 | 2000-01-31 | 2000-02-28 |
2000-02-01 | 2000-02-28 | NULL | NULL |
2000-03-05 | 2000-03-31 | NULL | NULL |
Looking at that resultset, it was easy to see that coalescing the ToDate into the NULL value left behind by a LEFT JOIN and using a GROUP BY would process the sample data.
Then Ayende had to up the ante by updating the post with a harder challenge. This one has me scratching my head. I ended up with the above SELECT query inside a SQL Server 2005 common table expression. I’m sure someone will come along and tell me the following query is horribly inefficient, and terribly wrong:
The query plan is an ugly beast (but not the ugliest). I think I need to step away from this problem. There has to be a more elegant solution.
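For comparison, here is how the same collapse might look outside of SQL. This is a minimal sketch and not the query discussed above: the DateRangeMerger class and its Collapse method are hypothetical names, and the sketch assumes that overlapping ranges should merge as well as adjacent ones, which the challenge as described here does not state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; a procedural sketch of the problem, not the SQL solution.
public static class DateRangeMerger
{
    public static List<(DateTime From, DateTime To)> Collapse(
        IEnumerable<(DateTime From, DateTime To)> ranges)
    {
        var result = new List<(DateTime From, DateTime To)>();

        foreach (var range in ranges.OrderBy(r => r.From))
        {
            int lastIndex = result.Count - 1;

            // A range starting no later than the day after the previous range
            // ends is adjacent (or overlapping), so extend the previous range.
            if (lastIndex >= 0 && range.From <= result[lastIndex].To.AddDays(1))
            {
                var last = result[lastIndex];
                result[lastIndex] = (last.From, range.To > last.To ? range.To : last.To);
            }
            else
            {
                result.Add(range);
            }
        }

        return result;
    }

    public static void Main()
    {
        var sample = new List<(DateTime From, DateTime To)>
        {
            (new DateTime(2000, 1, 1), new DateTime(2000, 1, 31)),
            (new DateTime(2000, 2, 1), new DateTime(2000, 2, 28)),
            (new DateTime(2000, 3, 5), new DateTime(2000, 3, 31)),
        };

        foreach (var r in Collapse(sample))
        {
            Console.WriteLine($"{r.From:yyyy-MM-dd} | {r.To:yyyy-MM-dd}");
        }
    }
}
```

Because each range is compared against the running result rather than against a single neighbor, a chain of three or more adjacent ranges also collapses into one record in a single pass over the sorted input.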
Greek philosophers Leucippos and Democritus were among the first to postulate that all matter is composed of indivisible units they called atomos. Thus arose atomistic philosophy and the concept of an atom. (Note that the Hindu Vaisesika Sutra appears to pre-date the Greek thinking by 100 years or more). Curious humans continued to tinker over the next 2300 years until they got inside the atom. Nuclear technology is responsible for devices as useful as the x-ray machine, and devices as destructive as the television and atom bomb.
Somewhere along the line, computer science adopted the term “atomic operation” to describe an instruction that is indivisible and uninterruptible by other threads of execution. It appears the now 40-year-old IBM System/360 was the first architecture to include an atomic operation (TS – test and set).
Let’s look at an atomic operation from a different perspective: other threads and CPUs cannot observe an atomic operation “in progress”. If thread A is writing a 32-bit value to memory as an atomic operation, thread B will never be able to read the memory location and see only the first 16 of 32 bits written out. Thread B can only read the value that exists before the atomic operation began, or the value that exists sometime after the atomic operation completes, but can never read the value only partially written.
Why is it important to know which operations are atomic? If you ever come across code like the following, you’ll know there is a problem.
If multiple threads call IncrementCounter concurrently, we might not get an accurate counter reading. The ++ operator isn't atomic: it reads the value, increments it, then writes the updated value back. Another thread can step in between those operations, and we will miss a hit if both threads read the same value and then overwrite each other's results.
What is an atomic operation in the CLR? Partition I of the CLI specification states that:
“A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size (the size of type native int) is atomic…”
Section 12.5 of the C# specification has the specifics for semicolon fans:
“Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types.” Also: “…there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.”
To make the increment operation atomic, use Interlocked.Increment, or a lock.
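A minimal sketch of the difference, assuming a hypothetical HitCounter class (the names and the Parallel.For driver are illustrative, not taken from any real codebase). Both counters are incremented the same number of times; only the one updated through Interlocked.Increment is guaranteed to end up with the full count.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical counter used only to contrast ++ with Interlocked.Increment.
public static class HitCounter
{
    private static int _unsafeCount;
    private static int _safeCount;

    // Racy: ++ is a read, an add, and a write, and threads can interleave between them.
    public static void IncrementUnsafe() => _unsafeCount++;

    // Atomic read-modify-write.
    public static void IncrementSafe() => Interlocked.Increment(ref _safeCount);

    // Increments both counters workers * hitsPerWorker times and returns the totals.
    public static (int unsafeCount, int safeCount) Run(int workers, int hitsPerWorker)
    {
        _unsafeCount = 0;
        _safeCount = 0;

        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < hitsPerWorker; i++)
            {
                IncrementUnsafe();
                IncrementSafe();
            }
        });

        return (_unsafeCount, _safeCount);
    }

    public static void Main()
    {
        var totals = Run(workers: 4, hitsPerWorker: 100000);
        Console.WriteLine($"unsafe: {totals.unsafeCount}, safe: {totals.safeCount}");
    }
}
```

The safe total is always workers times hitsPerWorker. The unsafe total can come up short by however many updates were lost, and it may vary from run to run. Wrapping the ++ in a lock statement would also work, at the cost of taking a monitor on every hit.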
… even the code that doesn’t work.
I use SourceGear Vault for personal work and experiments. Vault is free for a single user (see the last question in the Vault FAQ).
When an experiment doesn’t work out I have a tendency to delete and purge the project from source control. I don’t want my repository cluttered with broken junk. Invariably, 3 months later I’ll run into a scenario I’ve tried before, but I don’t have the code because my first attempt didn’t work. I’m always wishing I had the old code to prove that an approach doesn’t work. Even worse is when I’ve learned something new that might have gotten the old code working.
Now I’m a code packrat. I stuff broken projects into basement shoeboxes hoping the projects will be worth something someday.
In a nutshell:
“F# is a programming language that provides the much sought-after combination of type safety, performance and scripting, with all the advantages of running on a high-quality, well-supported modern runtime system.”
The well-supported modern runtime would be the CLR.
Congrats to optionsScalper and gang for a successful launch of The Hub - a community site and THE place for F#.