Spectrum Online—Tomorrows Technology Today

Software Archives

June 3, 2007

What and Who Are We?

Here at the Risk Factor blog, our focus is going to be on the risks and rewards of information systems and technology, or more broadly put, the social implications of IS&T. As moderator of this blog, my hope is that we can hold a conversation about what IS&T works and what doesn’t, what past, present and future IS&T trends portend, and, of course, why.

Joining me – Bob Charette – from time to time will be a number of guest bloggers from academia, industry and government who are involved in some of the more important IS&T risk and reward issues of the day. They are a pretty interesting group of folks.

There is Peter Ladkin, a Professor of Computer Networks and Distributed Systems in the Faculty of Technology at the University of Bielefeld. Peter specializes in the analysis of safety-related and safety-critical complex heterogeneous systems and their behavior, including accidents.

Next we have Phil Neches, who is one of America's leading technologists and a true database expert, among other things. Phil was Founder, Chief Scientist, and Vice President of Teradata Corp, and is heavily involved in venture capital investment.

Then there is Peter Neumann, a senior scientist at the SRI International Computer Science Laboratory. Peter, who is the moderator of the ACM Risk Forum newsgroup, has been looking at and discussing IS&T risks since nearly the inception of modern computing, and is in my opinion the most thoughtful and insightful commentator on the subject.

There is also Martyn Thomas, an expert in large, real-time, safety-critical, software-intensive systems. Martyn was the Founder of Praxis, the internationally recognized leader in the use of rigorous software engineering, including mathematically formal methods. He is a visiting professor at Oxford University, and is the first person to receive the Commander of the Order of the British Empire (CBE) award for “services to software engineering.”

Also joining us is John Stone, a Strategy Executive at the consulting firm Monroe Partners. John has worked in and written on all aspects of IS&T across a wide variety of industries, and brings a wealth of practical knowledge and experience in what it takes to create successful large-scale IS&T projects and programs.

Finally, there is Ed Yourdon, a recognized expert witness and computer consultant who specializes in project management, software engineering methodologies, and Web 2.0 development. For the couple of you who don’t recognize the name, Ed is one of the most influential voices and keen observers of what is happening in the IS&T industry.

I think you’ll agree, the folks above provide a pretty good initial set of eyes on the risks & rewards that IS&T create. Over time, I will be asking more guest bloggers involved in different parts of the IS&T field to join us to continue to enrich the conversation.

Machine Readable Information

An article that caught my eye a few weeks back reported the announced acquisition of the Reuters Group by the Thomson Corporation for over $17 billion. The combined company would be the largest financial news provider.

More interesting to me than the acquisition itself is the potential impact on future stock market trading. About one-third of stock market trading is currently performed through program or automatic trading. During the week of 14 – 18 May, for example, the New York Stock Exchange reported that “program trading amounted to 35.3 percent average of NYSE daily volume of 3,233.2 million shares, or 1,142.9 million program shares traded per day. This included program trading associated with the May 18 monthly expiration of stock-index options and futures.”

Program trading is inherently “backward looking” in the sense that the trades are automatically made based on price fluctuations that meet certain criteria. The focus in recent years has been on increasing the speed of such trades.

However, both Reuters and Thomson have been working on what is generally called machine readable news. For instance, a “Reuters system will 'read' news articles and score how positive or negative they are. The system will enable customers to analyse news across thousands of companies, far more quickly than can be done by humans. This will enable trading machines to react to market moving news in milliseconds.” Not only are current news stories being made “machine readable,” but Reuters is making its archives machine readable as well.
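To make the idea of "scoring" a story concrete, here is a toy sketch of one simple approach, a word-list (lexicon) score. It is only an illustration: the quoted description does not say how the Reuters system works, and the word lists and the `score_headline` function below are hypothetical.

```python
import re

# Hypothetical word lists; a production system would use a far larger, tuned lexicon.
POSITIVE = {"beats", "gain", "growth", "profit", "record", "upgrade"}
NEGATIVE = {"cuts", "decline", "loss", "lawsuit", "miss", "recall"}


def score_headline(text: str) -> float:
    """Return a score in [-1.0, 1.0]; 0.0 means neutral or no lexicon hits."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    hits = pos + neg
    # Net sentiment as a fraction of all sentiment-bearing words found.
    return 0.0 if hits == 0 else (pos - neg) / hits


if __name__ == "__main__":
    for headline in (
        "Acme posts record profit on strong growth",
        "Acme faces lawsuit after product recall",
    ):
        print(f"{score_headline(headline):+.2f}  {headline}")
```

Even a scorer this crude hints at the hoaxing risk: anyone who knows which words move the score can write a story to order.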

The Financial Times reports that Thomson has developed software that automatically generates news stories, and that the computers that “generate the stories work so fast an earnings story can be turned around within 0.3 seconds of a company making results public.” In addition, as noted in the FT story, program trading, “… is set to rise much further in the coming years as fund managers, along with brokers and exchanges, strive for ever-greater speed and control over the trading cycle amid heightened market competition and consolidation.”

The combination of incredibly fast automatic news generation with historical data used to predict market responses to such news may have some interesting effects on program trading. It will be interesting to see, as machine readable news becomes more available, whether the market becomes more volatile as a result, whether dangerous feed-forward loops are produced during boom times, or, more likely, whether individuals or governments make use of this capability to deliberately hoax financial markets for either personal or strategic gain.

A government-run news agency, for instance, could find it in its self-interest to plant a financial story involving some scarce resource – say petroleum – which could cause a panic in the market. By studying the conditions that caused market panics in the past, such an agency might turn planted news into a non-military but very effective weapon. Governments (and the exchanges) may want to start thinking not only about how financial companies could use all this information to create financial rewards, but also about how others could manipulate it to create major financial risks.

June 6, 2007

Whose Risk?

A nice little controversy concerning risk and IT systems has been brewing in the UK. As first reported by ComputerWeekly, government officials are ordering the destruction of what are called Gateway review reports. A Gateway review is “a ‘peer review’ in which independent practitioners from outside the programme/project use their experience and expertise to examine the progress and likelihood of successful delivery of the programme or project. They are used to provide a valuable additional perspective on the issues facing the internal team, and an external challenge to the robustness of plans and processes.” There are several “gateways” an individual UK government IT project is supposed to pass during its life, starting with Gateway 1 (Business Justification) to Gateway 5 (Operations Review & Benefits Realisation).

The reviews are meant for internal project consumption only, but there has been a long-standing demand by newspapers like ComputerWeekly and government critics to make the results of these reviews public. The demand centers on the Gateway reviews of two major UK IT projects in particular – the National Health Service electronic medical record project, the National Programme for IT (NPfIT), and the National Identity Scheme’s Identity Cards Programme – both of which are highly controversial, costly, and in trouble.

Supporting ComputerWeekly’s bid to have the Gateway Reports made public has been a ruling by the UK government’s Information Tribunal, an organization that hears appeals regarding whether government information should be publicly released or not, stating that the public interest trumps the desire of the government agencies to keep the reviews private. The UK Parliament’s Public Accounts Committee (PAC) also supported their disclosure.

However, the government – through the Office of Government Commerce (OGC), which oversees the Gateway review process – insists that making these reports public would fundamentally undermine their use. The OGC claims that IT program management would not get open and honest appraisals of their programs if the people involved knew that their private opinions would be made public.

I can sympathize with that view. Having conducted hundreds of risk assessments over my career, many of them high-profile government ones, I know there is something to be said for confidentiality. I promise confidentiality to programs as a matter of policy myself. Public disclosure will put people on their guard, and the tendency is for you to get optimistic, rather than realistic, estimates of the state of the project’s problems and risks.

When I was involved in the US DoD Tri-Service Assessment Initiative (TAI), program managers were the sole owners of the assessment reports. They could disclose them as they pleased. Our advice to program managers was that they should disclose the reports as widely as possible, since many of the problems and risks they faced were created by events and situations outside of their control, and they needed outside help to address them. What we did do, however, was to take the results of every project assessment, sanitize the results, and conduct analysis on the aggregate to try to discover systemic issues that were plaguing most DoD programs.

On the other hand, the public does have a right to know of the technical, financial, and social risks being taken in their name. Both the NPfIT and the Identity Cards programs will affect every person in the UK; both have seen major cost increases, and both raise major issues of privacy protection.

Also undercutting the OGC’s arguments somewhat is that many IT projects ignore the results of the Gateway reviews, including some that should never have been initiated or should have been cancelled more than once. Further, a report yesterday by the PAC on Delivering Successful IT-enabled Business Change states that many senior managers responsible for major IT programs are inexperienced, don’t pay much attention to the programs they are responsible for, and don’t seem to care much about the Gateway review or other risk reviews of their programs.

Also, one can’t help wondering whether the real reason that the OGC is so adamant about not wanting to make Gateway review reports public is plain, old embarrassment. As the US FBI found out with its Virtual Case File (VCF) project, not taking the warnings of outside reviewers seriously can end up making you a poster child of poor judgment, an eternal business case study, and also a laughing stock to all your peers.

It will be interesting to watch how the little rhubarb in the UK ends up. But it does raise a set of questions about the public’s right to know about the risks posed by large, government IT projects. How much should be disclosed? How does a program or project manager get honest opinions on the state of their project if everything can be disclosed? And don’t most government program managers have too many backseat drivers and second guessers in trail already?

June 10, 2007

A System Burp

There were news reports that an air traffic control computer failure in Atlanta on Friday caused cancellations and flight delays along the US East Coast. The Atlanta FAA computer processes pilots' flight plans and sends them to air-traffic controllers – when it failed, the Salt Lake City center took over, but it became overloaded and temporarily failed as well.

The Atlanta system failure lasted only from 0657 to just before 1100, but its effects were compounded by the thunderstorms that moved from the Midwest to the East Coast. Residual effects were still being felt this morning.

This is the third major computer problem in the past several months. On Friday, 25 May, at the start of the Memorial Day holiday weekend, the mapping software in the San Diego Terminal Radar Approach Control (TRACON) facility used by controllers to guide flights for 21 airports in the Southern California region, failed for about an hour when staff attempted to update the maps.

Then, early on Monday morning, 5 March, there was a software failure in the ATOP (Advanced Technologies and Oceanic Procedures) system that air traffic controllers in New York use to guide aircraft over the Atlantic Ocean. About two dozen flights were affected.

Until the FAA’s latest air traffic control (ATC) modernization effort, called NextGen, is complete – and that is not scheduled until 2025 according to current projections (and hopes) – and given the fragility of the current ATC computer and radar systems, one can expect more and more of these failures to occur. A complete system meltdown is probable in the next few years if there is a major computer or radar failure on a major travel weekend that happens during a spate of bad weather spanning several regions of the US. Just hope you aren’t flying when that happens.

June 17, 2007

Space Ho!

It looks like the six German-made, Russian-programmed computers on the International Space Station (ISS) are back up and running after a few days of tense troubleshooting to discover why they wouldn’t reboot properly. The computers, which control the ISS’s navigation and command and control systems, shut down last Wednesday, and there was trouble rebooting them. (A good timeline and incident details can be found at CBS News Space Place.)

These problems had been preceded by problems on Tuesday, when a computer crash prevented the ISS from immediately taking over gyroscopic control as planned from the docked shuttle Atlantis. During the computer rebooting sequence, a false fire alarm on the Russian segment of the ISS was sounded. Later Tuesday night, gyroscopic control was handed back to the ISS computers, although the reason for the computer crash was not understood. However, only one out of the three navigation and one out of the three command and control computers were working after the successful reboot.

Early Wednesday morning, while astronauts were outside working on retracting a solar array wing, the two remaining computers crashed, and none of the computers would reboot – a first in ISS history. If the computers could not be rebooted, the ISS would potentially have to be abandoned. Making the troubleshooting a bit harder was that the Russian Federal Space Agency Roskosmos does not have its own satellites which can communicate with the ISS, forcing Russian space engineers to wait until the ISS is within line-of-sight of Russian ground stations to downlink the telemetry needed for troubleshooting.

By yesterday afternoon, the computers were back up and working. There was a belief that there was a problem with the quality of power supply to the computers, possibly caused by the addition of new solar arrays. Russian astronauts used jumper cables to by-pass the computers’ surge protectors, and lo and behold, the computers booted up as normal. While this solution points to the source of the problem, the reasons why remain a mystery. NASA’s space station program manager Michael Suffredini probably summed it up best when he said, “As the station gets bigger, this potential [for problems] continues to grow. I think we’re going to find system sensitivities as we change the space station.”

There are a number of interesting aspects to this story. First, while the computers (and software) were designed to be redundant and independent, the power supplies to them don’t appear to be so. I bet that this issue is going to get a hard look in the next few weeks by NASA and Roskosmos.

Second, this episode will likely mean more consideration for possible unintended consequences to not only the computers but other systems and their interfaces aboard the ISS as it continues to be constructed. Even after all these years in space, surprises can still occur and nothing up there can be seen as ever being easy.

Third, the folks who are working on the Mars program are likely trying to figure out whether there is something they now need to be worried about. A mission to Mars could last well over two years, and any computer problems on that little voyage could spell big trouble.

Fourth, reliable computers are really, really important in space. This crash was not by any means the first, nor will it likely be the last. In 2001, during the shuttle Endeavour’s visit to the ISS, all three of the ISS command and control computers shut down, which was apparently caused by a bad hard drive.

Finally, having a really good tool kit around with lots of patch cables, jumper cables and spare parts is priceless. While it often appears to be, not every computer problem is a software problem.

July 8, 2007

Will It Ever End for the Folks At Enron?

Some 20,000 ex-Enron workers who finally received their first payment for some of their lost retirement funds were told that they had been underpaid (12,800 total) or, maybe worse, overpaid (7,700 total) because of a computer burp. Those overpaid are probably going to have to pay the money back.

Of course, the company involved could just reprogram the software to account for the over-payment/under-payment in the next payment due, but ...

If Not the Bank's Fault, Then Whose?

In another software burp reported last week, some Scotiabank customers in Vancouver, Canada were surprised to find that their pre-authorized payments had been withdrawn twice from their bank accounts.

I personally know the fun that can cause. Many years back, I tried to withdraw $50 at my local bank's ATM. I was informed that this wasn't possible, since my account was overdrawn by roughly $1.4 million. That was news to me. Since I discovered this on a Friday night, I had to stew on it until Monday morning.

A "small software problem" (the bank's terminology) caused my overdraft which in turn meant my pre-authorized payments (like my mortgage) weren't paid on time. It took a good long while to get this mess straightened out, especially with the credit scoring companies who saw that I had missed a whole bunch of payments. Try telling them that it was just a computer error. I stopped using pre-authorized payments after that little episode, as well as changed banks.

Anyway, what caught my eye in the article were some quotes attributed to a person at a local university who said that he "wasn't surprised to hear of a technical error with banking systems." Me neither - been there.

July 15, 2007

Burps of the Week

After a relatively quiet period, IS&T glitches popped up in several places.

In Japan, sales of some mobile phones made by Sony Ericsson resumed after they had been stopped on 4 July because of software problems that could erase stored telephone numbers, among other items.

A Target department store in San Diego, California saw its point-of-sale system fail for a few hours, which likely cost it tens of thousands of dollars.

A computer problem created headaches for motorists trying to renew their car registrations at Bureau of Motor Vehicles branches across Indiana. It had to happen, of course, on the day when many registrations expire.

A software error in cable boxes in Burbank, California shut down a cable television system for six hours which affected 35,000 customers.

A lawsuit was filed in West Virginia against telecommunication company FiberNet alleging "breach of contract, fraud and negligence, and seeks class-action status and unspecified punitive and compensatory damages." Seems that a computer problem stopped service for two days to about half of its 24,000 customers, which included hospitals, police, businesses, etc.

And my favorite: five hundred and eighty-seven patients at Northern Cochise Community Hospital in Willcox, Arizona received inaccurate hospital bills. The new billing software decided to keep adding the bill of the previous patient onto the next patient's bill. Someone received a bill for $49 million.

Hope that last one didn't put the person back into the hospital.
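For readers wondering how a billing program can produce that symptom, here is a minimal sketch of one classic way: a running total that is never reset between patients. This is a guess at the kind of defect involved, not a description of the hospital's actual software; the function names and sample charges are hypothetical.

```python
def bill_patients_buggy(charges_by_patient):
    """Reproduces the reported symptom: each bill carries over the earlier ones."""
    bills = {}
    total = 0  # BUG: initialized once, outside the loop, so it never resets
    for patient, charges in charges_by_patient.items():
        total += sum(charges)
        bills[patient] = total
    return bills


def bill_patients_fixed(charges_by_patient):
    """Each patient's total is computed from that patient's charges only."""
    return {patient: sum(charges) for patient, charges in charges_by_patient.items()}


if __name__ == "__main__":
    # Hypothetical charges in whole dollars, processed in insertion order.
    charges = {"patient_1": [100, 50], "patient_2": [20], "patient_3": [5]}
    print(bill_patients_buggy(charges))
    print(bill_patients_fixed(charges))
```

With a few hundred patients in a batch, the last person in line ends up holding the bill for everyone before them.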

July 23, 2007

Software Error - Go to Jail

About 30 patrons of the Caesars Indiana casino in Elizabeth, Indiana reportedly might be facing felony criminal charges for winnings that the casino claims are not theirs. Seems that there was a software error in the slot machine called Easy Money which registered $10 worth of credit for every dollar inserted. Caesars reported that it had lost $487K over the July 21 weekend.

Turns out this is not a new occurrence. The Majestic Star Casino in Gary, Indiana lost more than $300K in February to the same software problem. Seems strange that the problem wasn't fixed on every machine after that incident, or if it was, maybe the patch caused a new problem with the same result.

August 17, 2007

Skype Scuppered

Yesterday morning (about 0900 EDT in the US) the internet phone and messaging company Skype acknowledged that its users were experiencing log-on problems due to a software problem. The problem shut down the service to an unknown number of customers around the world, but it was likely in the millions.

According to the Financial Times of London late this afternoon, the problem has been fixed. Now let the debate begin as to whether this will harm Skype in particular or internet calling in general.

It will be interesting to see how similar this "software problem" will be to the one that happened on 15 January 1990 when AT&T suffered a massive failure of its long-distance service due to an elementary programming error.

August 20, 2007

Best Data Breaches Ever!

eWeek posted an on-line slide show listing the "Most Disastrous Data Breaches" since February 2005. They list 17 of them: 5 caused by outside hacking, 1 by insider theft, 5 by inadvertent posting of information, 5 by devices (laptop, memory stick) being stolen, and 1 caused by data being lost.

One of the seventeen listed was the discount retailer TJX. The company announced last week that the cost of its data breach last year, which affected 45.8 million of its customers, was likely to exceed $150 million, although given its previous estimates this probably understates the true cost by at least 100%. To quote TJX's press release:


In the second quarter of fiscal 2008, the Company recorded an after-tax cash charge of approximately $118 million, or $.25 per share, with respect to the previously announced computer intrusion(s). This charge includes $11 million (after tax), or $.02 per share, for costs incurred during the quarter, as well as a reserve of $107 million (after tax), or $.23 per share, for the Company's exposure to potential losses. This reserve reflects the Company’s estimation of probable losses, in accordance with generally accepted accounting principles, based on the information available to the Company as of August 14, 2007, and includes an estimation of total, potential cash liabilities from pending litigation, proceedings, investigations and other claims, as well as legal and other costs and expenses, arising from the intrusion(s). In addition, TJX expects to incur future non-cash charges of approximately $21 million (after tax), or $.05 per share, that are not included in this reserve and could be recorded in fiscal year 2009. Together, these cash and non-cash charges represent the Company’s best estimate of the total losses the Company expects to incur as a result of the computer intrusion(s).

And people still argue that organizational IT security rules are meant to be broken.

August 22, 2007

Talon Declawed

The US Department of Defense announced that it was shutting down its controversial Talon data gathering program.

Talon was established in 2002 by then-Deputy Defense Secretary Paul D. Wolfowitz as a way to collect and evaluate information about possible threats to U.S. servicemembers and defense civilians at stateside and overseas military installations. It is being closed because reporting to the system had declined significantly, and it was determined to no longer be of analytical value, said Army Col. Gary Keck, a Pentagon spokesman.

A reason for its shutdown was noted in an article in Government Executive:

A June 2007 report by the Defense Department's inspector general found that counterintelligence officials "maintained TALON reports without determining whether information on organizations and individuals should be retained for law enforcement and force-protection purposes."

In addition, the article notes that:

To ensure a mechanism to document and examine potential threats, Assistant Defense Secretary Paul McHale plans to propose a new, streamlined reporting system that can better meet the Pentagon's needs, an agency press release said. In the interim, Defense Department officials will send information pertaining to protection concerns to the FBI's Web-based threat tracking system.

What a "streamlined reporting system" means hasn't been explained, but past history says don't place bets that it isn't going to resemble a data vacuum cleaner.

August 25, 2007

LA School System BLUNDER

I have long argued that the IT community needs to separate IT failures from blunders.

Most organizations do not have enough IT project failures. The reason I say this is that, in my experience, most project cancellations (or escalations for that matter) are not true failures but instead represent blunders. There is a big difference. A project failure is one in which most project decisions and actions were correct at the time, but for some reason the project didn't work out. It is a professional project -- the project risks were assessed, managed, and accepted where required; the assumptions were checked; success criteria were defined; the plan was estimated and funded well; the stakeholders participated; and so on.

Project blunders, which I contend most project overruns and cancellations are, arise from Dilbert-like approaches to project management and implementation. There is little or no risk management, the project plan is a fantasy, stakeholder concerns are given short shrift, and on and on.

Well, in a distressingly familiar story in today's LA Times, yet another IT blunder is described. The lede paragraph reads as follows:

Since launching a $95-million computer system six months ago, the Los Angeles Unified School District has been beset by programming glitches, hardware crashes and mistakes by hurriedly trained clerical staff. The result: tens of thousands of teachers, cafeteria workers, classroom aides and others have been underpaid, overpaid or not paid at all.

Sounds like a blunder to me.

August 26, 2007

Is That Lead in Your Foot?

USA Today ran a small story last week on Nissan's plans to equip all of its cars and trucks with a dashboard gauge showing the fuel efficiency of one's driving. The gauge displays your instantaneously computed miles per gallon as a bar graph - the more fuel-efficiently you drive, the longer the bar displayed.

Nissan claims that, based on its in-house testing, drivers will cut their fuel consumption by 10%.

I bet if the price per gallon of gasoline were also displayed, or maybe the IRS standard cost per mile reimbursement rate (currently 48.5 cents per mile) used instead, people would drive even less. Seeing that the round trip to the local store ten miles away cost you $9.70 might give you an incentive to do it less.

Maybe Nissan will add a costing feature as well in the future. The average cost per gallon of gasoline or a total cost of driving per mile could be broadcast over a preset radio frequency, and then used to compute the cost per trip.

Given that Nissan's gauge looks software driven, this shouldn't be too difficult to add.
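As a rough sketch of how little computation such a costing feature needs, the two approaches mentioned above (a flat per-mile rate, or fuel price combined with measured miles per gallon) each come down to one line of arithmetic. The function names are hypothetical, and the fuel price and mileage in the usage lines are made-up inputs; only the 48.5 cents per mile rate comes from the text.

```python
IRS_RATE_PER_MILE = 0.485  # dollars per mile; the 2007 rate quoted above


def trip_cost_by_mileage(miles: float, rate_per_mile: float = IRS_RATE_PER_MILE) -> float:
    """Total cost of driving, using a flat all-in rate per mile."""
    return round(miles * rate_per_mile, 2)


def trip_cost_by_fuel(miles: float, mpg: float, price_per_gallon: float) -> float:
    """Fuel-only cost, using measured miles per gallon and a pump price."""
    return round(miles / mpg * price_per_gallon, 2)


if __name__ == "__main__":
    round_trip_miles = 2 * 10  # store ten miles away, there and back
    print(f"Per-mile rate: ${trip_cost_by_mileage(round_trip_miles):.2f}")
    # Hypothetical inputs: 25 mpg and $3.00 per gallon.
    print(f"Fuel only:     ${trip_cost_by_fuel(round_trip_miles, 25, 3.00):.2f}")
```

The per-mile figure for the 20-mile round trip matches the $9.70 in the example above; the fuel-only figure will always be lower, since the IRS rate also covers wear, insurance and depreciation.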

September 9, 2007

Maybe They'll All Quit

Zalmai Azmi, the FBI's CIO, was reported in Federal Computer Week as saying that, "Cultural differences are the biggest obstacle preventing intelligence agencies from starting information-sharing programs."

He reportedly went on to say that, “The introduction of new blood would help do things differently."

Good luck. I thought that too over thirty years ago when I worked as a junior engineer in the Defense Department. I still hold that same thought today.

I wonder if Azmi is hinting that there may be problems behind the scenes with Sentinel, the follow on to the infamous Virtual Case File system. Information-sharing is a critical aspect of Sentinel.

Is Azmi worried that even if Sentinel is built, FBI agents won't be inclined to use it, or they will find ways to keep information from being shared with other agencies?

September 20, 2007

You've Got To Be Kidding Me

In Allan Holmes's Tech Insider blog over at Government Executive magazine, he quotes part of the testimony that John Glaser, vice president and CIO for Partners Healthcare in Boston, gave to the Senate Committee on Veterans' Affairs regarding how easy it is to share electronic health records (EHRs). When Glaser was asked what the private sector experience was with sharing EHRs at the scale of what the VA and Defense are trying to do, he said:

"A common EHR? That's interesting to me. That's a codeword for, 'You got to be kidding me.'"

This was undoubtedly a splash of cold water on those Senators who think creating inter-operable EHRs is just a matter of a few lines of software code.

September 25, 2007

"One more such victory..."

The Wall Street Journal (subscription required) ran a story today on the implementation of a new Oracle payroll system for Arizona State (there is an earlier story here from California State University). The approach taken was a bit different, or in the words of Emeritus Professor Colin Tully, a planned failure.

On the advice of Arizona State's IT department, instead of designing and planning for a project that would take an originally estimated 4 years and cost $70 million, the University would try to do it for $30 million ($15 million for development, $15 million for maintenance over 5 years) in just 18 months. The reasoning was that there would be operational glitches whether the project took 4 years or 18 months, so what the hey, let's go for the short approach and fix everything in the wash.

To hit their target date, the University increased the number of programmers and consultants on the project and opted for a very "vanilla" payroll system (i.e., absolutely minimal customization). If required, Arizona State business processes would be changed to conform to the payroll system requirements rather than vice versa. For instance, instead of being paid on the 15th and 30th of the month, employees would now be paid every other Friday.

Well, the University got exactly what it hoped for - a buggy payroll system in need of fixing.

LA School System Update

The Los Angeles Unified School District recently decided to hire a monitor and spend another $10 million to try to remedy its payroll system problem.

It appears few think another $10 million is going to do the trick. Furthermore, with the amount of patching the system is undergoing, I guess that the system is now getting to that precariously fragile state in which every new patch risks causing cascading errors in areas of the system thought to be okay.

I wonder how long before the school district figures out that it can't make any major changes to its business procedures without risking a total meltdown?

Probably when new contract talks are held with the School District's employee unions, and District management sees how difficult it is for the IT system to meet any of the new contract terms and conditions. At that point, I wouldn't be surprised to see the system put out of its misery.

September 27, 2007

Small Glitch

The Seattle Times reported that a new $171K computer system that a Seattle school district used to develop student school bus schedules had a slight software "kink" that led to students riding on the wrong buses or getting off at incorrect stops.

My elder daughter suffered a similar fate a few years ago when, again thanks to a new bit of scheduling software, we were informed that her bus stop had not only moved significantly further away, but was now located underground by some 150 feet.

As happened to us, Seattle parents are finding it hard to get through to the proper authorities to correct the problem. My advice to Seattle parents - it will take a few more weeks to correct - and it will probably happen again next year.

October 7, 2007

LA Unified School District Payroll Saga Continues

The LA Times reported last week that yet another payday passed with erroneous paychecks for employees of the LA Unified School District (LAUSD). The number of errors has remained basically the same over the past three paydays.

The rush is on to try to fix things before the end of the year, when tax forms will be mailed out. This could prove to be a real issue for the thousands of employees who have been overpaid, who could find themselves in trouble with the US Internal Revenue Service and California tax authorities for underpaying their taxes.

The LAUSD is contemplating ".. a plan to designate overpayments as no-interest loans that would not be counted as income." I am not a certified public accountant (anyone out there who is?), but I would suspect that this information would still need to be reported to the IRS, at least by a person receiving this income. And if so, this means at a minimum a tax preparation day headache if additional tax forms are required to be filed.

LAUSD management "anticipated that the technological glitch at the root of the problem would be fixed before the next payday in November, but left open the possibility that it could take longer."

Bets, anyone?

October 9, 2007

Blaming the Software Again

Last week, the Massachusetts Division of Professional Licensure mailed 28 computer disks to 23 marketing agencies who requested the names of the 450,000 licensed professionals in Massachusetts; unfortunately, the disks also contained the professionals' social security numbers.

As of today, all but one of the disks have been recovered.

According to the Boston Globe, the spokesperson for the state Executive Office of Housing and Economic Development, which oversees the Massachusetts Division of Professional Licensure, blamed it on "a software failure during computer upgrades last month. An employee noticed the error a week later."

Now, if only I had a Euro or Canadian dollar for every time I heard that lame excuse, and another for every promise, made after such a problem occurs, of a thorough review of security procedures to keep it from ever happening again.

October 11, 2007

Boeing Dreamliner: Game of Software versus Systems Chicken

Boeing finally admitted yesterday that it was delaying the introduction of the 787 Dreamliner from May 2008 to the end of 2008. Just a few days ago, Boeing was insisting that it would make the May delivery, even if it had to work 24 hours a day to do so.

The delay stems from some production problems having to do with parts availability (e.g., "from fasteners .. to clips and brackets and small assemblies being provided further down in the supply chain") as well as - drum roll please - software.

The issues with software are "coding and integration." More time is required to let the software "mature" through additional testing.

Quoting from Scott Carson, Boeing Co., EVP, CEO Boeing Commercial Airplanes:

"But let me say this about the overall system integration work. We have had two or three software areas that have been on the critical path right along with the production build of the airplane. .... the software and structures work, were running neck and neck."

"As it became clear that in fact the most critical pacing item was the structures, this has actually given us a little bit of headroom on the software side. We're going to have much more time with the software in the lab, both in terms of maturing the individual software itself but also in integrating the software packages to assure service-ready functionality by the time the airplane flies. So the silver lining in this cloud tied to the structures work is we think it has given us some breathing room that is going to allow the software piece to be much more mature by the time the airplane flies."

Ah, the old game of software versus systems chicken to see who has to blink first, and admit they are more behind than the other guy.

Congratulations!! The software guys win.

How hard do you think the software folks were praying that those fasteners wouldn't be available before their tests were? So even though the software is late (and over-budget?), it doesn't look as bad as it would if software development alone were the hold-up to the Dreamliner's in-service date.

BTW, a note to the FAA - you may want to do some extra "maturity" checks on that software.

Question of Management

A friend of mine, John Stone, author of Developing Software Applications in a Changing IT Environment: Management Strategies and Techniques, recently e-mailed me a question: "After reviewing the IT blunders in your blog, it’s clear that although we continue to make substantial progress in technologies, IT management has made halting progress at best – with many projects failing in some way and leaving their companies and users to cope as best they can."

"The analogy that comes to mind is the US car industry in the ’60s and ’70s, when the 'Big Three' produced low quality products, leaving their customers to cope however they could. I wonder if we will see a similar exodus in IT – to hungry forward-thinking vendors where labor costs are low and education and quality are high?"

"What do you think your readers would opine?"

Anyone want to answer John?

October 15, 2007

Census Risk

As reported last week in Government Executive, the US Government Accountability Office (GAO) released a report (GAO-08-79) that discusses the four critical US Census Bureau information technology projects needed to support the 2010 census, several of which are over budget and behind schedule. The GAO report also noted that risk management practice on these Census IT projects is weak.

The Census is fast running out of time to fully field test its new approach using hand-held computers instead of paper-and-pencil methods to gather census information. While the Census is confident that its approach will work at the required time, others, such as myself, are less sanguine.

The Census's approach to managing risk as a whole, and the risk management used by Census contractors responsible for the individual Census IT projects, have not, shall we say, been as good as they could have been. Given that the effort was high risk from the very beginning, and that the results of a census have tremendous economic and political import, the risk management practice was woefully short of what it should have been. For US citizens' sake, let's hope the past management decisions taken at the Census don't lead to a major IT blunder.

October 16, 2007

Software-Supported Ticket Scalping

Los Angeles Federal Judge Audrey B. Collins issued a preliminary injunction yesterday against RMG Technologies, Inc., of Pittsburgh, Pennsylvania, ordering the company "to stop creating, trafficking in, or facilitating the use of computer programs that allow its clients to circumvent the protection systems in the ticketmaster.com web site." Users of RMG software, typically ticket brokers and some ticket scalpers, have used it to flood Ticketmaster with requests and obtain large blocks of tickets, denying consumers an opportunity to buy tickets.

According to a Wall Street Journal article, for a recent Hannah Montana concert, the retail price of a ticket was $63, but tickets were being sold for an average of $237. For some shows, according to the New York Times, the tickets were sold out in 12 minutes, and then appeared for sale on the internet at up to 10 times their face value. Ticketmaster said that for some shows, software "bots" were responsible for as much as 80% of all ticket requests.

October 19, 2007

What Really Happened with the FBI's Virtual Case File System?

As I noted yesterday, I was recently in Washington, D.C. attending a breakfast seminar sponsored by Government Executive magazine on the topic, "What Are the Essential Ingredients for a Successful Large IT Project?" The two gentlemen speaking were Randolph (Randy) Hite, Director, IT Architecture and Systems Issues, U.S. Government Accountability Office, and Zal Azmi, Chief Information Officer, Federal Bureau of Investigation. My previous post centered on Mr. Hite's comments; today I'll focus on Mr. Azmi's.

Azmi spoke of the current Sentinel project, the follow-on to the infamous Virtual Case File (VCF) system that failed so spectacularly a few years ago. IEEE Spectrum's Senior Associate Editor Harry Goldstein wrote an in-depth story on VCF.

According to the FBI, "Sentinel will consolidate and replace the FBI's legacy case management capabilities with an integrated, paperless file management and workflow system," and be implemented in four phases. Phase I was recently completed, and Phase II is ready to begin.

Azmi made the statement that Sentinel is not a technical program, but really a political program. He phrased it interestingly: Sentinel is the Bureau and the Bureau is Sentinel. In other words, the FBI's operations will be centered in Sentinel. If Sentinel fails as a program, the Bureau, by implication, fails as an organization.

Azmi went on to say that he briefs FBI Director Robert S. Mueller III on Sentinel's status every Wednesday at 2 PM, the DCI once a month, and Congressional folks once a quarter. I also know that Azmi holds a risk management review meeting once a day with the prime contractor. If Sentinel fails as a program, no one can say they didn't know its status.

Which brings me to something else Azmi discussed, and that is his early days on the VCF program. Azmi related that he was asked by Mueller in November of 2003 to look into the status of the VCF program. Azmi said that he had a meeting in mid-November attended by 46 people (I assume FBI management and contractors) who assured him that everything was great. He also said that he was told that only 68% of the test cases had been performed, and that the software problem reports were increasing, not decreasing. Given that this was only six weeks away from delivery, this was a bit disconcerting.

Azmi was also told by the contractor a few weeks later that a "draft" version of the software was going to be delivered. Azmi related that he had never heard of that term before, which brought a chuckle from the crowd.

All this was already on the public record.

But then, Azmi said something that really got my ears pointed.

Continue reading "What Really Happened with the FBI's Virtual Case File System?" »

October 31, 2007

What Does Microsoft Do With All That Error Data?

On a "good" day, some 50 gigabytes of error data flows into Microsoft, according to a story in today's Wall Street Journal (subscription required). Two dozen programmers pore over the data, looking for OS kernel and/or application problems resulting from design flaws, programming errors, resource conflicts, and other sorts of programmer and designer ingenuity.

Microsoft won't say where the majority of errors lie or who is at fault, nor give any details about how Vista, XP, Windows 98, and Windows 95 all compare, which is too bad. Nor does Microsoft say how errors are prioritized for repair, and whether those two dozen programmers get any say. It doesn't say how many 50-gigabyte days occur, either.

As I read the story, I got to wondering about those two dozen programmers who look over all the error data coming in. Do they get excited when a big day of error data hits? Do they take bets on when the first 60-gigabyte day will occur, or when the least busy day of the year will be? Do they have a list of known but obscure errors, and then try to guess (err.. predict) when each one will first show up? Is there a bell that gets rung when it does?

Also, is that position a stop on the way towards bigger and better things, or is it a career path all its own? Is there a title of Chief Error Guru? Do you move from a development team to this error discovery team, or vice versa? After being there a while, you must get a pretty good education as to what not to do in developing applications or OS kernels. Are those lessons learned promulgated throughout the company and to others in the software community?

Anyone out there who knows, let me know. I'm curious about the dirty two-dozen.

November 1, 2007

How do you spend £12.4bn over 10 years? Start by spending £2.4bn in 10 minutes

The BBC reported last week that the 2002 decision to move forward with the UK National Health Service's electronic health records program, the National Programme for IT (NPfIT), took place after a ten-minute presentation to then Prime Minister Tony Blair. The cost estimate for NPfIT - done basically on the back of an envelope - was for £2.4bn over three years, to which Blair in effect said, "Go for it."

Surprise, surprise, NPfIT is currently projected to cost £12.4bn over ten years, and even that estimate is likely severely optimistic. Tony Collins over at ComputerWeekly, who has been following the NPfIT situation for years, has all the gory details. Collins has been trying to get the minutes of the meeting released, which the government refuses to do, despite being directed to do so by the Information Commissioner.

The NHS has recently stated that regardless of the many problems the NPfIT has faced, it is highly successful, and that it is "so well advanced that the health service 'could no longer function' without it."

This is kind of like Homer Simpson saying, “I think Smithers picked me because of my motivational skills. Everyone says they have to work a lot harder when I’m around.”