Spectrum Online—Tomorrows Technology Today

Risk Analysis Archives

June 3, 2007

What and Who Are We?

Here at the Risk Factor blog, our focus is going to be on the risks and rewards of information systems and technology, or more broadly put, the social implications of IS&T. As moderator of this blog, my hope is that we can hold a conversation about which IS&T works and which doesn’t, what past, present and future IS&T trends portend, and, of course, why.

Joining me, Bob Charette, from time to time will be a number of guest bloggers from academia, industry and government who are involved in some of the more important IS&T risk and reward issues of the day. They are a pretty interesting group of folks.

There is Peter Ladkin, a Professor of Computer Networks and Distributed Systems in the Faculty of Technology at the University of Bielefeld. Peter specializes in the analysis of safety-related and safety-critical complex heterogeneous systems and their behavior, including accidents.

Next we have Phil Neches, who is one of America's leading technologists and a true database expert, among other things. Phil was Founder, Chief Scientist, and Vice President of Teradata Corp, and is heavily involved in venture capital investment.

Then there is Peter Neumann, a senior scientist at the SRI International Computer Science Laboratory. Peter, who is the moderator of the ACM Risks Forum newsgroup, has been looking at and discussing IS&T risks since nearly the inception of modern computing and is, in my opinion, the most thoughtful and insightful commentator on the subject.

There is also Martyn Thomas, an expert in large, real-time, safety-critical, software-intensive systems. Martyn was the founder of Praxis, the internationally recognized leader in the use of rigorous software engineering, including mathematically formal methods. He is a visiting professor at Oxford University and the first person to be made a Commander of the Order of the British Empire (CBE) for “services to software engineering.”

Also joining us is John Stone, a Strategy Executive at the consulting firm Monroe Partners. John has worked in and written on all aspects of IS&T across a wide variety of industries, and brings a wealth of practical knowledge and experience in what it takes to create successful large-scale IS&T projects and programs.

Finally, there is Ed Yourdon, a recognized expert witness and computer consultant who specializes in project management, software engineering methodologies, and Web 2.0 development. For the couple of you who don’t recognize the name, Ed is one of the most influential voices and keen observers of what is happening in the IS&T industry.

I think you’ll agree, the folks above provide a pretty good initial set of eyes on the risks & rewards that IS&T create. Over time, I will be asking more guest bloggers involved in different parts of the IS&T field to join us to continue to enrich the conversation.

Machine Readable Information

A story that caught my eye a few weeks back was the announced acquisition of the Reuters Group by the Thomson Corporation for over $17 billion. The combined company would be the largest financial news provider.

More interesting to me than the acquisition itself is the potential impact on future stock market trading. About one-third of stock market trading is currently performed through program or automatic trading. During the week of 14 – 18 May, for example, the New York Stock Exchange reported that “program trading amounted to 35.3 percent average of NYSE daily volume of 3,233.2 million shares, or 1,142.9 million program shares traded per day. This included program trading associated with the May 18 monthly expiration of stock-index options and futures.”

Program trading is inherently “backward looking” in the sense that the trades are automatically made based on price fluctuations that meet certain criteria. The focus in recent years has been on increasing the speed of such trades.

However, both Reuters and Thomson have been working on what is generally called machine readable news. For instance, a “Reuters system will 'read' news articles and score how positive or negative they are. The system will enable customers to analyse news across thousands of companies, far more quickly than can be done by humans. This will enable trading machines to react to market moving news in milliseconds.” Not only are current news stories being made “machine readable,” but Reuters is making its archives machine readable as well.
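
To make the idea of "scoring" a story concrete, here is a minimal sketch in Python. It is purely illustrative: the word lists and the score_headline function are my own hypothetical stand-ins, not the Reuters or Thomson systems, whose workings the quoted descriptions do not detail. It simply counts positive and negative words and returns a score between -1 and +1.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"beats", "record", "growth", "profit", "upgrade", "surge"}
NEGATIVE = {"misses", "loss", "recall", "downgrade", "lawsuit", "plunge"}


def score_headline(text):
    """Return a score in [-1.0, 1.0]: +1 is all positive words, -1 all negative."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    if total == 0:
        return 0.0  # no scored words, so treat the headline as neutral
    return (pos - neg) / total


if __name__ == "__main__":
    for headline in (
        "Acme beats forecasts on record profit",
        "Acme misses targets, faces lawsuit",
        "Acme holds annual meeting",
    ):
        print(f"{score_headline(headline):+.2f}  {headline}")
```

Note that a scorer like this looks only at the words. It has no notion of whether the story behind them is true.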

The Financial Times reports that Thomson has developed software that can automatically generate stories so fast that “an earnings story can be turned around within 0.3 seconds of a company making results public.” In addition, as noted in the FT story, program trading, “… is set to rise much further in the coming years as fund managers, along with brokers and exchanges, strive for ever-greater speed and control over the trading cycle amid heightened market competition and consolidation.”

The combination of incredibly fast automatic news generation with historical data used to predict market responses to such news may have some interesting effects on program trading. It will be interesting to see, as machine readable news becomes more available, whether the market becomes more volatile as a result, whether dangerous feed-forward loops are produced during boom times, or, more likely, whether individuals or governments will make use of this capability to deliberately hoax financial markets for either personal or strategic gain.

A government-run news agency, for instance, could find it in its self-interest to plant a financial story involving some scarce resource, say petroleum, which could cause a panic in the market. By studying the conditions that caused market panics in the past, such an agency might turn planted news into a non-military but very effective weapon. Governments (and the exchanges) may want to start thinking not only about how financial companies could use all this information to create financial rewards, but also about how others could manipulate it to create major financial risks.

June 6, 2007

Whose Risk?

A nice little controversy concerning risk and IT systems has been brewing in the UK. As first reported by ComputerWeekly, government officials are ordering the destruction of what are called Gateway review reports. A Gateway review is “a ‘peer review’ in which independent practitioners from outside the programme/project use their experience and expertise to examine the progress and likelihood of successful delivery of the programme or project. They are used to provide a valuable additional perspective on the issues facing the internal team, and an external challenge to the robustness of plans and processes.” There are several “gateways” an individual UK government IT project is supposed to pass during its life, starting with Gateway 1 (Business Justification) to Gateway 5 (Operations Review & Benefits Realisation).

The reviews are meant for internal project consumption only, but there has been a long-standing demand by newspapers like ComputerWeekly and government critics to make the results of these reviews public. Of particular interest are the Gateway reviews of two major UK IT projects: the National Health Service electronic medical record project, the National Programme for IT (NPfIT), and the National Identity Scheme’s Identity Cards Programme. Both are highly controversial, costly, and in trouble.

Supporting ComputerWeekly’s bid to have the Gateway Reports made public has been a ruling by the UK government’s Information Tribunal, an organization that hears appeals regarding whether government information should be publicly released or not, stating that the public interest trumps the desire of the government agencies to keep the reviews private. The UK Parliament’s Public Accounts Committee (PAC) also supported their disclosure.

However, the government, through the Office of Government Commerce (OGC), which oversees the Gateway review process, insists that making these reports public would fundamentally undermine their use. The OGC claims that IT program management would not get open and honest appraisals of their programs if the people involved knew that their private opinions would be made public.

I can sympathize with that view. Having conducted hundreds of risk assessments over my career and many high profile government ones at that, there is something to be said for confidentiality. I promise confidentiality to programs as a matter of policy myself. Public disclosure will put people on their guard, and the tendency is for you to get optimistic, rather than realistic, estimates of the state of the project’s problems and risks.

When I was involved in the US DoD Tri-Service Assessment Initiative (TAI), program managers were the sole owners of the assessment reports. They could disclose them as they pleased. Our advice to program managers was that they should disclose the reports as widely as possible, since many of the problems and risks they faced were created by events and situations outside of their control, which they needed outside help to address. What we did do, however, was to take the results of every project assessment, sanitize the results, and conduct analysis on the aggregate to try to discover systemic issues that were plaguing most DoD programs.

On the other hand, the public does have a right to know of the technical, financial, and social risks being taken in their name. Both NPfIT and the Identity Card programs will affect every person in the UK, and both have not only seen major cost increases but also raise major issues of privacy protection.

Also undercutting the OGC’s arguments somewhat is that many IT projects ignore the results of the Gateway reviews, including some that should never have been initiated or should have been cancelled more than once. Further, a report yesterday by the PAC on Delivering Successful IT-enabled Business Change states that many senior managers responsible for major IT programs are inexperienced, don’t pay much attention to the programs they are responsible for, and don’t seem to care much about the Gateway review or other risk reviews of their programs.

Also, one can’t help wondering whether the real reason that the OGC is so adamant about not wanting to make Gateway review reports public is plain, old embarrassment. As the US FBI found out with its Virtual Case File (VCF) project, not taking the warnings of outside reviewers seriously can end up making you a poster child of poor judgment, an eternal business case study, and also a laughing stock to all your peers.

It will be interesting to watch how the little rhubarb in the UK ends up. But it does raise a set of questions about the public’s right to know about the risks posed by large, government IT projects. How much should be disclosed? How does a program or project manager get honest opinions on the state of their project if everything can be disclosed? And don’t most government program managers have too many backseat drivers and second guessers in trail already?

June 11, 2007

What You Asked For But ....

The controversy over the drug-resistant TB patient Mr. Andrew Speaker who flew back to the US from Europe over his doctors’ objections, and his ability to enter the US even though he was on a travelers’ watch list, illustrates the very old IS&T designer admonition to users that, “It may be the system design you specified, but it isn’t what you wanted or needed.”

As you may recall, Mr. Speaker flew to Montreal from Prague and then drove into the US at the Champlain, New York border crossing as a deliberate means to get around the likelihood that he would be kept from flying directly back to the US from Europe because he would be on the US “no fly list.” The US Customs and Border Protection inspector saw that there was an alert on Mr. Speaker stating that if he should try to re-enter the US, he should be detained and isolated, and public health officials immediately contacted. Instead, the inspector ignored the warning and waved Speaker through because, according to reports, “he didn’t look sick.”

As additionally described in a Washington Post story, US Customs and Border Protection “ … officials testified that they caught the inspector's error only by a mix of caution and luck, because starting May 22 they had ordered a special, twice-a-day check of a database of airline reservations to see if Speaker had changed his expected June 5 return to the United States.

As it turns out, the database is linked to records that also show when a passport flagged by authorities has been swiped at a border crossing, as Speaker's did when he reentered at 6:18 p.m. on May 24.”

The Post story goes on to quote US Customs and Border Protection Commissioner W. Ralph Basham, as saying, “I'm not going to sit here and say the system worked. It may have worked the way it was designed, but it was not good enough.” No kidding.

To reduce the possibility of something like this happening again, US Customs and Border Protection officials are now saying they are putting new procedures in place. Of course, this won’t keep highly infectious and multi drug-resistant TB out of the US, which, Nils Daulaire, president of the Global Health Council, argues, requires a more active risk management approach to attack TB at its source.

To me, the risk of a single point of failure like a border official ignoring a warning is symptomatic of what happens in many information system designs. Few IT systems are ever examined in depth for their operational limitations after they are deployed, at least not until an incident like this one occurs. And in my experience, most limits turn out to be, as described by Harvard Business School professor Max H. Bazerman and INSEAD professor Michael D. Watkins, “predictable surprises.”

I'll be interested in seeing whether this event will trigger a wider review of the limitations of the Customs and Border system as well as its systemic role in being able to manage the risks of travelers having infectious diseases, but my expectations are not high for this happening any time soon.

June 13, 2007

Cost Benefit

There is an interesting paper written by Dan Geer appearing on the ACM Queue website titled, “The Evolution of Security” concerning the management of IS&T security risks. In 2003, you may remember, Geer published a controversial paper about the potential security problems of computing monocultures and Microsoft in particular as an example, which got Geer fired from his job at @stake.

Geer makes a number of good points in his paper but the one I especially liked was his spelling out the clear differences between cost benefit and cost effectiveness, to wit:

“…. where cost-benefit asks whether you would rather have the money or the benefit, cost effectiveness assumes that you will, indeed, spend the money and thus your interest is in how much benefit you can get for your money, not whether you would rather keep your money in the first place. This means asking questions such as, ‘Would you save more lives by spending the $10 billion on safer cars or on law enforcement?’ ‘Would you get better availability by spending the $1 million on 10 percent uptime or on instant recovery?’ ‘Would your own pursuit of happiness lead you to spend $100 on one fine dinner or on 20 lunches?’

CE is always tractable; CB is tractable only when the conversions of benefits to dollars are stable and noncontentious. To be blunt, CE is worth doing and CB is not. CE is decision support; CB is self-congratulation. If we are doing risk management rather than contemplating our navel or pandering to the electorate, then we must make decisions about allocating scarcity. We must remember that the purpose of risk management is to improve the future, not to explain the past.” Geer attributes this last sentence to Daniel Borge in his book, The Book of Risk.
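
Geer's distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the two options, their costs, the "outage hours avoided" benefit figures, and the dollars-per-hour conversion rates are all hypothetical numbers, not anything from Geer's paper. The point is that the cost-effectiveness choice needs no dollar conversion at all, while the cost-benefit answer flips depending on a conversion rate that people may well argue about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    cost: float     # dollars that would be spent
    benefit: float  # benefit in natural units (here: outage hours avoided per year)


def most_cost_effective(options, budget):
    """CE: the money will be spent; pick the affordable option with the most benefit per dollar."""
    affordable = [o for o in options if o.cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda o: o.benefit / o.cost)


def net_benefit(option, dollars_per_unit):
    """CB: convert the benefit to dollars and compare it with the cost."""
    return option.benefit * dollars_per_unit - option.cost


# Hypothetical options for a $1 million availability budget.
OPTIONS = [
    Option("redundant hardware", 1_000_000, 40.0),
    Option("instant recovery", 1_000_000, 55.0),
]

if __name__ == "__main__":
    best = most_cost_effective(OPTIONS, budget=1_000_000)
    print("CE choice:", best.name)
    for rate in (10_000, 30_000):  # contested value of one outage hour, in dollars
        print(f"CB at ${rate:,}/hour: net {net_benefit(best, rate):+,.0f}")
```

With these made-up numbers the CE ranking is the same whatever an outage hour is "worth," while the CB verdict on the very same option goes from a loss to a gain as the assumed conversion rate changes.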

Geer’s article is a good reprise of some of the fundamental issues of investing in risk management, and should be read. Once you have read it, you may want to look at yesterday’s column by Cindy Skrzycki in the Washington Post titled, “Does Cost-Benefit Matter?” Her column is on a recent report by the AEI-Brookings Joint Center for Regulatory Studies on the use of cost benefit by the US government to determine whether governmental regulations should or should not be put into place. As she notes, “The practice of estimating the costs and benefits of U.S. government regulations is ‘frequently done poorly,’ with scant evidence that it makes a difference on policymaking.” You can download the AEI-Brookings report which is titled, Has Economic Analysis Improved Regulatory Decisions?, here. This report, together with Geer’s article, gives a good sense of why cost benefit is difficult to do, and may not be the best measure for managing risk.

July 2, 2007

Health Information on the Web

In yesterday's London Telegraph, there was a story on how the NHS had a new website that was meant to help people understand their health risk. As the story describes, your risk was more a function of where you lived (i.e., your post code) than of your lifestyle or genetics. A 40-year-old woman living in central London was most likely to be hospitalized for breast cancer, but if she moved to Manchester, it would be for gynecological issues. Interesting, but useless from an individual decision making point of view. Or, as it was put in the story, "the British Medical Association (BMA), accused the Government of offering patients 'totally misleading and useless' information which only increases anxiety."

The article brings up once more the issue of the Web and its value in providing health information, as well as whether this information really informs or, worse, misinforms patients when they are trying to understand the risk(s) of a particular disease or treatment. There is an intersection of IT as information purveyor, health care, business and ethics, and risk analysis and management that has not been well explored, but definitely needs to be.

July 8, 2007

Life Imitates Art?

A couple of years back, I wrote a story for IEEE Spectrum on Why Software Fails. I opened with the story that has been floating around the software business for the past twenty years about the disappearing warehouse. Well, yesterday I read a story in the Wall Street Journal about another "disappearing" warehouse - this time to help hide accounting fraud.


August 1, 2007

Predictions of Risk

There are reports tonight of a bridge collapse in Minneapolis, Minnesota. As I write this, the number of dead and injured is unknown.

The reason I add this to a blog on IS&T failure and success is that I recently spoke with Dr. Henry Petroski, professor of civil engineering at Duke University, on success and failure of design, as articulated in his recent book, Success Through Failure. Dr. Petroski has written extensively on the history of bridge failure, and one of his predictions using historical evidence is that about every 30 years or so, there is a major bridge collapse that surprises everyone. We are/were overdue for one.

It is too early to tell why this bridge, which according to news reports is about 40 years old, collapsed. But we shouldn't be surprised if it turns out that it was because of a design flaw hidden in plain sight.


August 29, 2007

Small Things Can Lead to Big Risks

While not an IS&T related story, it is interesting from a speculative risk perspective. The London Telegraph had a nice little story on the auction of the key "believed to have fitted the locker that contained the binoculars for the crow’s nest."

As the story notes,

It is thought to have fitted the locker that contained the crow's nest binoculars, vital in detecting threats to the liner lurking in the sea in the pre-sonar days of 1912.
Catastrophically for the Titanic and the 1,522 lives lost with her, the key's owner, Second Officer David Blair, was removed from the crew at the last minute and in his haste forgot to hand it to his replacement.

This story should be a reminder that a small event in conjunction with a series of other improbable events can easily lead to disaster.

Without access to the glasses, the lookouts in the crow's nest were forced to rely on their eyes and only saw the iceberg when it was too late to take action.

BTW, the key is expected to bring between $125K and $150K at auction.

September 6, 2007

A Masterclass in Bad Decision-making

The UK Public Accounts Committee (PAC) published its report regarding The Delays in Administering the 2005 Single Payment Scheme in England. The delays are estimated to cost UK taxpayers some £500 million.

As reported in the London Times, "the Single Farm Payment Scheme, introduced two years ago, aimed to pay farmers for their stewardship of the land rather than the number of animals they reared for meat."

The Times went on to say that Edward Leigh, the Tory MP who chaired the review committee, said the farmers' payment project was “a masterclass in bad decision-making, poor planning, incomplete testing of IT controls, confused lines of responsibility, scant objective management information and a failure by the management team to face up to the unfolding crisis.” Sounds like a classic IT blunder to me.

The PAC report listed some 15 lessons learned, or maybe better put, not learned. As an example, this is from number 14:

"The implementation of the single payment scheme was subject to four Office of Government Commerce Gateway Reviews between May 2004 and February 2006, and three of these Reviews assessed the programme as "red". Development work on the computer system nevertheless continued and no contingency plan was invoked, despite limited confidence that the system would be ready on time. If 'red' reviews are to be taken seriously, departments need to be explicit about the circumstances in which they would lead to fundamental review or termination of a project."

Maybe the first lesson is to teach senior government IT managers that red means stop, green means go. Or maybe better, test them to see if they are color blind.

September 9, 2007

Maybe They'll All Quit

Zalmai Azmi, the FBI's CIO, was reported in Federal Computer Week as saying that, "Cultural differences are the biggest obstacle preventing intelligence agencies from starting information-sharing programs."

He reportedly went on to say that, “The introduction of new blood would help do things differently."

Good luck. I thought that too over thirty years ago when I worked as a junior engineer in the Defense Department. I still hold that same thought today.

I wonder if Azmi is hinting that there may be problems behind the scenes with Sentinel, the follow-on to the infamous Virtual Case File system. Information-sharing is a critical aspect of Sentinel.

Is Azmi worried that even if Sentinel is built, FBI agents won't be inclined to use it, or they will find ways to keep information from being shared with other agencies?

September 11, 2007

Shooting the Messenger

One of the first things one learns as a risk analyst is that you had better develop a tough skin. No one wants to hear about potential problems, and some people, as today's story in the Wall Street Journal (subscription required) points out, can get downright nasty about it.

In one example, the CEO of a software company got so angry about a product continuing to slip its schedule that he decided to make an example and fired the VP who told him about the latest slip. The CEO wanted to send a message, in other words: "I finally got so exasperated that I let the word go out that I simply did not want to hear any more 'excuses' about why the schedule could not be met".

Lo and behold, no one was willing to be the next messenger ("No one ever came forward to tell me the truth about the status of the program for a very long time thereafter"), which in the end, he admitted, cost the company even more time and money since no one was making decisions with objective information.

Funny how that CEO was evidently surprised by this - shoot people with bad news, get no more bad news, but reality still bites you in the butt anyway!

One of my favorite maxims in relation to risk communication is by the late Nobel Prize winning physicist Richard Feynman, who said in relation to the NASA Challenger disaster and NASA's reluctance to hear bad news, "For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled."

Again, you can shoot all the messengers you want (or embrace only those who bring you good news), but what is - is. Live with it.

September 12, 2007

Little Bits of Chaos: Systems Going Bad

"We don’t need hackers to break the systems because they’re falling apart by themselves,” said Dr. Peter Neumann in an New York Times article, "Who Needs Hackers?" discussing how IT systems are falling apart. Peter and several others discuss the increasing complexity of IT systems today, and how system design and development haven't been keeping up, often as a matter of convenience more than lack of knowledge (which I also argue in my IEEE Spectrum article on "Why Software Fails.")

Some 19 years ago to almost the day (11 September 1988), the NY Times published a story titled, "In Computer Behavior, Elements of Chaos." In this article, the late Dr. Alan Perlis postulated that the cause of the network breakdowns that were occurring with greater regularity during the late 1980s "lies in the inevitable disparity between the real world and the models used to simulate it. Even the finest computer simulation is only an approximation. At some point that cannot be determined in advance, the discrepancies between reality and the computer's simplified world view will lead to a chaotic breakdown."

"The only way we can improve our systems is to be prepared to continually redesign them when they fail - which they almost certainly will."

Some things never seem to change, eh?

September 16, 2007

Do You Know the Meaning of NO Review?

Homer Simpson: Facts are meaningless. You could use facts to prove anything that's even remotely true!

Last week, Sir Derek Wanless delivered his second review in the past five years on the UK National Health Service's efforts at modernization. According to the London Times, Wanless found that even after spending an additional £43 billion:

The money poured into the NHS has failed to produce a more efficient service, or to reduce unhealthy lifestyles.

As a result, more money will be needed.

The Guardian newspaper reported that Sir Derek's report included "a warning that slow progress on introducing new IT systems could seriously undermine the productivity gains envisaged in 2002." He recommended that "... the £12bn programme run by the NHS agency Connecting for Health should undergo detailed external scrutiny to ensure the benefits will outweigh the costs."


September 20, 2007

You've Got To Be Kidding Me

In Allan Holmes's Tech Insider blog over at Government Executive magazine, he quotes part of the testimony that John Glaser, vice president and CIO for Partners Healthcare in Boston, gave to the Senate Committee on Veterans' Affairs regarding how easy it is to share electronic health records (EHRs). When Glaser was asked what the private sector experience was with sharing EHRs at the scale of what the VA and Defense are trying to do, he said:

"A common EHR? That's interesting to me. That's a codeword for, 'You got to be kidding me.'"

This was undoubtedly a splash of cold water on those Senators who think creating inter-operable EHRs is just a matter of a few lines of software code.

September 23, 2007

"No press interest anticipated."

The Washington Post has a deeply disturbing article on the six wayward nuclear cruise missiles of a few weeks ago. A cascading chain of ignored safety procedures led to the nuclear missiles being loaded onto a B-52 bomber and flown unnoticed across the country in direct violation of 40 years of national policy.

As I mentioned in my previous post on the subject, it appears that risk management had become routine and therefore incredibly sloppy, even though weapons of mass destruction were involved. If some rightly worried military personnel had not leaked the episode to the Military Times, the whole thing might never have seen the light of day. The US Air Force even thought that the event was not going to cause much of a public furor, hence "No press interest anticipated." I think the Air Force Public Affairs Office needs a bit of a reality check if they thought that loose nukes were a non-public interest item.

From a risk management standpoint, what irritates me most is that this was a classic "predictable surprise." The article describes how the Air Force was warned in 1998 of "diminished attention for even 'the minimum standards' of nuclear weapons' maintenance, support and security"; how the Air Force Inspector General found in 2003 that half of the "nuclear surety" inspections conducted that year resulted in failing grades, the worst performance in probably 50 or more years; and how, in 2006, the Air Force eliminated a separate nuclear-operations directorate known informally as the N Staff, which closely tracked the maintenance and security of nuclear weapons in the United States and other NATO countries.

The Air Force claims that the N Staff functions were still being done by other Air Force units, but I doubt that these other units viewed their newly acquired mission as a high priority, given the daily stress of dealing with the wars in Iraq and Afghanistan.

The Associated Press reported that Secretary of Defense Robert Gates has asked that an outside review of the incident be conducted by the Defense Science Board, on top of the one being conducted internally by the Air Force. While the outside review is said not to be a reflection on whether the Air Force will conduct an honest review, it is hard to read it as other than a "trust, but verify" decision.

PS - Happy 60th Anniversary to the US Air Force.

IT Mercy Rule

Rule 4.10 (e) of Little League baseball states that: "If one team has a lead of 10 runs or more after the game becomes a regulation game, the game is over."

Sen. Thomas Carper, D-Del., suggested during last week's Senate Homeland Security and Governmental Affairs Subcommittee on Federal Financial Management, Government Information, Federal Services and International Security hearing on High Risk IT: Is Poor Management Leading to Billions in Waste? that something akin to the mercy rule needs to be invoked on government IT projects, according to Government Executive magazine. Carper reportedly said that, "Some of these [IT] projects can be extremely difficult to manage, and mistakes may be made along the way. But there are times when maybe we should accept our losses and end a failing project before we waste even more hard-earned taxpayer dollars."

According to the article, Karen Evans, the Administrator of the Office of Electronic Government and Information Technology (IT) at the Office of Management and Budget (OMB), told the Senators that OMB's list of expensive IT projects that require special attention from top management due to their complexity or potential risk has grown from 447 to 553 projects since February.
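
A quick back-of-the-envelope check on those two counts, sketched in Python (the function names are mine, used only for illustration). How big the jump looks depends on the base: measured against the February count of 447, the list grew by roughly 24 percent, while the 106 added projects make up about 19 percent of the new total of 553.

```python
def percent_change(old, new):
    """Growth relative to the starting count."""
    return (new - old) / old * 100.0


def share_of_new_total(old, new):
    """Added items as a share of the ending count."""
    return (new - old) / new * 100.0


if __name__ == "__main__":
    february, latest = 447, 553
    print(f"Increase over February count: {percent_change(february, latest):.1f}%")    # 23.7%
    print(f"New projects as share of total: {share_of_new_total(february, latest):.1f}%")  # 19.2%
```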

Evans tried to put a positive spin on the roughly 24% uptick this way: "Those figures can be misleading because not all projects on the High Risk List are technically 'at risk.'"

"A successfully performing project may still be classified as high-risk due to exceptionally high costs and or complexity. For example, all e-government initiatives have been determined to be 'high risk' and therefore are reported on agency quarterly reports."

Let me clue OMB in on something - if these high risk IT projects aren't "really" high-risk but OMB shows them as high-risk, OMB at the very least has a big risk measurement problem that needs to be immediately fixed.

Furthermore, given how OMB assesses high-risk IT projects, it is much more likely that OMB is significantly under-counting rather than over-counting the actual number of high-risk government IT projects.

It has been absolutely clear for the past several years that OMB is clueless when it comes to IT project risk, and that Sen. Carper is on the right track: kill off under-performing IT projects sooner, not later. It is the only merciful thing to do both for the project participants and taxpayers.

Calling the "A-Team" for a Low-Risk Project

Speaking of the IT mercy rule, the US Department of Homeland Security (DHS) Secure Border Initiative's SBInet Project 28 now appears to be 8 runs down, and it's the bottom of the fifth inning.

DHS Secretary Michael Chertoff told the House Committee on Homeland Security that payment to the Boeing Company, the prime contractor, was being suspended until it can prove that the "virtual fence" can be made to work. Seems that there is a "software glitch" and some integration issues that are causing problems.

Chertoff, however, is reportedly confident now that Boeing has "retooled their team on the ground and replaced some of the managers. ... They are now working through the problems of system integration as we speak. I think they put their A-team in place to do it."

Ahem, does this mean that Boeing has been using its B-Team?

Continue reading "Calling the "A-Team" for a Low-Risk Project" »

September 25, 2007

LA School System Update

The Los Angeles Unified School District recently decided to hire a monitor and spend another $10 million to try to remedy its payroll system problem.

It appears few think another $10 million is going to do the trick. Furthermore, with the amount of patching the system is undergoing, I guess that the system is now getting to that precariously fragile state in which every new patch risks causing cascading errors in areas of the system thought to be okay.

I wonder how long before the school district figures out that it can't make any major changes to its business procedures without risking a total meltdown?

Probably when new contract talks are held with the School District's employee unions, and District management sees how difficult it is for the IT system to meet any of the new contract terms and conditions. At that point, I wouldn't be surprised to see the system put out of its misery.

October 7, 2007

Space Station's Computer Failure: It Was Inevitable

James Oberg reports, in an IEEE Spectrum webcast, a very important story on the background to the NASA computer failure that occurred in June. Oberg's story states that, "The critical computer systems ... had been designed, built, and operated incorrectly—and the failure was inevitable. Only being so relatively close to Earth, in range of resupply and support missions, saved the spacecraft from catastrophe."

The problem was a cable short-circuit caused by moisture build-up, likely itself caused by a malfunctioning dehumidifier. But as Oberg writes, the short-circuit should not have caused the problems it did. "... in a shocking design flaw, there was a “power off” command leading to all three of the supposedly redundant processing units. The line was designed to protect the main computers, which are downstream of the power monitor, from power glitches too great for normal power filters to protect against. It does so by turning the computers off when it senses trouble. But in a failure unanticipated by its designers, this one command path itself was able to kill all three processing units due to a single corrosion-induced short."
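To make the design flaw concrete, here is a toy Python model of three redundant units that share one "power off" line. It is only an illustration of a single point of failure defeating redundancy; the class and function names are hypothetical and nothing here reflects the station's actual hardware or software.

```python
from dataclasses import dataclass


@dataclass
class ProcessingUnit:
    name: str
    powered: bool = True


class SharedPowerOffLine:
    """Hypothetical model: one command path wired to every redundant unit."""

    def __init__(self, units):
        self.units = units

    def assert_power_off(self):
        # A short on this single line looks like a "power off" command
        # to all units at once, regardless of their individual health.
        for unit in self.units:
            unit.powered = False


def system_alive(units) -> bool:
    """The redundant set survives as long as any one unit is still powered."""
    return any(unit.powered for unit in units)


def simulate():
    units = [ProcessingUnit(f"unit-{i}") for i in range(1, 4)]
    line = SharedPowerOffLine(units)

    units[0].powered = False          # independent fault in one unit
    after_single_fault = system_alive(units)

    line.assert_power_off()           # fault on the shared command path
    after_shared_fault = system_alive(units)
    return after_single_fault, after_shared_fault


if __name__ == "__main__":
    print(simulate())
```

In this sketch the redundancy absorbs the independent fault, but the fault on the shared line takes out every unit together, which is the pattern Oberg describes.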

As Oberg noted, if this had happened on the way to Mars, it would likely have resulted in loss of the crew. What's worse was the instinctive reaction of those involved: to assign blame instead of looking for the root cause of the problem or a means to mitigate it.

Everyone interested in risk assessments, communication and management should read it.

October 15, 2007

Census Risk

As reported last week in Government Executive, the US Government Accountability Office (GAO) released a report (GAO-08-79) that discusses the four critical US Census Bureau information technology projects needed to support the 2010 census, and notes that several of them are over budget and behind schedule. The GAO report also noted that risk management practice on these Census IT projects is weak.

The Census is fast running out of time to fully field test its new approach using hand-held computers instead of paper-and-pencil methods to gather census information. While the Census is confident that its approach will work at the required time, others, such as myself, are less sanguine.

The Census's approach to managing risk as a whole, and the risk management used by Census contractors responsible for the individual Census IT projects, has not, shall we say, been as good as it could have been. Given that the effort was high risk from the very beginning, and that the results of a census have tremendous economic and political import, the risk management practice was woefully short of what it should have been. For US citizens' sake, let's hope the past management decisions taken at the Census don't lead to a major IT blunder.

October 18, 2007

Be Realistic - Yeah, Right

I was in Washington, D.C. yesterday attending a breakfast seminar sponsored by Government Executive magazine on the topic, "What Are the Essential Ingredients for a Successful Large IT Project?" The two gentlemen speaking were Randolph (Randy) Hite, Director, IT Architecture and Systems Issues, U.S. Government Accountability Office and Zal Azmi, Chief Information Officer, Federal Bureau of Investigation. It was an interesting session for a number of reasons. In this post, I'll concentrate on what Mr. Hite had to say.

Hite was asked off the bat whether he thought that IT project management in the Federal government had improved over the past few years. Hite said that he believed that it had. He cited the fact that the number of IT projects on both the Office of Management and Budget watch and high risk lists has been steadily declining.

OMB evaluates IT project plans to see, in Hite's words, "Whether they are well-positioned to execute." The OMB watch list highlights projects that, in OMB's opinion, have "weaknesses" in their capital budget and planning submissions, while those projects on the high risk list are those requiring "special attention" from the highest level of management because they may be very costly or mission critical.

However, Hite also placed a very large caveat on his belief that things have improved: he said that GAO audits have found that many of the IT projects don't have any data to support their contention that they are "just fine, thank you."

When Hite said this, it was hard not to laugh out loud. What he had just said, in effect, was that government program managers have quickly learned to adapt in the face of increased oversight: they now know how to "game" their budget submissions to OMB and hide potential weaknesses that might gain them more management attention from above. Give credit where credit is due: government IT project managers are good at figuring out how to talk a good game.

Hite had more to say.

Continue reading "Be Realistic - Yeah, Right" »

October 19, 2007

What Really Happened with the FBI's Virtual Case File System?

As I noted yesterday, I was recently in Washington, D.C. attending a breakfast seminar sponsored by Government Executive magazine on the topic, "What Are the Essential Ingredients for a Successful Large IT Project?" The two gentlemen speaking were Randolph (Randy) Hite, Director, IT Architecture and Systems Issues, U.S. Government Accountability Office and Zal Azmi, Chief Information Officer, Federal Bureau of Investigation. My previous post centered on Mr. Hite's comments; today I'll focus on Mr. Azmi's.

Azmi spoke of the current Sentinel project, the follow-on to the infamous Virtual Case File (VCF) system that failed so spectacularly a few years ago. IEEE Spectrum's Senior Associate Editor Harry Goldstein wrote an in-depth story on VCF.

According to the FBI, "Sentinel will consolidate and replace the FBI's legacy case management capabilities with an integrated, paperless file management and workflow system," and be implemented in four phases. Phase I was recently completed, and Phase II is ready to begin.

Azmi made the statement that Sentinel is not a technical program, but really a political program. He phrased it interestingly: Sentinel is the Bureau and the Bureau is Sentinel. In other words, the FBI's operations will be centered in Sentinel. If Sentinel fails as a program, the Bureau by implication, fails as an organization.

Azmi went on to say that he briefs FBI Director Robert S. Mueller III every Wednesday at 2 PM on Sentinel's status, the DCI once a month, and Congressional folks once a quarter. I also know that Azmi holds a risk management review meeting once a day with the prime contractor. If Sentinel fails as a program, no one can say they didn't know its status.

Which brings me to something else Azmi discussed: his early days on the VCF program. Azmi related that he was asked by Mueller in November of 2003 to look into the status of the VCF program. Azmi said that he had a meeting in mid-November attended by 46 people (I assume FBI management and contractors) who assured him that everything was great. He said that he was also told that only 68% of the test cases had been performed, and that the software problem reports were increasing, not decreasing. Given that this was only six weeks away from delivery, this was a bit disconcerting.

Azmi was also told by the contractor a few weeks later that a "draft" version of the software was going to be delivered. Azmi related that he had never heard of that term before, which brought a chuckle from the crowd.

All this was already on the public record.

But then, Azmi said something that really got my ears pointed.

Continue reading "What Really Happened with the FBI's Virtual Case File System?" »

October 21, 2007

Deja Vu All Over Again

Last May, you may recall, TB patient Mr. Andrew Speaker flew back to the US from Europe over his doctors’ objections, and was able to enter the US even though he was on a travelers’ watch list. To reduce the possibility of something like this happening again, US Customs and Border Protection officials said that they were putting new procedures in place.

Well, last week it was disclosed that a Mexican national with multi-drug-resistant tuberculosis boarded 11 flights, at least one of them to the United States, and crossed the US border a total of 76 times. Customs and Border Protection (CBP) officials were warned on April 16 that this person was infected, but it took the Department of Homeland Security until June 7 to warn the inspectors on the border and the Transportation Security Administration to add this traveler to the travelers' watch list.

So there were actually two incidents, one highly publicized and one not, happening simultaneously. During the Speaker incident, DHS said that it was inexcusable what happened.

However, it is very clear that given the bad publicity of the Speaker case, senior DHS officials deliberately tried to keep this other traveler off the watch list until things quieted down a bit. The DHS, surprise, surprise, is not commenting on this latest "oops".

As I wrote before, I was skeptical that the Speaker incident would trigger a wider review of the limitations of the Customs and Border Protection automated travelers' watch system, as well as of its systemic role in managing the risks of travelers having infectious diseases. I guess I was more correct than I knew, unfortunately.

October 22, 2007

Who Owns You, Baby?

An interesting article was published in today's LA Times on a federally-funded identity-theft study performed by the Center for Identity Management and Information Protection (CIMIP), located at Utica College in New York. The study says that, contrary to popular belief, about half of identity theft is committed by strangers rather than by family or acquaintances, which is what others such as Javelin Strategy & Research and ID Analytics have reported. Both have strongly suggested that on-line ID theft was overblown, and that consumers shouldn't be worried about it.

Javelin said that the CIMIP study didn't contradict their work (which is funded by Visa USA, Wells Fargo & Co., and others with a vested interest in promoting on-line transactions) because the CIMIP study focused "on high-dollar cases" which would be "more likely to involve businesses, strangers and technology" than their broad base of consumer victims reached through telephone surveys.

Okay, sure.

Anyway, I think it is going to take some time to sort out who is at risk from whom, but regardless, on-line or off, it isn't getting any safer out there.

November 15, 2007

FBI Virtual Case File Opportunity Cost?

Nada Nadim Prouty, a Lebanese-born CIA officer and former FBI agent, pleaded guilty this week to charges that, among other things (like submitting forged documents to obtain American citizenship), she illegally sought classified information from FBI computers in September 2002 and June 2003 concerning the Islamic group Hezbollah.

According to the New York Times, the agent's sister and brother-in-law "attended a fund-raising event in Lebanon in August 2002 at which the keynote speaker was Sheikh Muhammed Hussein Fadlallah, the spiritual leader of Hezbollah. Sheikh Fadlallah has been designated by the United States government as a terrorist leader." She checked the FBI computers to see what information law enforcement had on relatives, as well as herself.

It is interesting to speculate whether Prouty would have dared to check the FBI files in June 2003 if the Virtual Case File had been visibly on track to be completed on time (December 2003 or June 2004, take your pick), and/or whether her 2002 or 2003 snooping would have been discovered in 2004, before she went to the CIA, rather than in 2007.


November 18, 2007

Subtle Chip or Application Math Errors Can Lead to Big Problems

Over the weekend, the New York Times ran an article on a potential IT security problem posed by errors in microprocessor chips such as the Intel Pentium error of a few years back or the recent Microsoft Excel spreadsheet bug.

Adi Shamir, a professor at the Weizmann Institute of Science in Israel and one of the three designers of the RSA public key algorithm, circulated a research note about how an attacker could exploit an undetected subtle math error and make breaking public key cryptography possible.

The Times article notes that Mr. Shamir believes that "if an intelligence organization discovered a math error in a widely used chip, then security software on a PC with that chip could be 'trivially broken with a single chosen message.' Executing the attack would require only knowledge of the math flaw and the ability to send a 'poisoned' encrypted message to a protected computer. It would then be possible to compute the value of the secret key used by the targeted system. With this approach, 'millions of PC’s can be attacked simultaneously, without having to manipulate the operating environment of each one of them individually.' "
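To give a feel for how a single wrong arithmetic result can expose a secret key, here is a toy Python sketch of the well-known fault attack on RSA implementations that use the Chinese Remainder Theorem (CRT). It is not taken from Mr. Shamir's research note; it is a related, simpler technique, and the tiny primes, the simulated "math error", and all function names are my own assumptions for illustration.

```python
from math import gcd

# Toy key material (far too small for real use; chosen only for illustration).
P, Q = 1009, 1013
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse; needs Python 3.8+


def crt_decrypt(c: int, faulty: bool = False) -> int:
    """RSA private-key operation using CRT, optionally with a simulated math error."""
    m_p = pow(c, D % (P - 1), P)
    m_q = pow(c, D % (Q - 1), Q)
    if faulty:
        # Stand-in for a hardware bug that corrupts only the mod-Q half.
        m_q = (m_q + 1) % Q
    q_inv = pow(Q, -1, P)
    h = (q_inv * (m_p - m_q)) % P
    return m_q + h * Q  # Garner recombination


def recover_factor(c: int) -> int:
    """What an attacker can compute from one faulty result and the public key."""
    bad = crt_decrypt(c, faulty=True)
    # The faulty output is still correct mod P but wrong mod Q,
    # so this difference is a multiple of P and not of Q.
    return gcd((pow(bad, E, N) - c) % N, N)


if __name__ == "__main__":
    message = 123456
    print("recovered factor:", recover_factor(message))
```

Once one prime factor of the modulus is known, the private key follows directly, which is why one chosen message can be enough in an attack of this general shape.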

It isn't believed that this technique is being used - yet. It still seems easier to poison PC components such as hard drives at the factory, as recently happened with Seagate Maxtor drives made in Thailand, which came pre-loaded with password-stealing Trojan horses.