30 January 2014

SMotW #90: % of business units with proven I&A

Security Metric of the Week #90: proportion of business units using proven identification and authentication mechanisms

This metric hinges on the meaning of "proven". Proof is a relative term. What level of proof is appropriate? It's a matter of assurance, trust and risk.

ACME managers implicitly assumed* that the metric would be self-measured and reported by business units. Given a central mandate from HQ to implement specific controls, business units are obviously under pressure to confirm that the required controls are in place ... even if they actually are not. Aside from the risk of business units simply reporting whatever HQ expects to hear, there is also a distinct possibility that the business units might have misunderstood the requirement, and failed to implement the control effectively (perhaps mis-configuring their security systems).

That brings us to the matter of the nature and extent of control implementation. If a business unit has the required identification and authentication (I&A) mechanism in place for some but not all of their systems, how should they report this? What if they have made a genuine effort to implement it on most systems, but the few that remain are particularly important ones? What if the identification part is working as per the spec but the authentication isn't, perhaps using a different mechanism for valid business or technical reasons? There are several variables here, making it tough to answer honestly a typically naive checklist question such as "Are your IT systems using the proven I&A mechanisms required in the corporate security standards (Y/N)?"
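To make the point concrete, here is a minimal sketch of how much a Y/N answer throws away. The record layout, the unit names and all the numbers are hypothetical, invented purely for illustration; nothing here comes from ACME or the book.

```python
from dataclasses import dataclass


@dataclass
class UnitReport:
    """Hypothetical self-report from one business unit."""
    name: str
    systems_total: int
    systems_compliant: int          # systems using the mandated I&A mechanism
    critical_noncompliant: int = 0  # important systems still outside the control


def checklist_answer(report):
    """What a naive Y/N question collapses the report into."""
    return report.systems_compliant == report.systems_total


def coverage(report):
    """Percentage of the unit's systems using the mandated mechanism."""
    return 100.0 * report.systems_compliant / report.systems_total


# Invented figures for three imaginary business units.
reports = [
    UnitReport("Unit A", 40, 40),
    UnitReport("Unit B", 50, 47, critical_noncompliant=3),
    UnitReport("Unit C", 20, 5),
]

yes_units = sum(checklist_answer(r) for r in reports)
print(f"{100 * yes_units / len(reports):.0f}% of units answer Y")
for r in reports:
    print(f"{r.name}: {coverage(r):.0f}% coverage, "
          f"{r.critical_noncompliant} critical systems outstanding")
```

In this made-up data, Unit B and Unit C would both have to answer "N", although one is nearly there (with its important systems still outstanding) and the other has barely started. Note too that the figures are still self-reported, so the sketch does nothing for the trust problem.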

On that basis, the managers gave this metric a PRAGMATIC score of just 44%, held back by abysmal ratings for Genuineness and Independence (see page 207 in PRAGMATIC Security Metrics). 

The metric is not necessarily dead in the water, though, since it would be possible to address their main concerns through some form of independent assessment and reporting of the I&A mechanisms. Certifying IT systems is something rarely seen outside large military and governmental organizations, who have the governance structures in place to:
  1. Define security requirements including technical controls such as specified I&A mechanisms, methods, software etc.;
  2. Mandate those requirements on the various business units;
  3. Implement the controls locally, often with central support (e.g. technical support plus standards, procedures and guidelines);
  4. Accredit certification functions who are competent to test and certify business units' compliance with the security requirements;
  5. Test and certify the business units, and re-test and re-certify them periodically;
  6. Deal with any noncompliance.
That little lot would generally be viewed as an expensive luxury for most organizations (impacting the metric's Cost-effectiveness rating), although the global spread of ISO/IEC 27001 certification is gradually assembling most of those pieces, and making more organizations familiar with the concept of accredited certification.

Meanwhile, ACME quietly parked this metric in the "too hard for now" bucket, pressing ahead with the higher-scoring metrics still on their shortlist.

* PS Unless someone present happens to notice and point out assumptions like this, they tend to remain unspoken, and are a frequent cause of misunderstandings. At some stage (perhaps after a PRAGMATIC workshop has shortlisted a reasonably small number of metrics thought worth pursuing), the metrics ought to be specified in sufficient detail to dispel such doubts. Several security metrics standards and websites give examples of the forms typically used to specify metrics, although most appear obsessed with the statistics, often neglecting valuable information such as the reasoning behind and justification for the metrics, the intended audiences and so forth. I'm sure "How should we specify security metrics" would spawn an interesting thread on the Security Metametrics group on LinkedIn ...

23 January 2014

SMotW #89: number of infosec events

Security Metric of the Week #89: number of information security events, incidents and disasters

This week, for a change, we're borrowing an analytical technique from the field of quality assurance called "N why's" where N is roughly 5 or more.

Problem statement: for some uncertain reason, someone has proposed that ACME might count and report the number of information security events, incidents and disasters.
  1. Why would ACME want to count their information security events, incidents and disasters?
  2. 'To know how many there have been' is the facile answer, but why would anyone want to know that?
  3. Well, of course they represent failures of the information risk management process. Some are control failures, others arise from unanticipated risks materializing, implying failures in the risk assessment/risk analysis processes. Why did the controls or risk management process fail?
  4. Root cause analysis reveals many reasons, usually, even though a specific causative factor may be identified as the main culprit. Why didn't the related controls and processes compensate for the failure?
  5. We're starting to get somewhere interesting by this point. Some of the specific issues that led to a given situation will be unique, but often there are common factors, things that crop up repeatedly. Why do the same factors recur so often?
  6. The same things keep coming up because we are not solving or fixing them permanently. Why don't we fix them?
  7. Because they are too hard, or because we're not trying hard enough! In other words, counting infosec events, incidents and disasters would help ACME address its long-standing issues in that space.
There's nothing special about that particular sequence of why's nor the questions themselves (asking 'Who?', 'When?', 'How?' and 'What for?' can be just as illuminating); it's just the main track my mind followed on one occasion. For instance, at point 5, I might equally have asked myself "Why are some factors unique?". At point 3, I might have thought that counting infosec incidents would give us a gauge for the size or scale of ACME's infosec issues, raising the question "Why does the size or scale of the infosec issues matter?". N why's is a creative technique for exploring the problem space, digging beneath the superficial level.

The Toyota Production System uses techniques like this to get to the bottom of issues in the factory. The idea is to stabilize and control the process to such an extent that virtually nothing disturbs the smooth flow of the production line or the quality of the final products. It may be easy for someone to spot an issue with a car and correct it on the spot, but it's better if the causes of the issue are identified and corrected so it does not recur, or even better still if it never becomes an issue at all. Systematically applying this mode of thinking to information security goes way beyond what most organizations do at present. When a virus infection occurs, our first priority is to contain and eradicate the virus: how often do we even try figuring out how the virus got in, let alone truly exploring and addressing the seemingly never-ending raft of causative and related factors that led to the breach? Mostly, we don't have the luxury of time to dig deeper because we are already dealing with other incidents.

Looking objectively at the specific metric as originally proposed, ACME managers gave it a PRAGMATIC score of 49%, effectively rejecting it from their shortlist ... but this one definitely has potential. Can PRAGMATIC be used to improve the metric? Obviously, increasing the individual PRAGMATIC ratings will increase the overall PRAGMATIC score since it is simply the mean rating. So, let's look at those ratings (flick to page 223 in the book).
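Since the score is just the mean of the ratings, the arithmetic is easy to sketch. The criterion names below follow the PRAGMATIC acronym as I understand it, and the ratings are hypothetical placeholders, NOT the figures on page 223 of the book.

```python
from statistics import mean

# The nine PRAGMATIC criteria (names assumed from the acronym).
CRITERIA = (
    "Predictive", "Relevant", "Actionable", "Genuine", "Meaningful",
    "Accurate", "Timely", "Independent", "Cost-effective",
)


def pragmatic_score(ratings):
    """Return the overall score: the plain mean of the nine percentage ratings."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return mean(ratings[c] for c in CRITERIA)


# Hypothetical ratings: 60% everywhere except a zero for Actionability.
example = dict.fromkeys(CRITERIA, 60)
example["Actionable"] = 0

before = pragmatic_score(example)
after = pragmatic_score({**example, "Actionable": 60})
print(f"before: {before:.1f}%  after: {after:.1f}%")
```

With nine equally weighted criteria, a single zero costs the overall score one ninth of whatever that rating might otherwise have been, which is why the weakest ratings are the obvious place to start improving a metric.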

In this case, the zero rating for Actionability stands out a mile. Management evidently felt totally powerless, frustrated and unable to deal with the pure incident count. The number in isolation was almost meaningless to them, and even plotting the metric over time (as shown on the example graph above) would not help much. Can we improve the metric to make their job easier?

As indicated at item 7 above, this metric could help by pointing out how many information security events, incidents and disasters link back to systematic failures that need to be addressed. Admittedly, the bare incident count itself would not give management the information needed to get to that level of analysis, but it's not hard to adapt and extend the metric along those lines, for instance categorizing incidents by size/scale and nature/type, as well as by the primary and perhaps secondary causative factors, or the things that might have prevented them occurring.

A pragmatic approach would be to start assigning incidents to fairly crude or general categories, and in fact this is almost universally done by the Help Desk-type functions that normally receive and log incident reports - therefore the additional information is probably already available from the Help Desk ticketing system. Management noting a preponderance of, say, malware incidents, or an adverse trend in the rate of incidents stemming from user errors, would be the trigger to find out what's going wrong in those areas. Over time, the metric could become more sophisticated with more detailed categorization etc.
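As a rough sketch of that first, crude step, the snippet below counts incidents by category and shows the monthly trend for one category. The ticket data, the category names and the (month, category) export format are all hypothetical; a real Help Desk system will have its own export and its own categories.

```python
from collections import Counter

# Hypothetical export from a Help Desk ticketing system: (month, category).
tickets = [
    ("2013-11", "malware"), ("2013-11", "user error"),
    ("2013-12", "malware"), ("2013-12", "malware"), ("2013-12", "user error"),
    ("2014-01", "user error"), ("2014-01", "user error"),
    ("2014-01", "lost device"), ("2014-01", "user error"),
]


def counts_by_category(tickets):
    """Total incidents per category, most common first."""
    return Counter(category for _, category in tickets).most_common()


def monthly_trend(tickets, category):
    """Incident count per month for one category, in month order."""
    per_month = Counter(month for month, cat in tickets if cat == category)
    months = sorted({month for month, _ in tickets})
    return [(m, per_month[m]) for m in months]


print(counts_by_category(tickets))
print(monthly_trend(tickets, "user error"))
```

In this invented data set, user errors are both the largest category and the one that is rising, which is exactly the kind of prompt that should send someone off to ask "Why?" a few more times.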

20 January 2014

7 things you should know about infosec metrics

A new two-page Educause paper by Shirley C. Payne from the University of Virginia and Stephen A. Vieira from the Community College of Rhode Island succinctly explains the purpose and utility of information security metrics.
"An information security metric is an ongoing collection of measurements to assess security performance, based on data collected from various sources. Information security metrics measure a security program’s implementation, effectiveness, and impact, enabling the assessment of security programs and justifying improvements to those programs. Effective metrics can bring visibility and awareness to the underlying issue of information security and highlight effective efforts through benchmarking, evaluation, and assessment of quantified data. This can put institutions in a proactive stance regarding information security and demonstrate support for leadership’s priorities."

Although written for educational institutions, the principles are universally applicable to any organization that secures information.

By referring specifically to IT security and the IT function, the paper introduces a subtle bias towards technical metrics. Personally, I would have emphasized using enterprise and information security strategies rather than IT to drive the selection of metrics - but that's a small quibble with an otherwise well-written paper.

16 January 2014

SMotW #88: security ascendancy

Security Metric of the Week #88: information security ascendancy level

One of the most frequent complaints from information security professionals is that they don't get sufficient management support. They say that management doesn't take information security seriously enough, relative to other corporate functions. But are they right to complain, or are they just whining?

There are several possible metrics in this space, for example:
  • Survey management attitudes towards information security, relative to other concerns;
  • Compare the information security budget (revenue and capital charges) against other functions;
  • Assess the maturity of the organization's governance of information security;
  • Measure the level of the most senior manager responsible for information security ("security ascendancy").
The last of these is the simplest and easiest to measure. On the organogram above, the organization presumably scores 2 since it has a Chief Information Security Officer who reports directly to the Chief Executive Officer, the most senior manager in the firm. However, if the CEO takes a personal and direct interest in information security, the score might reach 1 (perhaps depending on whether information security is formally acknowledged as part of the CEO's role in his role description).

The power and influence of the function across the organization decreases with each additional layer of management between it and the CEO. If it is down at level 4 or 5, buried out of sight in the depths of IT (as is often the way), its influence is largely constrained to IT, meaning that it is essentially an IT security rather than information security function. However, since IT typically pervades the business, that is not necessarily the end of the world: with competent and dedicated professionals on board, the Information Security function can still build a strong social network, prove its worth, and influence colleagues by informing and persuading them rather than using positional power. Sure it's hard work, but it's possible.
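Measuring ascendancy amounts to counting the layers in the reporting line. Here is a minimal sketch, assuming the org chart is available as a simple "reports to" mapping; the roles and reporting lines shown are hypothetical.

```python
# Hypothetical reporting lines: each role maps to the role it reports to.
REPORTS_TO = {
    "CEO": None,
    "CISO": "CEO",
    "CIO": "CEO",
    "IT Operations Manager": "CIO",
    "IT Security Manager": "IT Operations Manager",
}


def ascendancy_level(role, reports_to):
    """Count management layers from the top: the CEO is 1, a direct report is 2."""
    level = 1
    seen = {role}
    while reports_to[role] is not None:
        role = reports_to[role]
        if role in seen:
            raise ValueError("reporting lines form a loop")
        seen.add(role)
        level += 1
    return level


print(ascendancy_level("CISO", REPORTS_TO))
print(ascendancy_level("IT Security Manager", REPORTS_TO))
```

A CISO reporting straight to the CEO comes out at level 2, while a security manager buried in IT operations comes out at level 4. The number says nothing about how well the function builds influence by other means, of course.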

ACME scored this metric highly at 85% on the PRAGMATIC scale (see the book for the detailed score breakdown). It was welcomed as a strategic metric that directly supported ACME's strategy to improve the organization's focus on information security, one that had value in the short to medium term (i.e. not necessarily a permanent security metric).

08 January 2014

SMotW #87: visitor/employee parking separation

Security Metric of the Week #87: distance separating employee from visitor parking

Imagine your corporate security standards require that "Employee parking spaces must be physically distant from visitor parking spaces, separated by at least 100 paces". The rule might have been introduced in order to reduce risks such as employees covertly passing information to visitors between vehicles, or terrorists triggering vehicle bombs in the vicinity of key employees, or for some other reason (to be honest, we're not exactly sure of the basis - a common situation with big corporations and their thick rulebooks: the rationale often gets lost or forgotten in the mists of time). Imagine also that senior management has determined that the security standards are important, hence compliance with the standards must be measured and reported across the corporation. Forthwith!

Now picture yourself in the metrics workshop where someone proposes this very metric. They painstakingly point out the specific rule in the rulebook, noting that the distance between employee and visitor parking is something that can be measured easily on the site plans, or paced out in the parking lot. As far as they are concerned, this metric fits the bill. It is cheap, elegant even, hard to fake and easily verified. "If HQ wants compliance metrics, compliance metrics is what they'll jolly well get!"

It soon becomes abundantly clear that the proposer has ulterior motives. Rather than proactively supporting HQ, his cunning plan is to undermine the effort through passive resistance. A metric that technically fulfills the requirement while providing no useful information would be perfect!

As the group tries ever harder to dismiss the metric, so the proposer digs in deeper until he is fully entrenched. By this stage, it is definitely "his" metric: he takes any hint of criticism personally, and seemingly has an answer for everything. Tempers fray as the heat exceeds the light output from the discussion.

PRAGMATIC to the rescue! In an attempt to defuse the situation, someone suggests working through the method and scoring the metric as a team effort. Dispassionately considering the PRAGMATIC criteria one by one, and allowing for the metric's plus points, leads to a final score of just 41% ... and a big thumbs-down for this metric.

Measuring health risks

I think it's fair to say that metrics is a "challenging" topic across all fields, not just information security. The issues are not so much with the actual mathematics and statistics (although it is all too easy for non-experts like me to make fundamental mistakes in that area!) as with what to measure, why it is being measured, and how best to measure, report and interpret/use the information.

As a reformed geneticist, here's an example I can relate to: measuring and reporting health risks resulting from off-the-shelf DNA test kits. A journalist for the New York Times took three different tests and compared the results. 

Underlying the whole piece is the fact that we're talking about risks or probabilities, with inherent uncertainties. The journalist identified several factors with these tests that make things even less certain for customers.

For a start, the three test companies appear to be testing for their own unique batteries of disease markers, which immediately introduces a significant margin for error or at least differences between them. To be honest, I'm not even entirely certain that all their markers are valid. I don't know how they (meaning both the markers and the companies) are assessed, nor to what extent either of them can be trusted.

Secondly, the test results were reported relative to 'average' incidence rates for each disease, using different averages (presumably separate data sets, quite possibly means of samples from entirely different populations!). This style of metric reporting introduces the problem of 'anchoring bias': the average numbers prime the customers to interpret the test results in a certain way, perhaps inappropriately.

Thirdly, except in a few specific situations, our genes don't directly, indisputably cause particular diseases: most of those disease markers are correlated to some extent with a predisposition to the disease, rather than being directly causative. If I have a marker for heart disease, I may be more likely to suffer angina or a heart attack than if I lacked the marker, but just how much more likely is an open question since it also depends on several other factors, such as whether I smoke, over-eat or am generally unfit - and some of those factors, and more besides, are themselves genetically-related. There are presumably genetic 'health markers' as well as 'disease markers', so someone with the former might be less prone to the latter.

A fourth factor barely noted in the NY Times piece concerns the way the results are reported. In a conventional clinical setting, diagnostic test results are interpreted by specialists who truly understand the tests, the natural variation between people, and the implications of the results, given the context of the actual patient (particularly the presence/absence, nature and severity of other symptoms and contributory factors). The written lab test reports may highlight specific values that are considered outside the normal range, but what those numbers actually mean for the patient is left to the specialists to determine and explain. In cutting out the specialists, the off-the-shelf test kit companies are left giving their customers general advice, no doubt couched very carefully in terms that avoid any liability for mistakes. On top of that, they have a responsibility to avoid over- and under-playing the risks, implying a neutral bias. In the doctor's surgery, the doc can respond to your reactions, give you a moment to let things sink in, and offer additional advice beyond the actual test results. That interaction is missing if you simply get a letter in the mail. 

There's a fifth factor that isn't even mentioned in the report, namely that the samples and tests themselves vary somewhat. It's a shame the reporter didn't take and submit separate samples to the same labs (perhaps under pseudonyms) to test their repeatability and inherent quality.

The final comments in the NY Times are right on the mark. Instead of spending a couple of hundred dollars on these tests, buy a decent set of bathroom scales and assess the more significant health risks yourself! While I have a lot of respect for those who develop sophisticated information security risk models and systems, I'm inclined to say much the same thing. An experienced infosec or IT audit pro can often spot an organization's significant risk factors a mile off, without the painstaking risk analysis. 

03 January 2014

SMotW #86: info asset inventory integrity

Security Metric of the Week #86: integrity of the information asset inventory

As a general rule, if you are supposed to be securing or protecting something, it's quite useful to know at least roughly what that 'something' is ...

Compiling a decent list, inventory or database of information assets turns out to be quite a lot harder than one might think.  Most organizations made a stab at this for Y2K, but enormous though it was, that effort was very much focused on IT systems and, to some extent, computer data, while other forms of information (such as "knowledge") were largely ignored. 

Did your organization even maintain its Y2K database?  Hardly any did.

If we were able to assess, measure and report the completeness, accuracy and currency of the information asset inventory, we could provide some assurance that the inventory was being well managed and maintained - or at least that the figures are headed the right way.  

How would one actually generate the measurements? One way would be to validate a sample of records in the inventory against the corresponding assets, or vice versa (perhaps both).  A cunning plan to validate, say, the next 10% of the entries in the inventory every month would mean that the entire data set would be validated every year or so (allowing for changes during the year, including perhaps the introduction of additional categories of information asset that were not originally included). 
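The rolling validation could be sketched along these lines. The asset identifiers, the set of "stale" records and the function names are all hypothetical; in practice the accuracy check is the expensive manual part, where someone compares the record against the real asset.

```python
def monthly_batch(inventory, month_index, percent=10):
    """Return the records due for validation in a given month (0-based)."""
    if not inventory:
        return []
    # Ceiling division, so a short final batch still covers the tail.
    batch_size = -(-len(inventory) * percent // 100)
    batches_per_cycle = -(-len(inventory) // batch_size)
    start = (month_index % batches_per_cycle) * batch_size
    return inventory[start:start + batch_size]


def integrity_rate(batch, is_accurate):
    """Percentage of the sampled records that matched the real asset."""
    if not batch:
        return 0.0
    return 100.0 * sum(1 for record in batch if is_accurate(record)) / len(batch)


# Hypothetical inventory of 30 records, two of which are out of date.
inventory = [f"asset-{n:03d}" for n in range(30)]
stale = {"asset-001", "asset-004"}

batch = monthly_batch(inventory, 0)
rate = integrity_rate(batch, lambda record: record not in stale)
print(f"month 1: checked {len(batch)} records, {rate:.0f}% accurate")
```

At 10% a month, the cycle wraps around after ten batches, so each record is revisited a little more often than once a year. Reporting the monthly rate as a trend would show whether the figures are headed the right way.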


ACME management were quite interested in this metric, if a little concerned at the Accuracy, Timeliness and Integrity of the metric (ironic really!).  Having calculated the metric's PRAGMATIC score, they decided to put this one on the pending pile to revisit later.

The CISO was more confident than his peers that his people would compile the metric properly, and he toyed with the idea of either using the metric for his own purposes, or perhaps proposing a compromise: Internal Audit might be commissioned to sample and test the inventory on a totally independent basis, comparing their findings against those from Information Security to prove whether Information Security could be trusted to report this and indeed other security metrics.