CVE ID 10k Problem Feedback

nCircle would like to push for Option 7. [Special thanks to my engineers Ian Turner and Bob Thomas for feedback and review.]

I don’t think it’s possible to change the format in any way without breaking someone’s implementation somewhere. “CVE-\d{4}-\d{4}” might match, but “^CVE-\d{4}-\d{4}$” might not – or “/CVE-(?P<year>\d{4})-(?P<id>\d{4})/$”, etc.

Overall, I think either Option 1 or Option 7 is the best. I think they match up against the considerations thusly:
- Both have a large ID space.
- Both are easily recognizable as CVEs.
- Option 1 is probably more easily adopted than Option 7 as Option 7 has an additional field and the validation aspect.
- Option 1 is probably simpler to adopt in terms of updating software. Option 1 is likely already matched by many existing regular expressions in use for locating CVEs. Option 7 can be matched by a regular expression but the validation would need to occur separately.
- Option 1 delays impact longer than Option 7.
- Option 7 is immediately recognized as the new syntax. However, Option 1 is also easily recognized as the new syntax once it matters that the syntax is different.
- As long as the number of digits is not limited both appear to be fairly future-proof.
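
The regular-expression points in the list above can be made concrete with a small sketch. The two "old" patterns are the ones quoted earlier in this message. The `OPTION_1` and `OPTION_7` patterns and the `classify` helper are hypothetical illustrations, not part of any proposal text, and they assume at least four digits in the sequence field.

```python
import re

# Patterns of the kind quoted earlier for the current four-digit syntax.
OLD_UNANCHORED = re.compile(r"CVE-\d{4}-\d{4}")
OLD_ANCHORED = re.compile(r"^CVE-\d{4}-\d{4}$")

# Hypothetical patterns for the two proposals (assumed, not from the proposal).
OPTION_1 = re.compile(r"^CVE-(\d{4})-(\d{4,})$")        # year + 4 or more digits
OPTION_7 = re.compile(r"^CVE-(\d{4})-(\d{4,})-(\d)$")   # ... plus one check digit


def classify(cve_id: str) -> str:
    """Guess which syntax an ID uses, based only on its shape."""
    if OPTION_7.match(cve_id):
        return "option-7"
    m = OPTION_1.match(cve_id)
    if m:
        # A four-digit sequence looks identical under the old syntax and Option 1.
        return "current-or-option-1" if len(m.group(2)) == 4 else "option-1"
    return "unrecognized"


if __name__ == "__main__":
    for cve_id in ("CVE-2013-0056", "CVE-2013-12345", "CVE-2013-12345-6"):
        print(cve_id, "->", classify(cve_id))
    # The unanchored old pattern still "matches" a five-digit ID, but only
    # by silently dropping the last digit.
    print(OLD_UNANCHORED.search("CVE-2013-12345").group(0))
```

Under these assumptions, the anchored old pattern rejects a five-digit ID outright, while the unanchored one truncates it, which is the quieter of the two failure modes.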

Option 7 below looks the best to me. I would expect most implementations to ignore the check digit, but it’s nice to have, and makes it easy to identify the newer format. It could probably be improved by padding fewer than 4 digits with 0s, like suggested for Option 1 (for compatibility). My concern with something like Option 1 is that nobody will notice that there is a new standard until CVE-2013-10000 is released, and a bunch of implementations suddenly break.

Options 5 and 6 are probably the worst options. They’ll break existing implementations in a bad way (anything trying to parse ints). Option 8 loses the year completely – I think a lot of people would take issue with that. Also, breaks most existing implementations.

More Details:

Option 1: Year + arbitrary digits, no leading 0's
-------------------------------------------------

Examples: CVE-2013-1234, CVE-2013-12345
(if 4 digits or less, leading 0's would be used, e.g. CVE-2013-0056 instead of CVE-2013-56)

nCircle Feedback: I think this is the logical extension of the current ID format. It seems obvious to me that CVE-2013-10000 would be the next ID issued after CVE-2013-9999. Since the fields are dash-delimited the ID space is basically unlimited. The only downside appears to be that standard alphanumeric sorting doesn't place 10000 after 9999 but this is an issue that comes up many times in software engineering and I consider it to be a basically solved problem. I don't think this format should be rejected just because of lazy coders.

Option 2: Year + 5 digits, leading 0's
--------------------------------------

Examples: CVE-2013-01234, CVE-2013-56789

nCircle Feedback: I don't like specifying the maximum number of ID digits because it's limiting and just because 1,000,000 seems like a big number for identified vulnerabilities now doesn't mean it will remain a big number in the future. I also don't care for leading zeros, which these two options require.

Option 3: Year + 6 digits, leading 0's
--------------------------------------

Example: CVE-2013-012345, CVE-2013-678901

nCircle Feedback: Options 2 and 3 don’t seem very useful. They will immediately break some existing implementations, and just set another arbitrary limit.

Option 4: Non-standard year + 4 digits
--------------------------------------

Example: CVE-1013-1234, CVE-3013-1234
(instead of using the current year, a year before 1999 or after 3000 could be used once 10,000 is reached in a single year)

nCircle Feedback: I find that having the year in the ID is extremely helpful and since this scheme obfuscates the year once 10,000 bugs are reached I don't care for it. This also breaks standard alphanumeric sorting because CVE-3013-0001 will end up after CVE-2014-0001. Sorting can be fixed but this is a less standard sorting problem than the one in Option 1. Option 4 would break sorting, and just be very confusing in general. The “year” part effectively becomes useless.

Option 5: year + 4 hex digits
-----------------------------

Example: CVE-2013-A9E4

nCircle Feedback: I think this is an ok option but as long as the format is going to change I don't think the number of digits should be limited. Since the last part is still a numerical value it's easier to sort than Option 6.

Option 6: year + 4 alphanumeric
-------------------------------

Example: CVE-2013-ZW1K

nCircle Feedback: I don't care for this option because it moves away from numerical values in commonly used bases. I guess the value could be treated as base 36 but I'd rather go with Option 5 than Option 6, especially if the number of digits wasn't limited to 4.

Option 7: CCE-Style (year + arbitrary digits + check digit)
-----------------------------------------------------------

Example: CVE-2013-12345-6
(the "6" is a check digit, following the Luhn Check Digit Algorithm)

nCircle Feedback: I think this is the best option. I like that the new format can be immediately recognized because it contains three dashes instead of two. As with Option 1, the sorting will need to be handled to make sure that adjacent CVEs appear next to each other in lists but as I've said before that should not be a problem.

Option 8: CERT-VU/JVN Style
---------------------------

Example: CVE#12345678

nCircle Feedback: I think this format is unacceptable because the year is omitted. As I said before, I find having the year in the ID to be extremely useful.

--
Tim "TK" Keanini, Chief Research Officer … nCircle Inc. … mbl (415) 328-2722 … Blog: patterns.ncircle.com Twitter: @tkeanini

From: owner-cve-editorial-board-list@LISTS.MITRE.ORG [mailto:owner-cve-editorial-board-list@LISTS.MITRE.ORG]
On Behalf Of Christey, Steven M.

All,

As discussed in the Editorial Board teleconference on October 31, it is time to update the CVE ID syntax so that CVE can support more than 10,000 identifiers in a single year (the "CVE-10K Problem").

Some discussion was held on the Editorial Board list back in 2007, but no official decision was reached or implemented (see http://cve.mitre.org/data/board/archives/2007-01/threads.html).

However, it is now time to resolve this issue. It sometimes causes distractions in the larger Global Vulnerability Reporting (GVR) discussions, and it is possible that MITRE could generate more than 10,000 identifiers in 2013 due to additional staff and recent process/infrastructure improvements. It will take many consumers 6 months or more to fully adopt any new syntax, so sufficient warning is required, and we will need to develop a transition plan for the new syntax. We will iron out the details later.

Below are several options for ID syntaxes. Please respond to the list with your opinions about:

* suggestions for a different syntax?
* which syntax would be best for consumers?
* which syntax would be easiest/cheapest to adopt?

We are not holding any formal vote at this time.

Considerations for a Good Syntax
--------------------------------

For a new CVE ID syntax to be "good," it should have most (or all) of the following properties:

1) Large ID space, i.e., a large number of potential IDs that could be assigned.

2) Usability by consumers - such as the ability to recognize that the ID is a CVE number, and to reduce confusion.

3) Ease of adoption by consumers.

4) Low maintenance/adoption costs for both consumers and providers. For example, some ID schemes could reduce memory/disk overhead, or it could be easy to extend regular expressions that are currently used to detect or manage CVE IDs.

5) Delayed impact of the syntax change - since any change will have many unexpected effects on downstream consumers, we want to give people as much time to adjust to the new syntax as possible. So, it may be favorable to use a syntax that doesn't appear to change until 10,000 identifiers are needed.

6) Syntax version recognition - if possible, it should be clear to the consumer or an automated system as to which syntax version is being used for an ID - the old syntax, or the new syntax. For example, ISBN numbers were originally 10 digits long, then expanded to 13 digits - so the length of the ISBN clarifies which version is being used.

7) Future-proofing - if possible, the ID scheme could be flexible enough that future requirements do not force additional changes.

Option 1: Year + arbitrary digits, no leading 0's
-------------------------------------------------

Examples: CVE-2013-1234, CVE-2013-12345
(if 4 digits or less, leading 0's would be used, e.g. CVE-2013-0056 instead of CVE-2013-56)

Impact: delayed until MITRE produces 10K IDs in a year
Number of IDs: 1,000,000 per year (assuming 6 digits)
Note: alphabetic sorting would not list CVE-2013-9999 and CVE-2013-10000 next to each other.

Option 2: Year + 5 digits, leading 0's
--------------------------------------

Examples: CVE-2013-01234, CVE-2013-56789

Impact: immediate upon adoption (many existing parsers might misinterpret "CVE-2013-01234" as "CVE-2013-0123" by assuming only 4 digits).
Number of IDs: 100,000 per year
Notes: this might not have extensive impact for some adopters, e.g. some liberal regular expressions such as (CVE-\d+-\d+) would work for both old and new syntax.

Option 3: Year + 6 digits, leading 0's
--------------------------------------

Examples: CVE-2013-012345, CVE-2013-678901

Impact: immediate upon adoption (many existing parsers might misinterpret "CVE-2013-012345" as "CVE-2013-0123" by assuming only 4 digits).
Number of IDs: 1,000,000 per year
Notes: this might not have extensive impact for some adopters. For example, some liberal regular expressions such as (CVE-\d+-\d+) would work for both old and new syntax.

Option 4: Non-standard year + 4 digits
--------------------------------------

Examples: CVE-1013-1234, CVE-3013-1234
(instead of using the current year, a year before 1999 or after 3000 could be used once 10,000 is reached in a single year)

Impact: delayed
Number of IDs: 100,000,000 overall
Notes: could cause user confusion (strange-looking IDs that are assumed to be typos), but might have reduced adoption costs (many current regular expressions would still work, ID is the same length as the original syntax).

Option 5: year + 4 hex digits
-----------------------------

Example: CVE-2013-A9E4

Impact: delayed
Number of IDs: 65,535 per year
Notes: could cause user confusion (strange-looking IDs that are assumed to be typos), but might have reduced adoption costs (length is the same), although it would break regular expressions that assume only digits.

Option 6: year + 4 alphanumeric
-------------------------------

Example: CVE-2013-ZW1K

Impact: delayed
Number of IDs: 1.7 million per year
Notes: could cause user confusion (strange-looking IDs that are assumed to be typos), but might have reduced adoption costs (length is the same), although it would break regular expressions that assume only digits.

Option 7: CCE-Style (year + arbitrary digits + check digit)
-----------------------------------------------------------

Example: CVE-2013-12345-6
(the "6" is a check digit, following the Luhn Check Digit Algorithm)

Impact: immediate
Number of IDs: Effectively unlimited
Notes: the check digit provides an easy way to automatically spot typos in IDs, which happens in CVE multiple times per year (for example, sometimes vendors release advisories with typos in the CVE IDs). The number of digits between the year and the check digit is effectively unlimited, since the boundaries are marked by hyphens. There may be user confusion, and people might leave out the check digit entirely.

Option 8: CERT-VU/JVN Style
---------------------------

Example: CVE#12345678

Impact: immediate
Number of IDs: 100,000,000 overall
Notes: this ID does not encode the year, which many consumers like, since it is an approximation of when the issue became public.
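
For readers unfamiliar with the Luhn check digit named in Option 7, here is a minimal sketch of how validation might look. The message does not say which digits the check digit covers. This sketch assumes it is computed over the year and sequence digits concatenated, an assumption under which the example ID CVE-2013-12345-6 happens to validate. The function names and the regular expression are hypothetical.

```python
import re

# Hypothetical shape for an Option 7 ID: year, 4+ sequence digits, check digit.
OPTION_7 = re.compile(r"CVE-([0-9]{4})-([0-9]{4,})-([0-9])")


def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a non-empty string of decimal digits."""
    if not re.fullmatch(r"[0-9]+", payload):
        raise ValueError("payload must contain only decimal digits")
    total = 0
    # Walk right to left, doubling every second digit starting with the
    # rightmost one (the digit that will sit next to the check digit).
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def is_valid_option7(cve_id: str) -> bool:
    """Check shape and check digit, assuming the digit covers year + sequence."""
    match = OPTION_7.fullmatch(cve_id)
    if match is None:
        return False
    year, sequence, check = match.groups()
    return luhn_check_digit(year + sequence) == int(check)


if __name__ == "__main__":
    print(is_valid_option7("CVE-2013-12345-6"))  # the example from Option 7
    print(is_valid_option7("CVE-2013-12354-6"))  # two adjacent digits swapped
```

A consumer that ignores the check digit can still read the year and sequence fields with the same pattern. The check only adds value for tools that choose to verify it, for instance to catch a transposed pair of digits in an advisory.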