Opened 15 years ago

Closed 15 years ago

#2685 enhancement closed fixed (fixed)

factor secure random implementation

Reported by: Antoine Pitrou Owned by:
Priority: highest Milestone:
Component: core Keywords:
Cc: therve, itamarst, z3p, Jean-Paul Calderone Branch:
Author:

Description

Several places are using their own secure random implementation - relying on PyCrypto with a fallback on non-secure random: t.names.dns, t.conch.ssh.common. Also the branch in #2015 has such an implementation.

We should centralize a secure random implementation which can be used by any part of twisted, and applications as well. It will rely on PyCrypto but be able to fallback on os.urandom and then on non-secure random methods.

Attachments (1)

secure-random.patch (7.8 KB) - added by Antoine Pitrou 15 years ago.

Download all attachments as: .zip

Change History (45)

Changed 15 years ago by Antoine Pitrou

Attachment: secure-random.patch added

comment:1 Changed 15 years ago by Antoine Pitrou

Attached patch adds a twisted.python.secrandom module with associated tests, and converts both t.names.dns and t.conch.ssh.common to use the new module.

comment:2 Changed 15 years ago by therve

Owner: changed from Glyph to therve

comment:3 Changed 15 years ago by therve

Cc: therve added
Keywords: review added
Owner: therve deleted
Priority: normalhighest

Dear buildbot is green After a commit and some cleanups Please review me at secure-random-2685

comment:4 Changed 15 years ago by itamarst

For one thing, conch.common should use secrandom API directly.

comment:5 Changed 15 years ago by therve

Keywords: review removed
Owner: set to therve

OK, but we have to keep common.entropy for backward compatibility.

comment:6 Changed 15 years ago by therve

Keywords: review added
Owner: therve deleted

I used secrandom in conch, and add a deprecation warning for entropy.

comment:7 Changed 15 years ago by itamarst

Cc: itamarst added

More comments:

  1. You probably want os.urandom by default, not pycrypto, which is I think more determinisitic (but check if I'm correct).
  1. ckeygen really ought to use /dev/random if at all possible.

comment:8 Changed 15 years ago by itamarst

Cc: z3p added

comment:9 Changed 15 years ago by itamarst

Any comments, Paul?

comment:10 in reply to:  7 Changed 15 years ago by Antoine Pitrou

Replying to itamarst:

More comments:

  1. You probably want os.urandom by default, not pycrypto, which is I think more determinisitic (but check if I'm correct).

PyCrypto is different. It initializes itself with /dev/urandom at start (if possible), and then regularly disturbs (using timestamps) and scrambles (using hashes) its internal entropy pool so as to make prediction of values impossible. I'd say it looks at least as robust as os.urandom, because of the scrambling thing (PyCrypto claims to use the same algorithm as PGP). Also, it is possible to call randomPool.randomize() periodically so as to force reinitializing the pool using /dev/urandom.

  1. ckeygen really ought to use /dev/random if at all possible.

I don't have any opinion on this one :-)

comment:11 Changed 15 years ago by z3p

I don't like that it's called secureRandom if it will fall back nearly silently to non-random data. I'd be okay if it had a mode where it failed to work if cryptographically secure randomness isn't available. Maybe twisted.python.random, with a secureRandom and random methods, where secureRandom would fail and random would fallback?

I'm also okay with ckeygen using PyCrypto, for the same reasons as antoine. I just don't want it silently failing to actually generate random keys :)

comment:12 Changed 15 years ago by jknight

It seems to me that it's quite unlikely that pycrypto's randomness would be of higher quality than /dev/urandom, given that the later has many more possible entropy inputs. The only reason to use PyCrypto AFAICT is if your OS is deficient in providing a source of randomness.

I don't know how os.urandom is implemented in Windows, of course.

As I understand it, long-term key generation should be done using /dev/random, and session keys and generic cryptographic random number needs should be fulfilled with /dev/urandom.

comment:13 Changed 15 years ago by therve

Please note that the original goal of the ticket is to factor existing work into a unique place. This module actually does the same thing as before, with some improvements I guess.

pycrypto pool is for now the best default as it works with Python 2.3, where os.urandom is not present. We can't use /dev/random because AFAIK reading from it can block. However we could add /dev/urandom reading for Python2.3

The only thing where I agree is whether we should have a fallback. I'm tempted to remove it, as it's only for 2.3 without pycrypto installed, and we don't even have a build slave like that :).

comment:14 Changed 15 years ago by itamarst

  1. Use os.urandom by default, fall back to Pycrypto.
  1. /dev/random should be used for key *generation*, ckeygen, by default, and it's fine there since you can just ask user to generate entropy (that's how ssh-keygen works I'm sure.)

comment:15 Changed 15 years ago by Antoine Pitrou

Hi,

I don't agree that os.urandom is better than Pycrypto. If you want the best non-blocking security, it should actually be better to call randomPool.randomize() in secureRandom(), because not only does randomize() get new entropy from /dev/urandom (or the Windows equivalent), but it also scrambles it which should make it more secure. /dev/urandom cannot be trusted to be cryptographically secure at all times.

However, as therve said, the point of the patch is a simple refactoring anyway. Therefore, the patch doesn't break the current behaviour that Pycrypto is preferred if available, it just adds a fallback on os.urandom which is likely better than Python's builtin random module.

comment:16 Changed 15 years ago by jknight

os.urandom of course also mixes its pool. There is no reason to believe that, if no real randomness is available, the deterministic scrambling of Pycrypto is somehow better than the deterministic scrambling of the kernel.

comment:17 Changed 15 years ago by therve

I set the default to os.urandom, and followed Paul's advices. The last problem remaining is the warning printed, but I really like #2626 for cleaning that.

comment:18 Changed 15 years ago by Antoine Pitrou

I don't understand why you renamed secrandom to random. There is now an ugly import dance at the beginning of t.p.random in order to import the builtin random (and every other module in twisted.python will have to do the same). Surely there must be another name which doesn't conflict with a builtin module (e.g. "randbytes") ;-)

The other changes are ok with me.

comment:19 Changed 15 years ago by therve

I renamed to random because it doesn't contain only a secure implementation now, but we can probably find a better name. I'll go with randbytes if there is no other proposition.

comment:20 Changed 15 years ago by Glyph

I don't know if this is actually relevant to the discussion, but the repeated declaration that /dev/random is "blocking" seems slightly misleading. If you open('/dev/random') and then do twisted.internet.fdesc.setNonBlocking on the result, you get a source of randomness which can be added to the reactor and read as desired.

That might still not be fast enough, or you might not want to have to require someone to jiggle the mouse on your rackmounted server in order for some Deferred crypto-key-generation function to complete, but it is a possibility for applications that actually desire it.

This, coupled with the performance issues with urandom discussed here, suggests to me that generation of randomness should not be regarded as a synchronous operation, regardless of its performance.

comment:21 Changed 15 years ago by therve

To clarify my points against /dev/random:

  • it's not portable. We'd have to find a solution for win32, and the behavior on other platforms can differ
  • it's dangerous: if presented as the 'super secure solution', users would be tempted to use it everywhere, and encounter the problem you mentioned ("my request is waiting forever")
  • it's hard to test, for all the reasons above
  • last but not least, it's out of scope for this ticket

Your point about making secrandom asynchronous may be relevant. I don't think urandom is so slow that's required, but who knows. Something with deferToThread may do the trick, but it will have a strong impact on all the uses within twisted.

comment:22 in reply to:  20 ; Changed 15 years ago by Antoine Pitrou

Replying to glyph:

This, coupled with the performance issues with urandom discussed here, suggests to me that generation of randomness should not be regarded as a synchronous operation, regardless of its performance.

Well, let's put it more precisely: os.urandom is not annoyingly slow. It just can be an order a magnitude slower than the purely algorithmic functions in the builtin random module. For example (Mandriva system with Python 2.5.1):

>>> timeit.Timer("random.getrandbits(1024*8)", "import random").timeit(1)
4.100799560546875e-05
>>> timeit.Timer("os.urandom(1024)", "import os").timeit(1)
0.00067996978759765625

As you see, even if you need 1024 bytes of randomness at once, os.urandom will take less than 1 millisecond.

On the other hand, yes, /dev/random could be queried with an asynchronous mechanism (perhaps with an optional timeout which would fallback on /dev/urandom for the remaining entropy bytes if not enough were collected). But I think that's beyond the scope of this ticket.

Also, there are situations where you want fast generation of randomness immune to attacks (sequence numbers, unique tags... in generic-sasl-2015, we ask 20 bytes of randomness each time a digest nonce or cnonce needs to be created). In those situations waiting even one second for /dev/random to come back does not seem appropriate.

comment:23 in reply to:  22 ; Changed 15 years ago by jknight

Replying to antoine:

Replying to glyph:

This, coupled with the performance issues with urandom discussed here, suggests to me that generation of randomness should not be regarded as a synchronous operation, regardless of its performance.

Well, let's put it more precisely: os.urandom is not annoyingly slow. It just can be an order a magnitude slower than the purely algorithmic functions in the builtin random module. For example (Mandriva system with Python 2.5.1):

Where were performance issues discussed with urandom? I didn't see any discussion of performance issues before glyph's comment. urandom only performs slightly worse than reading a disk, and you shouldn't ever be reading so much random data that this actually matters. Plus, it uses 100% CPU while working.

On the other hand, yes, /dev/random could be queried with an asynchronous mechanism (perhaps with an optional timeout which would fallback on /dev/urandom for the remaining entropy bytes if not enough were collected). But I think that's beyond the scope of this ticket.

Not only is it beyond the scope of this ticket, the parenthetical comment is a really bad idea. If you don't need /dev/random (normal case), you just shouldn't use it at all. If you do need it, you don't want to get potentially not random data mixed in just because /dev/random took "too long".

Also, there are situations where you want fast generation of randomness immune to attacks (sequence numbers, unique tags... in generic-sasl-2015, we ask 20 bytes of randomness each time a digest nonce or cnonce needs to be created). In those situations waiting even one second for /dev/random to come back does not seem appropriate.

Yes, /dev/random should never be used for this kind of application. It should only be used in specialized cases, like where you're generating a long-term secure key (e.g. ssh-keygen).

Given that, we probably don't even need to bother implementing support for it at all in a generic fashion, never mind as part of this ticket (where it *clearly* isn't appropriate).

comment:24 in reply to:  23 Changed 15 years ago by Glyph

Replying to jknight:

Replying to antoine:

Replying to glyph:

This, coupled with the performance issues with urandom discussed here, suggests to me that generation of randomness should not be regarded as a synchronous operation, regardless of its performance.

Where were performance issues discussed with urandom? I didn't see any discussion of performance issues before glyph's comment. urandom only performs slightly worse than reading a disk, and you shouldn't ever be reading so much random data that this actually matters. Plus, it uses 100% CPU while working.

Sorry; "discussed here" was a brain fart. There's a comment in the patch, not a discussion on the ticket.

On the other hand, yes, /dev/random could be queried with an asynchronous mechanism (perhaps with an optional timeout which would fallback on /dev/urandom for the remaining entropy bytes if not enough were collected). But I think that's beyond the scope of this ticket.

Not only is it beyond the scope of this ticket, the parenthetical comment is a really bad idea. If you don't need /dev/random (normal case), you just shouldn't use it at all. If you do need it, you don't want to get potentially not random data mixed in just because /dev/random took "too long".

I agree completely.

The only reason I even brought this up is that the selection of a source of randomness for a middle range of tasks is a configuration decision for an administrator, not the application author. That means it should (in those cases) be treated as Deferred, even if it uses the faster path by default. That is definitely out of scope to require for this ticket, but the two comments about "blocking" made it seem as though it was being discarded from Twisted entirely.

Also, there are situations where you want fast generation of randomness immune to attacks (sequence numbers, unique tags... in generic-sasl-2015, we ask 20 bytes of randomness each time a digest nonce or cnonce needs to be created). In those situations waiting even one second for /dev/random to come back does not seem appropriate.

Yes, /dev/random should never be used for this kind of application. It should only be used in specialized cases, like where you're generating a long-term secure key (e.g. ssh-keygen).

Given that, we probably don't even need to bother implementing support for it at all in a generic fashion, never mind as part of this ticket (where it *clearly* isn't appropriate).

Again, complete agreement. I want to be very clear that my comment should not in any way be construed as delaying a merge or repurposing this ticket.

comment:25 Changed 15 years ago by therve

I merged forward to secure-random-2685-2 to use the new assertWarns method, and rename the module to randbytes.

comment:26 Changed 15 years ago by Jean-Paul Calderone

For posterity, though this point seems to have been correctly dropped from relevant conversation about this ticket (emphasis mine):

A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a crypto‐graphic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.

Removing theoretical vulnerabilities is certainly valuable and desirable, but let's not make the mistake of thinking that all we have to do is use /dev/random in order to make portions of Twisted bulletproof against attack (or even removes the most obvious vulnerability).

http://www.faqs.org/rfcs/rfc1750.html is worth reading for anyone who wants to contribute seriously to decisions about security-sensitive entropy.

comment:27 Changed 15 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: set to therve

Alright, a review:

  • Doesn't look like there's a test for the deprecation emitted by Entropy.get_bytes.
  • Since there's a branch outstanding that makes a lot of changes to keys.py, it probably would be better to avoid doing coding standard cleanups over the whole module. If you want to leave these changes in this branch, that's fine (probably won't take any longer to resolve any conflicts that might come up than it will take to remove the changes), but this is something to keep in mind for future branches.
  • Hmm. Same comment for transport.py. On the other hand, maybe resolving the conflicts will take longer. ;)
  • randomSource in dns.py should have a docstring
  • in randbytes.py, secureRandom should probably be a method on a class so that it doesn't need to munge a global. There can still be a module-level secureRandom, but the tests should mostly focus on testing an instance of this class they create, rather than anything global. This will make the tests more deterministic in what they're testing since it avoids them sharing global state with each other and, more importantly, with any other tests which happen to use randbytes.py. This might simplify some of the tests as well, since global state won't have to be monkey-patched and then restored.
  • I guess the assertWarns usage will be broken when the other assertWarns branch is merged. Something to keep an eye out for.
  • Should the stacklevel to the warning really be 1? Why not the traditional 2 to try to put it at the caller of secureRandom()?
  • _check wouldn't need to be defined twice if it took the random function to call as an argument.

Not sure this is a complete review, it'd probably be good to have another set of eyes on this code after the above comments are addressed.

comment:28 Changed 15 years ago by therve

Keywords: review added
Owner: therve deleted

I think I've addressed all your comments in secure-random-2685-3.

comment:29 in reply to:  28 Changed 15 years ago by Antoine Pitrou

Replying to therve:

I think I've addressed all your comments in secure-random-2685-3.

It's fine with me.

comment:30 Changed 15 years ago by radix

Keywords: review removed
Owner: set to z3p

[1] The removal of the reactor.stop call from ckeygen looks unrelated. Can you either explain this or move it to a different branch?

[2] class SecureRandomWithoutUrandomTestCase has a typo in the docstring: it should be os.urandom not os.random.

[3] I suggest a new strategy for implementation and testing of the randbytes module: Give each random source a simple callable interface which either returns some random bytes or raises a consistent error. Test each of these independently. Create an object which takes a list of these random sources and runs through them until it finds one that works. Test this independently. It will probably need to special case the insecure random generator, which should be parameterized, in order to know when to write the warning. This will allow you to get rid of all the confusing fallback-case tests.

[4] I suggest removing the confusing "import random" plus the "random = factory.random". Itamar suggests that the exposed function should be called "insecureRandom" instead of "random".

[5] RandomFactory.random is really weird. Why are you using getrandbits instead of returning a string made of a bunch of chr(random.randrange(0, 255))? If there *is* a good reason for using getrandbits, why are you hexing the result? This is crazy dude.

[6] I also suggest naming RandomFactory.random to RandomFactory.insecureRandom.

comment:31 in reply to:  30 Changed 15 years ago by therve

Keywords: review added
Owner: changed from z3p to radix

Replying to radix:

[1] The removal of the reactor.stop call from ckeygen looks unrelated. Can you either explain this or move it to a different branch?

I moved it.

[2] class SecureRandomWithoutUrandomTestCase has a typo in the docstring: it should be os.urandom not os.random.

Done.

[3] I suggest a new strategy for implementation and testing of the randbytes module: Give each random source a simple callable interface which either returns some random bytes or raises a consistent error. Test each of these independently. Create an object which takes a list of these random sources and runs through them until it finds one that works. Test this independently. It will probably need to special case the insecure random generator, which should be parameterized, in order to know when to write the warning. This will allow you to get rid of all the confusing fallback-case tests.

OK I think it's better like this.

[4] I suggest removing the confusing "import random" plus the "random = factory.random". Itamar suggests that the exposed function should be called "insecureRandom" instead of "random".

Done.

[5] RandomFactory.random is really weird. Why are you using getrandbits instead of returning a string made of a bunch of chr(random.randrange(0, 255))? If there *is* a good reason for using getrandbits, why are you hexing the result? This is crazy dude.

getrandbits is much faster than randrange. I clean up the randrange case though.

[6] I also suggest naming RandomFactory.random to RandomFactory.insecureRandom.

Done.

comment:32 in reply to:  30 Changed 15 years ago by Antoine Pitrou

Replying to radix:

[3] I suggest a new strategy for implementation and testing of the randbytes module:[...]

Why not, but let's not over-engineer it... It's just an utility module. therve's latest patch is fine to me (once again :-)), although the list of secure/insecure random methods should probably be a class attribute of the factory rather than hard-coded into the corresponding methods.

comment:33 Changed 15 years ago by Jean-Paul Calderone

Cc: Jean-Paul Calderone added
Keywords: review removed
Owner: changed from radix to therve

I adjusted a few docstrings. I should probably do a few more, but I got tangled up in trying to re-arrange the tests, too. I don't think there's enough direct coverage of the various underlying random sources, particularly the available/not-available codepaths (there are a lot of reasons things can be unavailable, and the tests are a little to cavalier in their treatment of some of these). I might come back to this later, but if someone else wants to tackle it that's cool too.

comment:34 in reply to:  33 Changed 15 years ago by therve

Replying to exarkun:

I don't think there's enough direct coverage of the various underlying random sources, particularly the available/not-available codepaths (there are a lot of reasons things can be unavailable, and the tests are a little to cavalier in their treatment of some of these).

I don't understand: are you talking of the unit tests, or the tests done in randbytes itself to check if a source succeeds?

comment:35 Changed 15 years ago by Jean-Paul Calderone

I meant the unit tests. randbytes.py seems to be doing reasonable things.

An example of the kind of thing in the unit tests which makes me uneasy: test_osUrandom and test_fileUrandom catch randbytes.SourceNotAvailable and assume that means the platform is insufficient for the test and so it is supposed to fail that way. This could just mean that the function being tested is buggy, though. test_cryptoRandom is a little better, since it checks that the failure condition it thinks is being triggered actually did get triggered, but it relies on an implementation detail of the randbytes module so it has its own problem :)

I guess I am expressing an idea similar to the one radix expressed earlier (somehow I read antoine's comment about not over-engineering, but then forgot about it while I was doing the review, and was just reminded of it now). I think it's reasonable to take the time to simplify the implementation here, which will result in the tests being simplified, which should help me feel confident that all the available/unavailable cases are being covered and, just as importantly, that whatever maintenance is done to this module in the future won't accidentally introduce a new combination which has no test coverage.

comment:36 Changed 15 years ago by therve

(In [21178]) Add explicit checks on failures in tests.

Refs #2685.

comment:37 Changed 15 years ago by therve

Keywords: review added
Owner: therve deleted

I've tried to address the issues you mentioned above. I begin to be a bit lost about the requirements for this module, so I hope it's going in the right direction.

comment:38 Changed 15 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: set to therve

__all__ in randbytes still refers to random. Also, it's probably okay to drop the blank lines between the top-level statements setting up secureRandom and insecureRandom. They're closer to the lines in a function than to unrelated top-level suites like class and function definitions, so the intent of the coding standard (if anyone ever updates it ;) isn't to require two blank lines between each line.

The tests are looking better. Now that I examine the cases in more detail though, I notice that SecureRandomTestCase.test_normal is the same test as ConditionalSecureRandomTestCase.test_normal. Am I missing something or is one of these simply redundant?

I also noticed that I left off a closing parenthesis in the docstring for _check, that should be fixed.

I think there's still some room for improvement of the factoring, but if you're happy with it in its current state I think it's fine to merge, modulo the above mentioned issues (I expect there are more important things you could put your efforts towards).

comment:39 Changed 15 years ago by therve

(In [21257]) Address review comments.

Refs #2685

comment:40 Changed 15 years ago by therve

(In [21258]) Add explicit messages to exceptions.

Refs #2685

comment:41 Changed 15 years ago by therve

Keywords: review added
Owner: therve deleted

This is ready to review in secure-random-2685-3.

I think the current behavior is good, and don't really see how to improve the code itself. I added some note in the class to allow us to do it later, I hope it's OK.

comment:42 Changed 15 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: set to therve

A couple lines in randbytes.py have trailing whitespace. Everything else looks good now, so merge after fixing whitespace. Thanks.

comment:43 Changed 15 years ago by therve

Resolution: fixed
Status: newclosed

(In [21376]) Merge secure-random-2685-3

Authors: antoine, therve Reviewers: exarkun, radix Fixes #2685 Refs #2015

Centralize a secure random implementation in twisted.python.randbytes (relying on different backends, like os.urandom of PyCrypto randpool), to produce random bytes in a secure fashion. Also, replace custom work in twisted.names.dns and twisted.ssh.common.

comment:44 Changed 11 years ago by <automation>

Owner: therve deleted
Note: See TracTickets for help on using tickets.