Opened 3 years ago

Closed 2 years ago

#7715 enhancement closed fixed (fixed)

Extend sshsimpleserver.py to implement getPrimes

Reported by: Adi Roiban Owned by: Adi Roiban
Priority: normal Milestone:
Component: core Keywords:
Cc: Branch: branches/sshsimpleserver-getPrimes-7715-2
branch-diff, diff-cov, branch-cov, buildbot
Author: adiroiban, glyph

Description

In order for SSH server to use DH key exchange algorithm described in RFC 4419 it needs to implement getPrimes method.

It would be nice if sshsimpleserver.py example implement a minimal getPrimes method.

In this way, it can also be used for simple functional (end to end) tests during a review. For example to check interaction with outer SSH implementations.


I am also considering creating a new ticket for adding a simple SFTP server ... csftp.py can be used as SFTP client.

What do you think?

Attachments (1)

7715-1.diff (8.7 KB) - added by Adi Roiban 3 years ago.

Download all attachments as: .zip

Change History (22)

comment:1 Changed 3 years ago by Adi Roiban

Keywords: review added
Owner: Adi Roiban deleted

I have attached a first patch.

It adds primes for 1024 and 2048 bits.

I tried to keep code changes to a minimum and just improve docstring and comments.

It also fixed a bug, as with latest twisted.conch.checkers changes it no longer works as it needs to use InMemorySSHKeyDB instead of InMemoryKeyMapping

I tried to add more docstring and explain how the classes are integrated and used.

Please take a look.

Thanks!

Changed 3 years ago by Adi Roiban

Attachment: 7715-1.diff added

comment:2 Changed 3 years ago by Glyph

Author: glyph
Branch: branches/sshsimpleserver-getPrimes-7715

(In [44007]) Branching to sshsimpleserver-getPrimes-7715.

comment:3 Changed 3 years ago by Glyph

Keywords: review removed
Owner: set to Adi Roiban

Thanks for working on this issue, adiroiban.

Tests look good, but I noticed a couple of issues while reviewing.

  1. This leaves a very important question un-answered: how does one generate primes? Are they secret, or not? This problem already applies to the public key / private key pair already statically present in the file, but it's worse with primes. Many users at least have a vague notion that "private key" is private and that the important thing is to keep it private, as well as a rudimentary knowledge of ssh-keygen. The existing example at least shows what to do with the sort of key one would generate with ssh-keygen or ckeygen. In the case of the primes, though, if one were to have a file generated by ssh-keygen -G it's not clear what to do with it. My own understanding of how primes have to be used securely is that they have to be unique although not necessarily secret, which given the very short list here is unlikely to be the case :). Ideally this would give some explanation of generating primes properly, have an example for generating them for this test, and then eventually move the private keys to be generated rather than hard-coded in the example as well (as a separate task).
  2. Some of the docstrings are super unclear to me:
    1. ExampleRealm's docstring: "SSH server requires encapsulating all configuration for an account into an L{avatar.ConchUser} instance." - which SSH server requires that? Is that what ExampleRealm is?
    2. The example's docstring itself - why is it interesting that the server identifies itself using "RSA key only"? Are we talking about extending Twisted, or Conch, or extending the example? If it's "easy" to use DSA host keys, how do I do it?
  3. There are several subject/verb disagreements:
    1. In ExampleFactory's docstring:
      1. L{connection.SSHConnection} handles request for the channel multiplexing service → should be "handles requests" or "handles each request"
      2. L{userauth.SSHUserAuthServer} handlers request for tje user authentication services. → should be "handles requests" or "handles each request" (also "the" not "tje")
  4. Lines are wrapped oddly within docstrings. In ExampleFactory, there is a double-newline to separate two paragraphs, and then there is a newline after "service" even though it's not necessary to wrap there, and it's not starting a new paragraph.

My suggestion would be to move the docstring changes to another branch, address just the prime-generation stuff enough that this example isn't a source of dangerous knowledge encouraging people to manage their moduli incorrectly, and re-submit for review.

If you want to keep the docstring changes, the most important thing is that they need to make it clear that they're documenting the behavior of the example, and not the classes and functions they're present on. One way to help with this would be to make sure that the subjects are more clearly distinguished: which server, which client, which key, etc.

Finally, the bug fixed in here has no test. Given the current state of example code, it doesn't need one, but it sure would be nice if we could start requiring tests for examples. There's some work in this area which you may be interested in.

Thanks again!

comment:4 Changed 3 years ago by Adi Roiban

Keywords: review added
Owner: Adi Roiban deleted

Many thanks for the review.

It would be great to have a test for this example but I think that it might require a lot of work and should be left as a separate ticket. Meanwhile I will try to follow up with #6362 and see if I can help with the merge.

I would like to keep improving the docstrings. If I will not make good progress in this iteration I will remove them. I don't want to remove them as otherwise I will not continue to work on them in a separate dedicated ticket.

I prefer to have separate tickets for documentation about how to generate an SSH key or valid prime numbers. But this is the documentation for Conch SSH not an introduction to SSH security.

I tried to explain how primes are generated.

Please take a look at latest changes.

Thanks!

comment:5 Changed 3 years ago by Moshe Zadka

RFC 4419, Section 1, final paragraph:

"""

The ability to propose new groups will reduce the incentive to use precomputation for more efficient calculation of the discrete logarithm. The server can constantly compute new groups in the background.

"""

Unless examples created by copy-pasta sshsimpleserver become a significant proportion of SSH servers online, it is unlikely to be enough of an incentive for malicious parties to pre-compute tables just for us.

Adi -- do you think it would be useful to give people some of this context in docstrings to the example?

comment:6 in reply to:  5 Changed 3 years ago by Glyph

Replying to moshez:

Unless examples created by copy-pasta sshsimpleserver become a significant proportion of SSH servers online, it is unlikely to be enough of an incentive for malicious parties to pre-compute tables just for us.

The whole point of having this example is to give a good example of how to use pregenerated primes effectively. If we are going to create a bad example that is insecure, on the assumption that nobody will use it, then why bother having an example at all? :)

comment:7 Changed 3 years ago by Moshe Zadka

Let's go over use cases.

  1. Google decides to use sshsimpleserver.py for all of Google, but they don't bother reading the comments and just copy the keys -- in that case, even if they had generated their own keys, they would probably have generated just one (per keysize). Google is a big target -- resourced attackers, if they bother pre-computing tables at all, will pre-compute one for the Googly primes.
  1. Google decides that they're big enough, and need to resist such attacks. They do something super-fancy, like auto-generate and roll primes. Having a pre-computed prime here doesn't hurt them.
  1. Moshe decides to put sshsimpleserver.py, as-is, on his EC2 micro. Unless someone like Google (in case 1) has deployed the default sss.py keys, pre-computing tables for those keys would manage to take down Moshe (and perhaps three other friends who deployed sshsimpleserver.py as-is).

So the security risk seems pretty negligible. OTOH, not having these keys in sshsimpleserver.py means it would be that much harder to test out of the box conch's support for RFC 4419 against third-party clients.

Note that I'm not arguing against linking to the FAQ, mentioning "man moduli(5)" or talking about sshkeygen -G/-T, I'm just arguing about not having defaults in the file on the grounds of security.

comment:8 in reply to:  7 Changed 3 years ago by Glyph

Replying to moshez:

Let's go over use cases.

  1. Google decides to use sshsimpleserver.py for all of Google, but they don't bother reading the comments and just copy the keys -- in that case, even if they had generated their own keys, they would probably have generated just one (per keysize). Google is a big target -- resourced attackers, if they bother pre-computing tables at all, will pre-compute one for the Googly primes.
  1. Google decides that they're big enough, and need to resist such attacks. They do something super-fancy, like auto-generate and roll primes. Having a pre-computed prime here doesn't hurt them.
  1. Moshe decides to put sshsimpleserver.py, as-is, on his EC2 micro. Unless someone like Google (in case 1) has deployed the default sss.py keys, pre-computing tables for those keys would manage to take down Moshe (and perhaps three other friends who deployed sshsimpleserver.py as-is).

You're ignoring two important use-cases:

  1. There are people at a scale between Google and Moshe's EC2 micro. They are also going to have equally-resourced attackers.
  2. Moshe's EC2 micro is a use-case which may be repeated millions of times if a Twisted application with an SSH interface becomes popular to deploy on one's own. So they might add up to a Google without anyone ever realizing they're that big a target.

So the security risk seems pretty negligible. OTOH, not having these keys in sshsimpleserver.py means it would be that much harder to test out of the box conch's support for RFC 4419 against third-party clients.

Given that you've ignored almost all of our users in your case analysis (once you discard insignificant digits, 100% of them are at a scale somewhere between Google and 1 micro), I'm not confident in your assessment of its security impact :).

Note that I'm not arguing against linking to the FAQ, mentioning "man moduli(5)" or talking about sshkeygen -G/-T, I'm just arguing about not having defaults in the file on the grounds of security.

Forget about use-cases for the moment though. The purpose of examples is to show how to do things well. So by doing it slightly more poorly, and using canned pre-generated data rather than pre-generating and storing it on the fly when it is first needed, how are we serving the user better? Why include hard-coded primes rather than including an example of how to generate them? Given that this is example code and not built into Twisted itself, we can literally just run ssh-keygen and blow up if openssh isn't installed.

This is extra important because moduli are an extremely subtle aspect to running an SSH server. They aren't as public as public keys, but they aren't as private as private keys. It's hard to explain to a user exactly how they're supposed to manage them, and if the example code specifically says "don't manage them like this" I feel like we're making trouble for ourselves in a particularly security-sensitive area of the code.

comment:9 Changed 3 years ago by adiroiban1

My initial goal was just to have an example which shows how prime numbers are plugged into the Twisted conch.ssh framework without making it an SSH security lesson and without linking it to the OpenSSL/OpenSSH file format.

generating the whole moduli file on the fly is very slow and I wanted user to be able to run the example even if they don't have the openssl cli in PATH.


I was hoping that by creating a very weak data set will discourage user from using this example in production.

I don't think that generating data on the fly using openssl CLI should be the right thing to do (TM) in production. We might need to update python-cryptography to expose BN_generate_prime binding or provide a helper for generated large random prime numbers.

Even if we have a good example, I think that people implementing SSH should still read RFC 4419 and the other SSH related RFC and not just the RFC fragments included in this example.


I agree that we should not encourage uses to use weak data but this is security related code and users should know better that they should not use copy/paste example codes in production.

I will try to update the example to instruct users to create their own module file and then the example will use twisted.conch.openssh_compat.primes.parseModuliFile to read the newly generated file.

Thanks for your feedback!

comment:10 in reply to:  9 Changed 3 years ago by Glyph

Keywords: review removed
Owner: set to Adi Roiban

Replying to adiroiban1:

My initial goal was just to have an example which shows how prime numbers are plugged into the Twisted conch.ssh framework without making it an SSH security lesson and without linking it to the OpenSSL/OpenSSH file format.

You can generate the primes some other way, but generating them incorrectly turns it into a lesson about how the wrong primes can be plugged in to Twisted in the wrong way.

generating the whole moduli file on the fly is very slow and I wanted user to be able to run the example even if they don't have the openssl cli in PATH.

In principle this is a worthwhile goal, to minimize dependencies on competing software. In practice, we just need our own way to generate primes, because we can't practically use this in production without some way to generate them.

I was hoping that by creating a very weak data set will discourage user from using this example in production.

How would the user know that this is a very weak dataset? It's a giant, magical number that they almost certainly don't know the properties of.

I don't think that generating data on the fly using openssl CLI should be the right thing to do (TM) in production. We might need to update python-cryptography to expose BN_generate_prime binding or provide a helper for generated large random prime numbers.

I agree here in terms of the "real right way" to do things, but the openssl CLI is an acceptable way to do this in production; using a short list of primes like this is not.

Even if we have a good example, I think that people implementing SSH should still read RFC 4419 and the other SSH related RFC and not just the RFC fragments included in this example.

I don't think they should, any more than people writing web applications should have to read every HTTP RFC. The whole point of using Twisted is that it can insulate you from some of that stuff.

I agree that we should not encourage uses to use weak data but this is security related code and users should know better that they should not use copy/paste example codes in production.

Having seen numerous production users of InMemoryUsernamePasswordDatabaseDontUse, I have no expectation that users should know better. Why would they? The example code is there to set a good example.

I will try to update the example to instruct users to create their own module file and then the example will use twisted.conch.openssh_compat.primes.parseModuliFile to read the newly generated file.

I still think there is no point in showing how this data is plugged in if you're not also showing how it can be generated directly as part of the example.

Thanks for your feedback!

Here's my next review:

Despite its numerous problems, I think we should land this change. As I'm going through this, I realize that we are already including a static private key in this file, which is a far worse and more problematic example. This change at least adds some warnings that said practice might be problematic.

Before landing though, please address these:

  1. Coding style nitpick: ExampleSession.__init__ and .getPty have both a docstring and a pass statement. The only reason we would ever use a pass statement in Twisted is to avoid writing a docstring in some example code, so please just remove those.
  2. Instead of a generic "warning, this code is bad", please put specific warnings into the top-level docstring:
    1. re-using a private key is dangerous, generate one
    2. re-using DH primes and having such a short primes list is dangerous, generate some

And please file follow-up issues for all of the following before landing:

  1. This example should be generating all of its keys to provide a better example of how to do things and to stop having automatic scanners pick up the -----BEGIN RSA PRIVATE KEY----- line in our source browser.
  2. This example should also be generating its primes somehow. The OpenSSH command line would be acceptable to me but I'd be even happier with a better way of doing it.
  3. Far, far too much is going on in test docstrings. Example code really ought to be in service of a narrative document; this should be translated into a ReST document explaining how to build an SSH server, that can cover the security issues with more nuance than a comment that people will simply ignore as they copy and paste :).

Thanks for sticking with this very subtle and complex issue. I hope I can convince you to fix some of the follow-ups as well ;-).

comment:11 Changed 3 years ago by Glyph

(Sorry, by "test docstrings" I meant "example docstrings).

comment:12 Changed 3 years ago by Glyph

By the way, this dropped today - https://weakdh.org

Maybe the example shouldn't even show a 1024-bit prime?

comment:13 Changed 3 years ago by Moshe Zadka

http://www.scottaaronson.com/blog/?p=2293

We should make this *scary easy* :)

comment:14 Changed 2 years ago by Adi Roiban

Keywords: review added
Owner: Adi Roiban deleted

Many thanks for the feedback!

I agree that an example should be considered "good example", but when we have such a complex system, simplification should help in understanding how it works.

I agree that this example is bad, but it is already bad in trunk and I try to make it less bad.

One year has already passed and we still have the bad example in trunk :(

With the current Twisted API we can not generate SSH or DH primes :(

I have removed the 1024 bit primes.

I have created follow up tickets:

  1. #8083 - for generating the SSH keys.
  2. #8084 - for generating DH primes
  3. #8085 - for creating the narrative SSH server

As soon as conch is migrated to Python cryptography I will continue to work on those ticket. I prefer to not add more PyCrypto code just to be deprecated later.

Please check latest changes.

Thanks!

comment:15 Changed 2 years ago by Adi Roiban

I have updated the example to not re-use the SSH keys and instruct users how to generate their keys... I assume that Windows users will need some extra steps to install OpenSSH on Windows.

I hope that once we have cryptography in place I can work on #8083 and generate the keys from Python.

I have also updated the ssh client example to allow a custom SSH key and port so that we can use it together with the server example .

comment:16 Changed 2 years ago by Glyph

Keywords: review removed
Owner: set to Adi Roiban

Thanks for your tireless work on this increasingly large branch, Adi :). Things are looking much better than the last time I reviewed this.

  1. Unfortunately it's stuck, as the buildbots won't build it because it requires a merge forward.

Please address that one mandatory point and resubmit; I think this can probably land assuming it can merge cleanly and pass tests.

Some other observations:

  1. In every case where we can, we should probably recommend ckeygen or conch as the program to run rather than ssh-keygen or ssh.
  2. We already have a parser for OpenSSH-format moduli files; as you mentioned it, twisted.conch.openssh_compat.primes.parseModuliFile. Rather than explain they can look up the format in the comment, why not direct them to the parser for it? (Also you said you were going to update the example in the last comment; are you planning to do that in the future?)
  3. Maybe you should split this into two tickets? The additional docstrings are unambiguously helpful and not really related to the moduli stuff at all; we could just land those quickly if you wanted to continue polishing the moduli stuff.

Hopefully I'll get to the re-review fast enough next time that we won't have conflict issues. Thanks again!

comment:17 Changed 2 years ago by Adi Roiban

Author: glyphadiroiban, glyph
Branch: branches/sshsimpleserver-getPrimes-7715branches/sshsimpleserver-getPrimes-7715-2

(In [46552]) Branching to sshsimpleserver-getPrimes-7715-2.

comment:18 Changed 2 years ago by Adi Roiban

Keywords: review added
Owner: Adi Roiban deleted

Thanks for the review!

Mandatory.

  1. Conflict solved. It was due to my changes in the client examples.

Optional.

  1. I have updated it to use ckeygen. I have not updated it to use conch as, at least for me, conch is broken - https://gist.github.com/adiroiban/63916ff8b1dd61a0c7f8
  1. I have updated the example to talk about the parseModuleFile. The client example should have been already updated. What example to update?
  1. I prefer to keep it as a single ticket and land it as it is. Then continue to work in #8084. Beside the comments, as a drive-by, this also fixes the SSHPublicKeyChecker usage with the new InMemorySSHKeyDB

The server example still broken and does not work with the client example, but I hope that this patch is an improvement. I plant update the server example in #8085 together with some narrative docs for the server side.

With the current observed interest in reviewing conch docs improvements I am not very motivated to continue this work :(


Please check latest changes.

Thanks!

comment:19 Changed 2 years ago by Glyph

Keywords: review removed
Owner: set to Adi Roiban

All right. This is a big improvement, and there's more to discuss, but there's no reason to block the change as-is. Can you report the conch traceback as its own bug, though? Thanks.

comment:20 Changed 2 years ago by Glyph

LGTM LAND

comment:21 Changed 2 years ago by Adi Roiban

Resolution: fixed
Status: newclosed

(In [46565]) Merge sshsimpleserver-getPrimes-7715-2: Extend sshsimpleserver.py to implement getPrimes and fix the credentials checker.

Author: adiroiban Reviewer: glyph Fixes: #7715

Note: See TracTickets for help on using tickets.