Opened 5 years ago

Closed 5 years ago

#6256 defect closed fixed (fixed)

Long-lived KnownHostsFile instance will clobber changes made by others if its `save` method is used

Reported by: Jean-Paul Calderone Owned by: Jean-Paul Calderone
Priority: normal Milestone:
Component: conch Keywords:
Cc: z3p Branch: branches/knownhostsfile-6256
branch-diff, diff-cov, branch-cov, buildbot
Author: exarkun

Description

Entries from the file are only read in KnownHostsFile.fromPath - ie, at construction time. Any future changes made to the file, either by other code in the same process or (perhaps more likely) other processes, will be wiped out as soon as save is used. In particular, consider a long-running KnownHostsFile-using process which shares a known_hosts file with a user with their own command line ssh process. Any changes (eg, keys added to the file by ssh or deleted by ssh-keygen -R) will be lost each time KnownHostsFile.save is invoked.

Instead, some minimal level of safety could be introduced by re-reading the file at the beginning of save and merging the on-disk data into the in-memory data (which can only be additions given the current KnownHostsFile API).

An optimization to this logic might be to check the mtime of the file at strategic points and use this as an indicator that the file may have changed and needs to be re-read.

An internal refactoring that will probably make the logic for this simpler would be to keep additions separate from the list of entries read from (or most recently written to) the underlying file. This makes the merge much easier (throw away the list of persisted entries and replace it with the contents of the file at the beginning of save) and also points the way towards supporting removals, which might be a nice API addition.

Change History (11)

comment:1 Changed 5 years ago by DefaultCC Plugin

Cc: z3p added

comment:2 Changed 5 years ago by Jean-Paul Calderone

Author: exarkun
Branch: branches/knownhostsfile-6256

(In [37026]) Branching to 'knownhostsfile-6256'

comment:3 Changed 5 years ago by Jean-Paul Calderone

(In [37033]) Restore the behavior of KnownHostsFile(path)

I think.

This behavior is kind of silly. I do not really want to provide it, particularly not by default. But... backwards compatibility. Perhaps it should at least be deprecated.

refs #6256

comment:4 Changed 5 years ago by Jean-Paul Calderone

Keywords: review added

comment:5 Changed 5 years ago by Jean-Paul Calderone

(In [37038]) Satisfy Python 3.x

There is no cStringIO on 3.x. io.BytesIO is the 2.6, 2.7, 3.3 compatible spelling of the name of the object that's desired here. Necessary since twisted.names.dns tests depend on this module now, and twisted.names.dns is supposed to work on Python 3.x

refs #6256

comment:6 Changed 5 years ago by Tom Prince

Keywords: review removed
Owner: set to Jean-Paul Calderone

1, The implementation has the strange behavior that ignores external changes until .save() is

called, then starts reflecting them. I think this is liable to cause surprise.

  1. test_defaultInitializerClobbersExisting seems to test that the underlying file is actually clobbered only indirectly. Particularly in view of 1, it would be better to test that the underlying file is as we expect it.
  2. (minor) HashedEntryTests.test_equality is getting run twice, since HashedEntryWithCommentTests inherits from HashedEntryTests.
  3. There isn't a test of an added host with mismatched host-key.

Please resubmit for review, after addressing the above issues.

comment:7 Changed 5 years ago by Jean-Paul Calderone

Keywords: review added
Owner: Jean-Paul Calderone deleted

Thanks.

1, The implementation has the strange behavior that ignores external changes until .save() is called, then starts reflecting them. I think this is liable to cause surprise.

If KnownHostsFile.fromPath is used to construct the instance, then this isn't the case. The contents of the file will be reflected by iterentries (therefore also hasHostKey and verifyHostKey). If KnownHostsFile.__init__ is used directly, then the contents will indeed be ignored, but they'll also be overwritten by the first call to save. This makes me think it is correct to ignore them.

I don't particularly like this API or behavior, but it's basically the same as the current trunk behavior without these changes, so I think retaining them makes sense (though perhaps replacing these KnownHostsFile APIs with something new and less surprising would also make sense).

test_defaultInitializerClobbersExisting seems to test that the underlying file is actually clobbered only indirectly. Particularly in view of 1, it would be better to test that the underlying file is as we expect it.

Fixed in r37319.

(minor) HashedEntryTests.test_equality is getting run twice, since HashedEntryWithCommentTests inherits from HashedEntryTests.

True. This is a good example demonstrating how having subclasses of TestCases is not ideal. I'd rather not refactor these classes as part of this branch. I can file a ticket for this issue.

There isn't a test of an added host with mismatched host-key.

I'm not sure what this means.

comment:8 Changed 5 years ago by Tom Prince

Keywords: review removed
Owner: set to Jean-Paul Calderone

1, The implementation has the strange behavior that ignores external changes until .save() is called, then starts reflecting them. I think this is liable to cause surprise.

If KnownHostsFile.fromPath is used to construct the instance, then this isn't the case. The contents of the file will be reflected by iterentries (therefore also hasHostKey and verifyHostKey). If KnownHostsFile.init is used directly, then the contents will indeed be ignored, but they'll also be overwritten by the first call to save. This makes me think it is correct to ignore them.

The unexpected behavior I am commenting on is demonstrated by the following script.

!python
from twisted.conch.client.knownhosts import KnownHostsFile
from twisted.conch.ssh.keys import Key
from twisted.python.filepath import FilePath

keyString = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCVU1feuPKpCnAR2WuVmTKUM5l7rEvansUu6WdypCKcM05EwaWGih6v0hHx+PzeJ5vjt2RfVXei3o664hYXKCYsQ0a3ROOGnF0kRu3gZKqMQa9ubndCUQUeLqGzV6aYZigjPs/B0Nd/MuyV34+9pjSTWSNXV1D6abK/e+bDY+l+RUCnJPpGfr4Rbys7XiafjAdU83l++kVhC9vOzlGBU5pDZ1dCTrSDw86YzPuThtxajy2xyjpdy4VXg1iLKuKqerTjzBTj2VsQJyKnZzeOeYkUffcBjfxmy/0ACq3FeKg2cr7GTb9Q7pQSSduh/uTTamhrQGTyYDsQhpCjih2468I9 example"
key = Key.fromString(keyString)
khf = KnownHostsFile(FilePath("known_hosts"))
khf.addHostKey("example.net", key)
FilePath("known_hosts").setContent("example.com " + keyString)
print khf.hasHostKey("example.com", key) # False
print khf.hasHostKey("example.net", key) # True
khf.save()
FilePath("known_hosts").setContent("example.com " + keyString)
print khf.hasHostKey("example.com", key) # True but expect False
print khf.hasHostKey("example.net", key) # False but expect True

To get the expected behavior (which I agree isn't sensible), I think you just need to not reset _clobber or _added if _clobber == True.

There isn't a test of an added host with mismatched host-key.

I'm not sure what this means.

!python
#psudeo-code
f = FilePath(...)
khf = KnownHostFile.fromPath(f)
f.setContent("example.com ssh-rsa AAAAA")
knf.hasHostKey("example.com", key.fromString("ssh-rsa BBBBB")) # Raises

comment:9 Changed 5 years ago by Jean-Paul Calderone

Keywords: review added
Owner: Jean-Paul Calderone deleted

Thanks.

The unexpected behavior I am commenting on is demonstrated by the following script.

I guess I would like to call that behavior a bug as well and change it. The trunk behavior without this change is to make changes to the underlying file invisible (and, by being unaware of them, clobber them sometimes).

The behavior of the branch is better, it reflects changes to the contents of the file and avoids wiping those changes out when save is called - except for the documented case where __init__ is used directly to explicitly create an "empty" KnownHostsFile.

There isn't a test of an added host with mismatched host-key.

There is a test for a mismatched host key, though. I don't think the source of the host key is terribly important, since hasHostKey is implemented with iterentries and iterentries is tested to verify that it returns PlainEntry or HashedEntry instances either from the file or from the in-memory _added buffer.

But:

  1. It's an easy test to add
  2. It's sort of nonsensical for the HostKeyMismatch exception to have lineno and path information in the case of in-memory entries
  3. There was a bug in the line number calculation for saved entries when any unsaved entries existed

So I added a couple tests and fixed that bug in r37326. I'm not exactly thrilled by the fix for the bug, but it's a small change and this approach avoids having to make iterentries more complicated or introduce another helper for hasHostKey to use. I'm open to suggestions for other approaches.

comment:10 Changed 5 years ago by Tom Prince

Keywords: review removed
Owner: set to Jean-Paul Calderone
  • except for the documented case where init is used directly to explicitly create an "empty" KnownHostsFile.

That is the behavior that my example script is exercising and which has (to me) surprising behavior. But, I don't care either way, and nobody should be using this anyway.

I'm not exactly thrilled by the fix for the bug, but it's a small change and this approach avoids having to make iterentries more complicated or introduce another helper for hasHostKey to use. I'm open to suggestions for other approaches.

I agree with your sentiment, but don't have any better ideas. (Maybe add a comment that lineidx is 0 based and line is 1 based?

Go ahead and merge.

comment:11 Changed 5 years ago by Jean-Paul Calderone

Resolution: fixed
Status: newclosed

(In [37334]) Merge knownhostsfile-6256

Author: exarkun Reviewer: tom.prince Fixes: #6256

Change twisted.conch.client.knownhosts.KnownHostsFile so that it preserves changes made to the underlying file made by means other than its own addHostKey API and so that it reflects those changes in future hasHostKey and verifyHostKey calls.

Note: See TracTickets for help on using tickets.