This new automatic decryption policy should make it possible to
decrypt messages that we have stashed session keys for, without
incurring a call to the user's asymmetric keys.
the command-line interface for indexing (reindex, new, insert) used
--try-decrypt; and the configuration records used index.try_decrypt.
But by comparison with "show" and "reply", there doesn't seem to be
any reason for the "try" prefix.
This changeset adjusts the command-line interface and the
configuration interface.
For the moment, i've left indexopts_{set,get}_try_decrypt alone. The
subsequent changeset will address those.
test: session keys are known broken without session key support
If the version of GMime we're building against doesn't support session
key extraction or re-use, mark the tests that rely on session key
capabilities as known-broken.
This should resolve test suite failures on ubuntu trusty and debian
jessie and earlier, which have GMime 2.6.20 -- session key support was
introduced in GMime 2.6.21.
crypto: Test restore of cleartext index from stashed session keys
If you've got a notmuch dump that includes stashed session keys for
every decrypted message, and you've got your message archive, you
should be able to get back to the same index that you had before.
Here we add a simple test that give some flavor of how that works.
crypto: use stashed session-key properties for decryption, if available
When doing any decryption, if the notmuch database knows of any
session keys associated with the message in question, try them before
defaulting to using default symmetric crypto.
This changeset does the primary work in _notmuch_crypto_decrypt, which
grows some new parameters to handle it.
The primary advantage this patch offers is a significant speedup when
rendering large encrypted threads ("notmuch show") if session keys
happen to be cached.
Additionally, it permits message composition without access to
asymmetric secret keys ("notmuch reply"); and it permits recovering a
cleartext index when reindexing after a "notmuch restore" for those
messages that already have a session key stored.
Note that we may try multiple decryptions here (e.g. if there are
multiple session keys in the database), but we will ignore and throw
away all the GMime errors except for those that come from last
decryption attempt. Since we don't necessarily know at the time of
the decryption that this *is* the last decryption attempt, we'll ask
for the errors each time anyway.
This does nothing if no session keys are stashed in the database,
which is fine. Actually stashing session keys in the database will
come as a subsequent patch.
configure: session key handling in gmime maps to built_with("session_key")
This flag should make it easier to write the code for session-key
handling.
Note that this only works for GMime 2.6.21 and later (the session key
interface wasn't available before then). It should be fine to build
the rest of notmuch if this functionality isn't available.
Note that this also adds the "session_key" built_with() aspect to
libnotmuch.
crypto: add _notmuch_crypto_decrypt wrapper function
We will use this centralized function to consolidate the awkward
behavior around different gmime versions.
It's only invoked from two places: mime-node.c's
node_decrypt_and_verify() and lib/index.cc's
_index_encrypted_mime_part().
However, those two places have some markedly distinct logic, so the
interface for this _notmuch_crypto_decrypt function is going to get a
little bit clunky. It's worthwhile, though, for the sake of keeping
these #if directives reasonably well-contained.
David Bremner [Sat, 2 Dec 2017 12:53:04 +0000 (08:53 -0400)]
test: unbreak performance tests
In 8e7fb88237aedea22 Jani replaced the use of $(pwd -P) to find the
TEST_DIRECTORY in order to better support out of tree
builds. Unfortunately the performance-tests need a different value for
the variable and were thus broken.
This commit splits out the setting of this variable for the two sets
of tests. Performance tests still don't work out of tree, because
the handling of the downloaded corpus needs to be updated.
Jani Nikula [Sat, 14 Oct 2017 13:15:43 +0000 (16:15 +0300)]
cli: allow empty strings for notmuch insert --folder argument
Now that it's easy to add argument specific modifiers in opt
descriptions, add a new .allow_empty field to allow empty strings for
individual string arguments while retaining strict checks
elsewhere. Use this for notmuch insert --folder, where the empty
string means top level folder.
Jani Nikula [Mon, 16 Oct 2017 15:40:44 +0000 (18:40 +0300)]
cli/crypto: fix segfault on failed gmime2 crypto context creation
Commit 1fdc08d0ffab ("cli/crypto: treat failure to create a crypto
context as fatal.") started treating crypto context creation failures
"as fatal", returning NULL from _mime_node_create().
Unfortunately, we do not have NULL checks for _mime_node_create()
failures. The only caller, mime_node_child(), could check and return
NULL (as it's documented to do on errors) but none of the several call
sites have NULL checks either. And none of them really have a trivial
but feasible and graceful way of recovery.
So while the right thing to do would be to handle NULL returns
properly all over the place, and we have other scenarios that do
return NULL from above mentioned functions, the crypto context
creation failure is something that does seem to show up regularly in
some scenarios, revert back to the functionality before commit 1fdc08d0ffab as an interim fix.
Jani Nikula [Tue, 24 Oct 2017 18:36:11 +0000 (21:36 +0300)]
test: fix test database backup/restore location
backup_database() and restore_database() used to store the backups in
the test specific temporary directory, through the current working
directory being there. Commit 8e7fb88237ae ("test: use source and
build paths in test-lib-common.sh") started using a test specific
backup directories under the build tree test directory. This was in
error. Switch back to the old location, but using paths to the
location instead of relying on current working directory.
Reported by Daniel Kahn Gillmor <dkg@fifthhorseman.net>.
Enable override of the index.try_decrypt setting on a per-run basis
when invoking "notmuch reindex". This allows the possibility of (for
example) an emacs keybinding that adds the cleartext of the currently
shown decrypted message to the index, making it searchable in the
future.
It also enables one-time indexing of all messages matching some query,
like so:
notmuch reindex tag:encrypted and\
not property:index.decryption=success and\
from:alice@example.org
We also update the documentation and tab completion, and add a few
more tests.
tests: emacs_fcc_message: allow passing --arguments to notmuch new
Subsequent patches may want to send GNU-style --long-arguments to
notmuch new in the test suite, in particular when invoking
emacs_fcc_message. This changeset makes that possible.
cli: set up shared command-line arguments for indexing
We have an indexopts structure for manipulating indexing in different
ways, but we also have three command-line invocations that can trigger
indexing: new, insert, and reindex.
This changeset prepares a common parser that these subcommands can
share.
Note: if the deprecated crypto.gpg_path configuration option is set to
anything other than "gpg", we ignore it (and print a warning on
stderr, if built against gmime < 3.0).
At the moment, it's just --try-decrypt, but others will likely follow.
By default, notmuch won't try to decrypt on indexing. With this
patch, we make it possible to indicate a per-database preference using
the config variable "index.try_decrypt", which by default will be
false.
At indexing time, the database needs some way to know its internal
defaults for how to index encrypted parts. It shouldn't be contingent
on an external config file (since that can't be retrieved from the
database object itself), so we store it in the database.
This behaves similarly to the query.* configurations, which are also
stored in the database itself, so we're not introducing any new
dependencies by requiring that it be stored in the database.
crypto: index encrypted parts when indexopts try_decrypt is set.
If we see index options that ask us to decrypt when indexing a
message, and we encounter an encrypted part, we'll try to descend into
it.
If we can decrypt, we add the property index.decryption=success.
If we can't decrypt (or recognize the encrypted type of mail), we add
the property index.decryption=failure.
Note that a single message may have both values of the
"index.decryption" property: "success" and "failure". For example,
consider a message that includes multiple layers of encryption. If we
manage to decrypt the outer layer ("index.decryption=success"), but
fail on the inner layer ("index.decryption=failure").
Because of the property name, this will be automatically cleared (and
possibly re-set) during re-indexing. This means it will subsequently
correspond to the actual semantics of the stored index.
reindex: drop all properties named with prefix "index."
This allows us to create new properties that will be automatically set
during indexing, and cleared during re-indexing, just by choice of
property name.
index: implement notmuch_indexopts_t with try_decrypt
This is currently mostly a wrapper around _notmuch_crypto_t that keeps
its internals private and doesn't expose any of the GMime API.
However, non-crypto indexing options might also be added later
(e.g. filters or other transformations).
Jani Nikula [Sat, 21 Oct 2017 11:58:02 +0000 (14:58 +0300)]
test: remove --root option and fix TMP_DIRECTORY cleanup
The primary motivation here is to fix TMP_DIRECTORY cleanup prior to
running each test when the current working directory is not the test
subdirectory. Tests with failures would leave their TMP_DIRECTORY
directory behind for debugging, and repeated out-of-tree test runs
would have old temp directories. (This lead to e.g. T310-emacs.sh
hanging because emacs would prompt for overwriting files.)
We remove the likely anyway defunct --root test option while at it,
just to be on the safe side when doing 'rm -rf' on the TMP_DIRECTORY.
Jani Nikula [Mon, 25 Sep 2017 20:38:17 +0000 (23:38 +0300)]
test: use corpora/default not corpus in T480-hex-escaping.sh
Turns out round trip tests didn't really round trip anything. Broken
by yours truly in 971cdc72cdb8 ("test: make it possible to have
multiple corpora"). Ooops.
Subsequent patches will introduce a convention that properties whose
name starts with "index." will be stripped (and possibly re-added)
during re-indexing. This patch lays the groundwork for doing that.
Jani Nikula [Sat, 14 Oct 2017 20:18:36 +0000 (23:18 +0300)]
cli: make notmuch count --exclude a boolean argument
Commit 0f314c0c99be ("cli: convert notmuch_bool_t to stdbool")
over-eagerly converted EXCLUDE_TRUE and EXCLUDE_FALSE to EXCLUDE_true
and EXCLUDE_false in notmuch-count.c. We could just fix the case back,
but convert the option to an actual boolean argument instead.
We've used a keyword argument rather than a boolean argument for
notmuch count --exclude for five years, since commit 785c1e497f05
("cli: move count to the new --exclude=(true|false|flag) naming
scheme."), "to allow future options to be added more easily". I think
we can conclude future options aren't coming any time soon.
gmime-extra: drop compat layer for g_mime_multipart_encrypted_decrypt
In practice, we're going to see this function invoked differently
depending on which gmime we build against. The compatibility layer
forces our code into the lowest-common-denominator -- unable to make
use of new features even when built against a newer version.
Dropping the compatibility layer paves the way for clearer use of
features from GMime 3.0 in future commits.
"typedef GMimeAddressType GMimeRecipientType" is already present
further down in the compatibility wrapper (with other typedefs). We
don't need it twice.
crypto: _notmuch_crypto_cleanup should return void
There's no chance that _notmuch_crypto_cleanup() will ever return
anything other than 0, and no one ever checks its return value anyway.
So make it return void instead of int.
notmuch_crypto_context_t was introduced (i think) as some sort of
abstraction layer to make notmuch somewhat independent of GMime. But
it isn't even useful for GMime 3.0 or later -- we can drop the
pretense that it's some sort of abstraction in this case, and just
call it what it is, GMimeCryptoContext, which is useful for building
against older versions of GMime.
This also renames _notmuch_crypto_get_context() to
_notmuch_crypto_get_gmime_context().
crypto: rename notmuch_crypto_t to _notmuch_crypto_t
The notmuch_crypto_t struct isn't used externally, and we have no
plans to explicitly export it. Prefix its name (and associated
functions) with _ to make that intent clear.
Jani Nikula [Sat, 7 Oct 2017 08:44:05 +0000 (11:44 +0300)]
lib: convert notmuch_bool_t to stdbool internally
C99 stdbool turned 18 this year. There really is no reason to use our
own, except in the library interface for backward
compatibility. Convert the lib internally to stdbool.
Jani Nikula [Sat, 7 Oct 2017 08:44:04 +0000 (11:44 +0300)]
cli: convert notmuch_bool_t to stdbool
C99 stdbool turned 18 this year. There really is no reason to use our
own, except in the library interface for backward
compatibility. Convert the cli and test binaries to stdbool.
Jani Nikula [Sun, 1 Oct 2017 20:53:16 +0000 (23:53 +0300)]
cli: use the arg parser .present feature to handle show --entire-thread
The --entire-thread default depends on other arguments, so we'll have
to figure out if it was explicitly set by the user or not. The arg
parser .present feature helps us clean up the code here.
Jani Nikula [Sun, 1 Oct 2017 20:53:14 +0000 (23:53 +0300)]
cli: add .present field to opt desc to check if the arg was present
Add pointer to boolean .present field to opt desc, which (if non-NULL)
will be set to TRUE if the argument in question is present on the
command line. Unchanged otherwise.
Jani Nikula [Sun, 1 Oct 2017 20:53:11 +0000 (23:53 +0300)]
cli: use designated initializers for opt desc
Several changes at once, just to not have to change the same lines
several times over:
- Use designated initializers to initialize opt desc arrays.
- Only initialize the needed fields.
- Remove arg_id (short options) as unused.
- Replace opt_type and output_var with several type safe output
variables, where the output variable being non-NULL determines the
type. Introduce checks to ensure only one is set. The downside is
some waste of const space per argument; this could be saved by
retaining opt_type and using a union, but that's still pretty
verbose.
- Fix some variables due to the type safety. Mostly a good thing, but
leads to some enums being changed to ints. This is pedantically
correct, but somewhat annoying. We could also cast, but that defeats
the purpose a bit.
- Terminate the opt desc arrays using {}.
The output variable type safety and the ability to add new fields for
just some output types or arguments are the big wins. For example, if
we wanted to add a variable to set when the argument is present, we
could do so for just the arguments that need it.
Beauty is in the eye of the beholder, but I think this looks nice when
defining the arguments, and reduces some of the verbosity we have
there.
Jani Nikula [Sun, 1 Oct 2017 20:53:10 +0000 (23:53 +0300)]
cli: strip trailing "/" from the final maildir path in notmuch insert
Several subtle interconnected changes here:
- If the folder name passed as argument is the empty string "" or
slash "/", the final maildir path would end up having "//" in it. We
should strip the final maildir path, not folder.
- The folder variable should really be const char *, another reason
not to modify it.
- The maildir variable is only const to let us point it at db_path
directly.
To be able to strip the maildir variable, always allocate it. Default
folder to the empty string "", and don't treat folder not being
present on the command line as anything special.
As a side effect, we also create the cur/new/tmp in the top level
directory if they're not there and --create-folder is given.
T390-python: add test for get_message_parts and special characters
This imports a message with ISO-8859-2 encoded characters, then opens
the database using the python bindings. We peek through all mesage
parts, afterwards print the message id.
Signed-off-by: Florian Klink <flokli@flokli.de> Signed-off-by: Andreas Rammhold <andreas@rammhold.de>
currently, notmuch's get_message_parts() opens the file in text mode and passes
the file object to email.message_from_file(fp). In case the email contains
UTF-8 characters, reading might fail inside email.parser with the following exception:
File "/usr/lib/python3.6/site-packages/notmuch/message.py", line 591, in get_message_parts
email_msg = email.message_from_binary_file(fp)
File "/usr/lib/python3.6/email/__init__.py", line 62, in message_from_binary_file
return BytesParser(*args, **kws).parse(fp)
File "/usr/lib/python3.6/email/parser.py", line 110, in parse
return self.parser.parse(fp, headersonly)
File "/usr/lib/python3.6/email/parser.py", line 54, in parse
data = fp.read(8192)
File "/usr/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 1865: invalid continuation byte
To fix this, read file in binary mode and pass to
email.message_from_binary_file(fp).
Unfortunately, Python 2 doesn't support
email.message_from_binary_file(fp), so keep using
email.message_from_file(fp) there.
Jani Nikula [Tue, 26 Sep 2017 18:26:08 +0000 (21:26 +0300)]
emacs: sanitize subject in replies
Commit a7964c86d125 ("emacs: Sanitize authors and subjects in search
and show") added sanitization of header information for display. Do
the same for reply subjects.
This fixes the long-standing annoying artefact of certain versions of
mailman using tab as folding whitespace, leading to tabs in reply
subjects.
lib: index the content-type of the parts of encrypted messages
This is a logical followup to "lib: index the content type of
signature parts", which will make it easier to record the message
structure of all messages.
Jani Nikula [Sat, 16 Sep 2017 14:59:04 +0000 (17:59 +0300)]
lib: index the content type of signature parts
It's useful (*) to be able to easily find messages with certain types
of signatures. Having the mimetype: prefix searches fail for some
content types is also genuinely surprising (*). Index the content type
of signature parts.
While at it, switch to the gmime convenience constants for content and
signature part indexes.