the http basic auth we had was very simple to reason about, and to implement.
but it has a major downside:
there is no way to logout, browsers keep sending credentials. ideally, browsers
themselves would show a button to stop sending credentials.
a related downside: the http auth mechanism doesn't indicate for which server
paths the credentials are.
another downside: the original password is sent to the server with each
request. though sending original passwords to web servers seems to be
considered normal.
our new approach uses session cookies, along with csrf values when we can. the
sessions are server-side managed, automatically extended on each use. this
makes it easy to invalidate sessions and keeps the frontend simpler (than with
long- vs short-term sessions and refreshing). the cookies are httponly,
samesite=strict, scoped to the path of the web interface. cookies are set
"secure" when set over https. the cookie is set by a successful call to Login.
a call to Logout invalidates a session. changing a password invalidates all
sessions for a user, but keeps the session with which the password was changed
alive. the csrf value is also random, and associated with the session cookie.
the csrf must be sent as header for api calls, or as parameter for direct form
posts (where we cannot set a custom header). rest-like calls made directly by
the browser, e.g. for images, don't have a csrf protection. the csrf value is
returned by the Login api call and stored in localstorage.
api calls without credentials return code "user:noAuth", and with bad
credentials return "user:badAuth". the api client recognizes this and triggers
a login. after a login, all auth-failed api calls are automatically retried.
only for "user:badAuth" is an error message displayed in the login form (e.g.
session expired).
in an ideal world, browsers would take care of most session management. a
server would indicate authentication is needed (like http basic auth), and the
browsers uses trusted ui to request credentials for the server & path. the
browser could use safer mechanism than sending original passwords to the
server, such as scram, along with a standard way to create sessions. for now,
web developers have to do authentication themselves: from showing the login
prompt, ensuring the right session/csrf cookies/localstorage/headers/etc are
sent with each request.
webauthn is a newer way to do authentication, perhaps we'll implement it in the
future. though hardware tokens aren't an attractive option for many users, and
it may be overkill as long as we still do old-fashioned authentication in smtp
& imap where passwords can be sent to the server.
for issue #58
all ui frontend code is now in typescript. we no longer need jshint, and we
build the frontend code during "make build".
this also changes tlsrpt types for a Report, not encoding field names with
dashes, but to keep them valid identifiers in javascript. this makes it more
conveniently to work with in the frontend, and works around a sherpats
limitation.
we don't want external software to include internal details like mlog.
slog.Logger is/will be the standard.
we still have mlog for its helper functions, and its handler that logs in
concise logfmt used by mox.
packages that are not meant for reuse still pass around mlog.Log for
convenience.
we use golang.org/x/exp/slog because we also support the previous Go toolchain
version. with the next Go release, we'll switch to the builtin slog.
according to the rfc's (2231, and 2047), non-ascii filenames in content-type
and content-disposition headers should be encoded like this:
Content-Type: text/plain; name*=utf-8''hi%E2%98%BA.txt
Content-Disposition: attachment; filename*=utf-8''hi%E2%98%BA.txt
and that is what the Go standard library mime.ParseMediaType and
mime.FormatMediaType parse and generate.
this is what thunderbird sends:
Content-Type: text/plain; charset=UTF-8; name="=?UTF-8?B?aGnimLoudHh0?="
Content-Disposition: attachment; filename*=UTF-8''%68%69%E2%98%BA%2E%74%78%74
(thunderbird will also correctly split long filenames over multiple parameters,
named "filename*0*", "filename*1*", etc.)
this is what gmail sends:
Content-Type: text/plain; charset="US-ASCII"; name="=?UTF-8?B?aGnimLoudHh0?="
Content-Disposition: attachment; filename="=?UTF-8?B?aGnimLoudHh0?="
i cannot find where the q/b-word encoded values in "name" and "filename" are
allowed. until that time, we try parsing them unless in pedantic mode.
we didn't generate correctly encoded filenames yet, this commit also fixes that.
for issue #82 by mattfbacon, thanks for reporting!
getting mox to compile required changing code in only a few places where
package "syscall" was used: for accessing file access times and for umask
handling. an open problem is how to start a process as an unprivileged user on
windows. that's why "mox serve" isn't implemented yet. and just finding a way
to implement it now may not be good enough in the near future: we may want to
starting using a more complete privilege separation approach, with a process
handling sensitive tasks (handling private keys, authentication), where we may
want to pass file descriptors between processes. how would that work on
windows?
anyway, getting mox to compile for windows doesn't mean it works properly on
windows. the largest issue: mox would normally open a file, rename or remove
it, and finally close it. this happens during message delivery. that doesn't
work on windows, the rename/remove would fail because the file is still open.
so this commit swaps many "remove" and "close" calls. renames are a longer
story: message delivery had two ways to deliver: with "consuming" the
(temporary) message file (which would rename it to its final destination), and
without consuming (by hardlinking the file, falling back to copying). the last
delivery to a recipient of a message (and the only one in the common case of a
single recipient) would consume the message, and the earlier recipients would
not. during delivery, the already open message file was used, to parse the
message. we still want to use that open message file, and the caller now stays
responsible for closing it, but we no longer try to rename (consume) the file.
we always hardlink (or copy) during delivery (this works on windows), and the
caller is responsible for closing and removing (in that order) the original
temporary file. this does cost one syscall more. but it makes the delivery code
(responsibilities) a bit simpler.
there is one more obvious issue: the file system path separator. mox already
used the "filepath" package to join paths in many places, but not everywhere.
and it still used strings with slashes for local file access. with this commit,
the code now uses filepath.FromSlash for path strings with slashes, uses
"filepath" in a few more places where it previously didn't. also switches from
"filepath" to regular "path" package when handling mailbox names in a few
places, because those always use forward slashes, regardless of local file
system conventions. windows can handle forward slashes when opening files, so
test code that passes path strings with forward slashes straight to go stdlib
file i/o functions are left unchanged to reduce code churn. the regular
non-test code, or test code that uses path strings in places other than
standard i/o functions, does have the paths converted for consistent paths
(otherwise we would end up with paths with mixed forward/backward slashes in
log messages).
windows cannot dup a listening socket. for "mox localserve", it isn't
important, and we can work around the issue. the current approach for "mox
serve" (forking a process and passing file descriptors of listening sockets on
"privileged" ports) won't work on windows. perhaps it isn't needed on windows,
and any user can listen on "privileged" ports? that would be welcome.
on windows, os.Open cannot open a directory, so we cannot call Sync on it after
message delivery. a cursory internet search indicates that directories cannot
be synced on windows. the story is probably much more nuanced than that, with
long deep technical details/discussions/disagreement/confusion, like on unix.
for "mox localserve" we can get away with making syncdir a no-op.
increase() and rate() don't seem to assume a previous value of 0 when a vector
gets a first value for a label. you would think that an increase() on a
first-value mox_panic_total{"..."}=1 would return 1, and similar for rate(), but
that doesn't appear to be the behaviour. so we just explicitly initialize the
count to 0 for each possible label value. mox has more vector metrics, but
panics feels like the most important, and it's too much code to initialize them
all, for all combinations of label values. there is probably a better way that
fixes this for all cases...
we match messages to their parents based on the "references" and "in-reply-to"
headers (requiring the same base subject), and in absense of those headers we
also by only base subject (against messages received max 4 weeks ago).
we store a threadid with messages. all messages in a thread have the same
threadid. messages also have a "thread parent ids", which holds all id's of
parent messages up to the thread root. then there is "thread missing link",
which is set when a referenced immediate parent wasn't found (but possibly
earlier ancestors can still be found and will be in thread parent ids".
threads can be muted: newly delivered messages are automatically marked as
read/seen. threads can be marked as collapsed: if set, the webmail collapses
the thread to a single item in the basic threading view (default is to expand
threads). the muted and collapsed fields are copied from their parent on
message delivery.
the threading is implemented in the webmail. the non-threading mode still works
as before. the new default threading mode "unread" automatically expands only
the threads with at least one unread (not seen) meessage. the basic threading
mode "on" expands all threads except when explicitly collapsed (as saved in the
thread collapsed field). new shortcuts for navigation/interaction threads have
been added, e.g. go to previous/next thread root, toggle collapse/expand of
thread (or double click), toggle mute of thread. some previous shortcuts have
changed, see the help for details.
the message threading are added with an explicit account upgrade step,
automatically started when an account is opened. the upgrade is done in the
background because it will take too long for large mailboxes to block account
operations. the upgrade takes two steps: 1. updating all message records in the
database to add a normalized message-id and thread base subject (with "re:",
"fwd:" and several other schemes stripped). 2. going through all messages in
the database again, reading the "references" and "in-reply-to" headers from
disk, and matching against their parents. this second step is also done at the
end of each import of mbox/maildir mailboxes. new deliveries are matched
immediately against other existing messages, currently no attempt is made to
rematch previously delivered messages (which could be useful for related
messages being delivered out of order).
the threading is not yet exposed over imap.
due to a missing return, the content was served again.
this path doesn't happen on release binaries, only during local development,
where there is a local file that can be served.
we only compress if applicable (content-type indicates likely compressible),
client supports it, response doesn't already have a content-encoding).
for internal handlers, we always enable compression. for reverse proxied and
static files, compression must be enabled per handler.
for internal & reverse proxy handlers, we do streaming compression at
"bestspeed" quality (probably level 1).
for static files, we have a cache based on mtime with fixed max size, where we
evict based on least recently used. we compress with the default level (more
cpu, better ratio).
it was far down on the roadmap, but implemented earlier, because it's
interesting, and to help prepare for a jmap implementation. for jmap we need to
implement more client-like functionality than with just imap. internal data
structures need to change. jmap has lots of other requirements, so it's already
a big project. by implementing a webmail now, some of the required data
structure changes become clear and can be made now, so the later jmap
implementation can do things similarly to the webmail code. the webmail
frontend and webmail are written together, making their interface/api much
smaller and simpler than jmap.
one of the internal changes is that we now keep track of per-mailbox
total/unread/unseen/deleted message counts and mailbox sizes. keeping this
data consistent after any change to the stored messages (through the code base)
is tricky, so mox now has a consistency check that verifies the counts are
correct, which runs only during tests, each time an internal account reference
is closed. we have a few more internal "changes" that are propagated for the
webmail frontend (that imap doesn't have a way to propagate on a connection),
like changes to the special-use flags on mailboxes, and used keywords in a
mailbox. more changes that will be required have revealed themselves while
implementing the webmail, and will be implemented next.
the webmail user interface is modeled after the mail clients i use or have
used: thunderbird, macos mail, mutt; and webmails i normally only use for
testing: gmail, proton, yahoo, outlook. a somewhat technical user is assumed,
but still the goal is to make this webmail client easy to use for everyone. the
user interface looks like most other mail clients: a list of mailboxes, a
search bar, a message list view, and message details. there is a top/bottom and
a left/right layout for the list/message view, default is automatic based on
screen size. the panes can be resized by the user. buttons for actions are just
text, not icons. clicking a button briefly shows the shortcut for the action in
the bottom right, helping with learning to operate quickly. any text that is
underdotted has a title attribute that causes more information to be displayed,
e.g. what a button does or a field is about. to highlight potential phishing
attempts, any text (anywhere in the webclient) that switches unicode "blocks"
(a rough approximation to (language) scripts) within a word is underlined
orange. multiple messages can be selected with familiar ui interaction:
clicking while holding control and/or shift keys. keyboard navigation works
with arrows/page up/down and home/end keys, and also with a few basic vi-like
keys for list/message navigation. we prefer showing the text instead of
html (with inlined images only) version of a message. html messages are shown
in an iframe served from an endpoint with CSP headers to prevent dangerous
resources (scripts, external images) from being loaded. the html is also
sanitized, with javascript removed. a user can choose to load external
resources (e.g. images for tracking purposes).
the frontend is just (strict) typescript, no external frameworks. all
incoming/outgoing data is typechecked, both the api request parameters and
response types, and the data coming in over SSE. the types and checking code
are generated with sherpats, which uses the api definitions generated by
sherpadoc based on the Go code. so types from the backend are automatically
propagated to the frontend. since there is no framework to automatically
propagate properties and rerender components, changes coming in over the SSE
connection are propagated explicitly with regular function calls. the ui is
separated into "views", each with a "root" dom element that is added to the
visible document. these views have additional functions for getting changes
propagated, often resulting in the view updating its (internal) ui state (dom).
we keep the frontend compilation simple, it's just a few typescript files that
get compiled (combined and types stripped) into a single js file, no additional
runtime code needed or complicated build processes used. the webmail is served
is served from a compressed, cachable html file that includes style and the
javascript, currently just over 225kb uncompressed, under 60kb compressed (not
minified, including comments). we include the generated js files in the
repository, to keep Go's easily buildable self-contained binaries.
authentication is basic http, as with the account and admin pages. most data
comes in over one long-term SSE connection to the backend. api requests signal
which mailbox/search/messages are requested over the SSE connection. fetching
individual messages, and making changes, are done through api calls. the
operations are similar to imap, so some code has been moved from package
imapserver to package store. the future jmap implementation will benefit from
these changes too. more functionality will probably be moved to the store
package in the future.
the quickstart enables webmail on the internal listener by default (for new
installs). users can enable it on the public listener if they want to. mox
localserve enables it too. to enable webmail on existing installs, add settings
like the following to the listeners in mox.conf, similar to AccountHTTP(S):
WebmailHTTP:
Enabled: true
WebmailHTTPS:
Enabled: true
special thanks to liesbeth, gerben, andrii for early user feedback.
there is plenty still to do, see the list at the top of webmail/webmail.ts.
feedback welcome as always.