/*
Package bstore is a database library for storing and querying Go values.

Bstore is designed as a small, pure Go library that still provides most of
the common data consistency requirements for modest database use cases. Bstore
aims to make basic use of cgo-based libraries, such as sqlite, unnecessary.

Bstore implements autoincrementing primary keys, indices, default values,
enforcement of nonzero, unique and referential integrity constraints, automatic
schema updates and a query API for combining filters/sorting/limits. Queries
are planned and executed using indices for fast execution where possible.
Bstore is designed with the Go type system in mind: you typically don't have to
write any (un)marshal code for your types.

# Field types

The following struct field types are currently supported for storing, including
pointers to these types, but not pointers to pointers:

  - int (as int32), int8, int16, int32, int64
  - uint (as uint32), uint8, uint16, uint32, uint64
  - bool, float32, float64, string, []byte
  - Maps, with keys and values of any supported type, except keys with pointer types.
  - Slices and arrays, with elements of any supported type.
  - time.Time
  - Types that implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler,
    useful for struct types with state in private fields. Do not change the
    (Un)marshalBinary method in an incompatible way without a data migration.
  - Structs, with fields of any supported type.

Note: int and uint are stored as int32 and uint32, for compatibility of database
files between 32bit and 64bit systems. Where possible, use explicit (u)int32 or
(u)int64 types.

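The list above mentions types that implement encoding.BinaryMarshaler and
encoding.BinaryUnmarshaler. As a minimal sketch of such a type, assuming a
hypothetical Counter with two private fields and a fixed 8-byte encoding chosen
only for illustration (whether bstore expects value or pointer receivers is not
covered here):

```go
package main

import (
	"encoding"
	"encoding/binary"
	"errors"
	"fmt"
)

// Counter is a hypothetical type that keeps its state in private fields.
type Counter struct {
	hits   uint32
	misses uint32
}

// Compile-time checks that *Counter implements both standard interfaces.
var (
	_ encoding.BinaryMarshaler   = (*Counter)(nil)
	_ encoding.BinaryUnmarshaler = (*Counter)(nil)
)

// MarshalBinary encodes both fields as fixed-width big-endian integers.
func (c *Counter) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint32(buf[0:4], c.hits)
	binary.BigEndian.PutUint32(buf[4:8], c.misses)
	return buf, nil
}

// UnmarshalBinary decodes the format written by MarshalBinary.
func (c *Counter) UnmarshalBinary(data []byte) error {
	if len(data) != 8 {
		return errors.New("counter: unexpected encoded length")
	}
	c.hits = binary.BigEndian.Uint32(data[0:4])
	c.misses = binary.BigEndian.Uint32(data[4:8])
	return nil
}

func main() {
	in := Counter{hits: 3, misses: 1}
	data, err := in.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var out Counter
	if err := out.UnmarshalBinary(data); err != nil {
		panic(err)
	}
	fmt.Println(out.hits, out.misses) // prints "3 1"
}
```

Changing the width or order of the encoded fields later would make previously
stored bytes unreadable, which is why an incompatible change needs a data
migration.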

Cyclic types are supported, but cyclic data is not. Attempting to store cyclic
data will likely result in a stack overflow panic.

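To make the distinction concrete, here is a small sketch with a hypothetical
self-referencing Node type. The type is cyclic in both cases; only the second
value graph is cyclic data. The hasCycle helper is an illustration of a check
a caller could run before storing, not something bstore provides:

```go
package main

import "fmt"

// Node is a hypothetical cyclic type: it refers to itself through a pointer.
type Node struct {
	ID   int64
	Next *Node
}

// hasCycle reports whether following Next pointers from n revisits a node.
func hasCycle(n *Node) bool {
	seen := map[*Node]bool{}
	for ; n != nil; n = n.Next {
		if seen[n] {
			return true
		}
		seen[n] = true
	}
	return false
}

func main() {
	a := &Node{ID: 1}
	b := &Node{ID: 2, Next: a}
	fmt.Println(hasCycle(b)) // false: cyclic type, acyclic data

	a.Next = b // now a and b point at each other
	fmt.Println(hasCycle(b)) // true: cyclic data
}
```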

Anonymous struct fields are handled by treating each of the anonymous struct's
fields as the containing type's own fields. The named embedded type is not part
of the type schema, and with a Query it can currently only be used with
UpdateField and UpdateFields, not for filtering.


Bstore embraces the use of Go zero values. Use zero values, possibly pointers,
where you would use NULL values in SQL.

# Struct tags

The typical Go struct can be stored in the database. The first field of a
struct type is its primary key and must always be unique. For an integer type,
inserting a zero value automatically changes it to the next sequence number by
default. Additional behaviour can be configured through struct tag "bstore".
The values are comma-separated. Each value is typically one word, but some
consist of multiple space-separated words:

  - "-", ignores the field entirely: it is not stored.
  - "name <fieldname>", use "fieldname" instead of the Go type field name.
  - "nonzero", enforces that field values are not the zero value.
  - "noauto", only valid for integer types, and only for the primary key. By
    default, an integer-typed primary key will automatically get a next value
    assigned on insert when it is 0. With noauto, inserting a 0 value results
    in an error. For primary keys of other types, inserting the zero value
    always results in an error.
  - "index" or "index <field1>+<field2>+<...> [<name>]", adds an index. In the
    first form, the index is on the field on which the tag is specified, and
    the index name is the same as the field name. In the second form, multiple
    fields can be specified, and an optional name. The first field must be the
    field on which the tag is specified. The field names are +-separated. The
    default name for the second form is the same +-separated string, but it can
    be set explicitly with the second parameter. An index can only be set for
    basic integer types, bools, time and strings. A field of slice type can
    also have an index (but not a unique index, and only one slice field per
    index), allowing fast lookup of any single value in the slice with
    Query.FilterIn. Indices are automatically (re)created when registering a
    type. Fields with a pointer type cannot have an index. String values used
    in an index cannot contain a \0.
  - "unique" or "unique <field1>+<field2>+<...> [<name>]", adds an index as with
    "index" and also enforces a unique constraint. For time.Time the timezone
    is ignored for the uniqueness check.
  - "ref <type>", enforces that the value exists as primary key for "type".
    Field types must match exactly, e.g. you cannot reference an int with an
    int64. An index is automatically created and maintained for fields with a
    foreign key, for efficiently checking that removed records in the
    referenced type are not in use. If the field has the zero value, the
    reference is not checked. If you require a valid reference, add "nonzero".
  - "default <value>", replaces a zero value with the specified value on record
    insert. Special value "now" is recognized for time.Time as the current
    time. Times are parsed as time.RFC3339 otherwise. Supported types: bool
    ("true"/"false"), integers, floats, strings. The value is not quoted, and
    no escaping of special characters, like the comma that separates struct tag
    words, is possible. Defaults are also replaced on fields in nested structs,
    slices and arrays, but not in maps.
  - "typename <name>", overrides the name of the type. The name of the Go type
    is used by default. Can only be present on the first field (primary key).
    Useful for doing schema updates.


# Schema updates

Before using a Go type, you must register it for use with the open database by
passing a (possibly zero) value of that type to the Open or Register functions.
For each type, a type definition is stored in the database. If a type has an
updated definition since the previous database open, a new type definition is
added to the database automatically and any required modifications are made and
checked: indices (re)created, fields added/removed, new
nonzero/unique/reference constraints validated.

As a special case, you can change field types between pointer and non-pointer
types, with one exception: changing from pointer to non-pointer where the type
has a field that must be nonzero is not allowed. The on-disk encoding will not
be changed: nil pointers will turn into zero values, and zero values into nil
pointers. Also see section Limitations about pointer types.

Because named embedded structs are not part of the type definition, you can
wrap/unwrap fields into an embedded/anonymous struct field. No new type
definition is created.

Some schema conversions are not allowed, in some cases due to architectural
limitations, in other cases because the constraint checks haven't been
implemented yet, or because the parsing code does not yet know how to parse the
old on-disk values into the updated Go types. If you need a conversion that is
not supported, you will need to write a manual conversion, and you would have
to keep track of whether the update has been executed.

Changes that are allowed:

  - From smaller to larger integer types (same signedness).
  - Removal of "noauto" on primary keys (always integer types). This updates the
    "next sequence" counter automatically to continue after the current maximum
    value.
  - Adding/removing/modifying an index, including a unique index. When a unique
    index is added, the current records are verified to be unique.
  - Adding/removing a reference. When a reference is added, the current records
    are verified to be valid references.
  - Adding/removing a nonzero constraint. Existing records are verified.

Conversions that are not currently allowed, but may be in the future:

  - Signedness of integer types. With a one-time check that old values fit in
    the new type, this could be allowed in the future.
  - Conversions between basic types: strings, []byte, integers, floats, boolean.
    Checks would have to be added for some of these conversions. For example,
    from string to integer: the on-disk string values would have to be valid
    integers.
  - Types of primary keys cannot be changed, also not from one integer type to a
    wider integer type of same signedness.


# BoltDB and storage

BoltDB is used as underlying storage. BoltDB stores key/values in a single
file, in multiple/nested buckets (namespaces) in a B+tree and provides ACID
transactions. Either a single write transaction or multiple read-only
transactions can be active at a time. Do not start a blocking read-only
transaction while holding a writable transaction or vice versa: this will cause
a deadlock.


BoltDB returns Go values that are memory mapped to the database file. This
means BoltDB/bstore database files cannot be transferred between machines with
different endianness. BoltDB uses explicit widths for its types, so files can
be transferred between 32bit and 64bit machines of same endianness. While
BoltDB returns read-only memory mapped byte slices, bstore only ever returns
parsed/copied regular writable Go values that require no special programmer
attention.


For each Go type opened for a database file, bstore ensures a BoltDB bucket
exists with two subbuckets:

  - "types", with type descriptions of the stored records. Each time the
    database file is opened with a modified Go type (added/removed/modified
    field/type/bstore struct tag), a new type description is automatically
    added, identified by sequence number.
  - "records", containing all data, with the type's primary key as BoltDB key,
    and the encoded remaining fields as value. The encoding starts with a
    reference to a type description.

For each index, another subbucket is created, its name starting with "index.".
The stored keys consist of the index fields followed by the primary key, and an
empty value.


# Limitations

Bstore has limitations, not all of which are architectural, so some may be
fixed in the future.

Bstore does not implement the equivalent of SQL joins, aggregates, and many
other concepts.

Filtering/comparing/sorting on pointer fields is not allowed. Pointer fields
cannot have a (unique) index. Use non-pointer values with the zero value as the
equivalent of a nil pointer.

The first field of a stored struct is always the primary key. Autoincrement is
only available for the primary key.

BoltDB opens the database file with a lock. Only one process can have the
database open at a time.

An index stored on disk in BoltDB can consume more disk space than other
database systems would: for each record, the indexed field(s) and primary key
are stored in full. Because bstore uses BoltDB as key/value store, and doesn't
manage disk pages itself, it cannot as efficiently pack an index page with many
records.

Interface values cannot be stored. This would require storing the type along
with the value. Instead, use a type that is an encoding.BinaryMarshaler.

Values of the builtin complex types (complex64, complex128) cannot be stored.
*/
package bstore