Our SQLite DB requires UTF-8 strings, but filenames coming from the file
system could use any encoding. Ideally we'd try to detect the right
encoding, but we already use `str.encode(errors="replace").decode()`
in other places to handle filenames that aren't valid UTF-8, so this is
a pragmatic solution that should work for all cases, even though we lose
information: all non-UTF-8 chars are converted to "?".
A future iteration could use an encoding detection mechanism.
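
A minimal sketch of the round-trip (the helper name `sanitize_filename` is ours, for illustration): filenames that weren't valid UTF-8 on disk arrive as `str` objects containing surrogate escapes (PEP 383), and encoding with `errors="replace"` turns each offending character into "?".

```python
def sanitize_filename(name: str) -> str:
    """Return a valid UTF-8 version of `name`.

    Undecodable bytes in the on-disk name show up as lone surrogates
    (PEP 383); errors="replace" maps each of them to "?".
    """
    return name.encode("utf-8", errors="replace").decode("utf-8")
```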
We recalculate the stored dir size for every dir to be the sum of the
sizes of all its descendants, so the dir size as reported by the file
system will usually differ from the stored one.
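
A sketch of the recalculation, assuming a plain `os.walk` traversal (the actual Metadex aggregation may work differently):

```python
import os

def dir_size(path: str) -> int:
    """Sum of the sizes of all files below `path`, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total
```

By contrast, `os.path.getsize()` on the directory itself returns the size of the directory entry, not of its contents, which is why the two numbers rarely match.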
We make sure all filenames are valid UTF-8 before committing them to
the database. For filenames using a different encoding we convert
offending chars to `?` (using `str.encode(errors="replace")`).
When matching files from the local filesystem against files from the
database we thus have to use the mangled filenames, not the original
ones.
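
For illustration (the helper and variable names here are hypothetical, not from the codebase), a lookup has to sanitize the local name before comparing:

```python
def find_in_db(db_names: set[str], local_name: str) -> bool:
    """Check whether a local filename has a row in the database.

    Names in the DB are already sanitized, so the local name must go
    through the same replacement before comparison.
    """
    mangled = local_name.encode("utf-8", errors="replace").decode("utf-8")
    return mangled in db_names
```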
This fundamentally changes the way we use size information for dirs in
Metadex.

The implementation is currently largely unoptimized and might therefore
slow down the application considerably in some circumstances.
This fixes an issue with printing Unicode characters that occupy more
than one terminal cell during scan operations; such characters would
cause line breaks or leave other garbage in the status output line.
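
One common way to handle this (a sketch of the technique, not necessarily the exact fix) is to measure the display width of the status line with `unicodedata.east_asian_width` and truncate it to the terminal width before printing:

```python
import shutil
import unicodedata

def cell_width(ch: str) -> int:
    """Terminal cells occupied by `ch`: wide/fullwidth chars take two.

    Combining marks and other zero-width cases are ignored here for
    simplicity.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def truncate_to_cells(text: str, max_cells: int) -> str:
    """Cut `text` so it fits into at most `max_cells` terminal cells."""
    cells, out = 0, []
    for ch in text:
        w = cell_width(ch)
        if cells + w > max_cells:
            break
        out.append(ch)
        cells += w
    return "".join(out)

status = "scanning 日本語のファイル名.txt"
columns = shutil.get_terminal_size().columns
print("\r" + truncate_to_cells(status, columns - 1), end="", flush=True)
```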