On Sat, Sep 17, 2011 at 09:53:31AM -0700, Alex Nelson wrote:
Basically, if any of the characters passed to a hivexml.c function
aren't isprint()-able, they're base64-encoded.
The characters passed in depend on the iconv library: whatever iconv()
produces in "UTF-8 or UTF-16" mode is what gets passed to hivexml. I checked
all the UTF-processing points and have extensive notes in the postscript, but
essentially:
* Key and value names aren't run through iconv
* Value string data and the hive header name are run through iconv
* The patch below applies stricter sanitization than iconv: it also runs
everything through isprint.
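As a rough illustration of the check described above (not the patch itself), a minimal sketch in C might look like the following. The function names all_printable and base64_encode are hypothetical, and the sketch assumes the whole string is base64-encoded as soon as any one byte fails isprint(); the actual patch may work at a different granularity.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if every byte in buf passes isprint(),
 * 0 otherwise.  The result depends on the current C locale.
 */
static int
all_printable (const char *buf, size_t len)
{
  size_t i;

  assert (buf != NULL || len == 0);
  for (i = 0; i < len; i++)
    if (!isprint ((unsigned char) buf[i]))
      return 0;
  return 1;
}

/* Hypothetical helper: base64-encode len bytes (standard alphabet with
 * '=' padding).  Returns a malloc'd, NUL-terminated string, or NULL if
 * allocation fails.  The caller frees the result.
 */
static char *
base64_encode (const unsigned char *in, size_t len)
{
  static const char tbl[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  size_t i, o = 0;
  char *out;

  assert (in != NULL || len == 0);
  out = malloc (4 * ((len + 2) / 3) + 1);
  if (out == NULL)
    return NULL;

  /* Full 3-byte groups become 4 output characters each. */
  for (i = 0; i + 2 < len; i += 3) {
    out[o++] = tbl[in[i] >> 2];
    out[o++] = tbl[((in[i] & 0x03) << 4) | (in[i+1] >> 4)];
    out[o++] = tbl[((in[i+1] & 0x0f) << 2) | (in[i+2] >> 6)];
    out[o++] = tbl[in[i+2] & 0x3f];
  }

  /* One or two leftover bytes are padded with '='. */
  if (i < len) {
    out[o++] = tbl[in[i] >> 2];
    if (i + 1 < len) {
      out[o++] = tbl[((in[i] & 0x03) << 4) | (in[i+1] >> 4)];
      out[o++] = tbl[(in[i+1] & 0x0f) << 2];
    } else {
      out[o++] = tbl[(in[i] & 0x03) << 4];
      out[o++] = '=';
    }
    out[o++] = '=';
  }
  out[o] = '\0';
  return out;
}
```

A caller would emit the string as-is when all_printable() returns 1 and emit base64_encode() output (with some marker that the value is encoded) otherwise.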
I agree with Simson that this gets tricky.
You might consider changing hivexml so it avoids APIs like
hivex_value_string, hivex_value_multiple_strings, etc. (or the
corresponding visitor callbacks). It should probably only use
hivex_value_value and do all character conversion itself.
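To make that concrete, here is a minimal sketch of doing the conversion yourself. It assumes the raw bytes and length have already been fetched (for example with hivex_value_value) and that the data is meant to be UTF-16LE, which real hives don't always honour. The name utf16le_to_utf8 is hypothetical; only the iconv calls are real API.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper: convert len bytes of (assumed) UTF-16LE data to
 * a malloc'd, NUL-terminated UTF-8 string.  Returns NULL if the input
 * is not valid UTF-16LE, is truncated, or on allocation failure, so the
 * caller can fall back to something else (eg. base64).
 */
static char *
utf16le_to_utf8 (const char *raw, size_t len)
{
  iconv_t ic;
  size_t outlen, outleft, inleft = len;
  char *out, *outp;
  char *inp = (char *) raw;     /* iconv's prototype wants char ** */

  assert (raw != NULL || len == 0);

  ic = iconv_open ("UTF-8", "UTF-16LE");
  if (ic == (iconv_t) -1)
    return NULL;

  /* A 2-byte UTF-16 unit needs at most 3 bytes in UTF-8, and a 4-byte
   * surrogate pair needs 4, so 3 bytes per unit is always enough.
   */
  outlen = (len / 2) * 3 + 1;
  out = malloc (outlen);
  if (out == NULL) {
    iconv_close (ic);
    return NULL;
  }

  outp = out;
  outleft = outlen - 1;         /* keep room for the trailing NUL */
  if (iconv (ic, &inp, &inleft, &outp, &outleft) == (size_t) -1 ||
      inleft != 0) {
    /* Invalid or incomplete input: report failure, don't guess. */
    free (out);
    iconv_close (ic);
    return NULL;
  }

  *outp = '\0';
  iconv_close (ic);
  return out;
}
```

Registry strings usually carry their own UTF-16 NUL terminator, which this sketch converts like any other character, so the UTF-8 result may simply end at that point when treated as a C string.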
There's also a big difference between what Windows is documented as
doing and what it actually does ...
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org