On Wed, Aug 10, 2011 at 01:24:05PM -0700, Alex Nelson wrote:
> On Aug 10, 2011, at 12:55 , Richard W.M. Jones wrote:
> > On Wed, Aug 10, 2011 at 12:07:23PM -0700, Alex Nelson wrote:
> >> The infrastructure for modified-time reporting has been essentially
> >> unused. These changes report the registry time by treating the
> >> time fields as Windows filetime fields stored in little-Endian
> >> (which means they can be treated as a single 64-bit little-Endian
> >> integer).
> >>
> >> This patch adds the node_mtime function to the visitor API.
> >
> > Nearly there. See my comments below.
> >
> >> @@ -93,6 +94,7 @@ struct hive_h {
> >> /* Fields from the header, extracted from little-endianness hell. */
> >> size_t rootoffs; /* Root key offset (always an nk-block). */
> >> size_t endpages; /* Offset of end of pages. */
> >> + char *last_modified; /* mtime of base block. */
> >
> > This field seems to be unused except in debugging messages? Unless
> > you're going to use it, I'd just omit this field and all the code that
> > generates and prints last_modified.
>
> There is a spot where it is printed: in lib/hivex.c:hivex__visit_node, "Report
> extra for hive's root node".

Hmmm so that callback is called twice for the root node? Stateful
callbacks are complicated for callers to handle.
This would be simpler if there were just two extra API functions,
something like:

  int64_t hivex_last_modified (hive_h *);
  int64_t hivex_node_timestamp (hive_h *, hive_node_h);

where the first returns the last_modified field for the whole hive and
the second returns the timestamp for a particular node.
You don't need to modify the hivex_visitor struct, because the
node_start callback could just call hivex_node_timestamp.

> > This function is an oddity. You're taking a useful thing (64 bit
> > Windows filetime) and converting it to a string. So callers are going
> > to need to parse this string if they want to do anything useful with
> > it. I think it's best just to return the 64 bit Windows filetime (but
> > with the correct endianness) and let callers deal with it. Passing
> > back strings from a C API just seems wrong.
>
> These changes are bringing the hivexml program into a file system
> analysis suite that deals with many different file system types,
> each with their own timestamp recording quirks, and even some file
> formats which have yet more quirks. We think that ISO 8601 is the
> best umbrella output format, with an additional XML attribute noting
> the time granularity (like FAT's 2-second and 1-day granularities).
> That's why we're outputting strings in C, which, yes, feels wrong,
> but simplifies parsing outside of the scope of hivexml. We're
> dealing with the time presentation proactively.

Well I'm sure it slightly simplifies your use case, but it doesn't
make things simpler for other callers of hivex.
There are four things that we might return:
(1) time_t: Native for Unix, but loses precision.
(2) A string: Hard to parse from C.
(3) time_t + fraction (eg. struct timespec). The generator can't
currently handle this type. In any case it's problematic because it
returns more precision than is really there.
(4) A Windows filetime: Preserves complete fidelity in both
directions.
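
To make the trade-off between options (1), (2) and (4) concrete, here is
a minimal sketch in C. It assumes only the well-known definition of a
Windows filetime (a count of 100-nanosecond ticks since 1601-01-01 UTC,
stored as 8 little-endian bytes). The helper names filetime_from_le,
filetime_to_time_t and filetime_to_iso8601 are hypothetical, made up for
this illustration, and are not part of hivex.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Seconds between the filetime epoch (1601-01-01) and the Unix epoch. */
#define FILETIME_UNIX_DIFF_SECS INT64_C(11644473600)
/* A filetime tick is 100 ns, so there are 10^7 ticks per second. */
#define FILETIME_TICKS_PER_SEC  INT64_C(10000000)

/* Hypothetical helper: assemble 8 little-endian bytes into a
 * host-order 64-bit value.  This is option (4): nothing is lost. */
static int64_t
filetime_from_le (const unsigned char buf[8])
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | buf[i];
  return (int64_t) v;
}

/* Hypothetical helper, option (1): convert to time_t.  The sub-second
 * part is discarded.  Assumes ft >= 0. */
static time_t
filetime_to_time_t (int64_t ft)
{
  return (time_t) (ft / FILETIME_TICKS_PER_SEC - FILETIME_UNIX_DIFF_SECS);
}

/* Hypothetical helper, option (2): format as an ISO 8601 UTC string
 * with 7 fractional digits (one per decimal place of a 100 ns tick).
 * Returns 0 on success, -1 on error.  Uses gmtime, which is standard C
 * but not thread-safe. */
static int
filetime_to_iso8601 (int64_t ft, char *out, size_t outlen)
{
  char date[32];
  time_t secs;
  struct tm *tm;
  long ticks;
  int n;

  if (ft < 0)
    return -1;

  secs = filetime_to_time_t (ft);
  ticks = (long) (ft % FILETIME_TICKS_PER_SEC);

  tm = gmtime (&secs);
  if (tm == NULL)
    return -1;
  if (strftime (date, sizeof date, "%Y-%m-%dT%H:%M:%S", tm) == 0)
    return -1;

  n = snprintf (out, outlen, "%s.%07ldZ", date, ticks);
  return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

With these definitions the filetime 116444736000000000 corresponds to
the Unix epoch, so filetime_to_time_t returns 0 for it and
filetime_to_iso8601 produces 1970-01-01T00:00:00.0000000Z. A caller
holding the raw 64-bit value can derive either of the other forms, while
the reverse conversions lose information or need parsing.
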
> > Add this function to the API (in generator/generator.ml, 'functions').
> > This will ensure that Perl, OCaml etc bindings + documentation are
> > generated automatically for the function.
>
> I hope that adding that function only entailed copy-pasting the list
> entry for node_name, assuming you accept the string return type. I
> have the next patch version formatted, will send after the above
> discussion's resolved.

For hivex_node_timestamp: Have a look at "node_name" and
"value_qword", and combine the two.

For hivex_last_modified: Combine "value_qword" and "root".

The generator takes most of the hard work out of it.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
New in Fedora 11: Fedora Windows cross-compiler. Compile Windows
programs, test, and build Windows installers. Over 70 libraries supprt'd
http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw