Discussion:
mktime problems when adjusting tm_isdst
Paul Eggert
1995-10-10 04:10:58 UTC
With TZ=Europe/Paris, when I invoke mktime on the equivalent of
`1996-01-01 00:00:00 tm_isdst=1', it returns the equivalent of
`1995-12-31 23:09:21 tm_isdst=0'. Surely the result should be
`1995-12-31 23:00:00 tm_isdst=0'.

(This is tzcode95d compiled on Solaris 2.4.)

The problem seems to be that mktime discovers that the requested time
exists only in standard time, so it corrects it by subtracting the
first known GMT offset using DST and adding the first known GMT offset
not using DST. But Europe/Paris currently uses different DST rules
(and a different GMT offset) than it did in 1916 and 1911, respectively.

The obvious workaround for this case is to change time1() so that it
uses the nearest DST and non-DST transitions just before the requested
time, instead of the first possible DST and non-DST transitions. But
this breaks down in other cases. For example, in Europe/Paris at
1940-06-15 00:00:00, the GMT offset is 2 hours, tm_isdst=1 and there's
a 1-hour DST offset, but at the most recent non-DST period the GMT
offset is 0 hours, so mktime will wrongly adjust the time by 2 hours.

There's a deeper problem here. How should mktime adjust for DST in a
locale that has multiple DST offsets? For example, suppose
TZ=America/St_Johns and I invoke mktime on the equivalent of
`1989-01-01 00:00:00 tm_isdst=1'. Newfoundland used a 2-hour DST
offset in summer 1988 and a 1-hour DST offset in summer 1989, so
should the requested time be adjusted by 1 hour or by 2?

One possible way to address the deeper problem, suggested by Mark
Brader, is to represent different DST offsets by using different
tm_isdst values. For example, instead of simply being 1 when some
sort of DST is in effect, tm_isdst could be the DST offset, in seconds
modulo INT_MAX+1. That way, mktime can tell which DST offset was used
to derive its input time, and it can adjust the time correctly when it
finds that a different DST offset is needed.

I think this change would make it easy to fix the Europe/Paris bug
mentioned above. But it's a nontrivial change to the tz code and I
worry that it might break some applications that assume that
tm_isdst==1 means ``use normal DST''.


Here's a test program to reproduce the bug described above. Run it
with TZ set to Europe/Paris as built from tzdata95i.tar.gz, and give
it the arguments `1996-01-01 00:00:00 1'.

#include <stdio.h>
#include <time.h>

#define TM_YEAR_BASE 1900

/* Print every field of *TP that mktime sets or normalizes.  */
static void
print_tm (const struct tm *tp)
{
  printf ("%04d-%02d-%02d %02d:%02d:%02d yday %03d wday %d isdst %d",
          tp->tm_year + TM_YEAR_BASE, tp->tm_mon + 1, tp->tm_mday,
          tp->tm_hour, tp->tm_min, tp->tm_sec,
          tp->tm_yday, tp->tm_wday, tp->tm_isdst);
}

int
main (int argc, char **argv)
{
  time_t t;
  struct tm tm;
  char trailer;

  /* The trailing %c makes sscanf reject junk after the last number.  */
  if (argc == 4
      && (sscanf (argv[1], "%d-%d-%d%c",
                  &tm.tm_year, &tm.tm_mon, &tm.tm_mday, &trailer)
          == 3)
      && (sscanf (argv[2], "%d:%d:%d%c",
                  &tm.tm_hour, &tm.tm_min, &tm.tm_sec, &trailer)
          == 3)
      && (sscanf (argv[3], "%d%c", &tm.tm_isdst, &trailer) == 1))
    {
      /* Convert calendar year and month to struct tm conventions.  */
      tm.tm_year -= TM_YEAR_BASE;
      tm.tm_mon--;
      t = mktime (&tm);
      printf ("mktime returns %ld == ", (long) t);
      print_tm (&tm);
      printf ("\n");
    }
  else
    printf ("Usage:\t%s YYYY-MM-DD HH:MM:SS ISDST # Test given time.\n",
            argv[0]);

  return 0;
}
Chris Torek
1995-10-10 07:12:37 UTC
Post by Paul Eggert
... and I worry that it might break some applications that assume
that tm_isdst==1 means ``use normal DST''.
I think there may also be applications that use tm_isdst as a
subscript into a two-element array.

Breaking these could be considered doing them a favor, :-)

Chris
unknown
1995-10-10 14:34:55 UTC
We could run through the transition times from future to past (rather than from
past to future as is done now); this doesn't solve the deep problem, but does
get the "TZ=Europe/Paris 1996-01-01 00:00:00 tm_isdst=1" case right. It does
have the advantage of being a fairly minimal change to the code.

--ado

***************
*** 1458,1467 ****
          if (sp == NULL)
                  return WRONG;
  #endif /* defined ALL_STATE */
!         for (samei = 0; samei < sp->typecnt; ++samei) {
                  if (sp->ttis[samei].tt_isdst != tmp->tm_isdst)
                          continue;
!                 for (otheri = 0; otheri < sp->typecnt; ++otheri) {
                          if (sp->ttis[otheri].tt_isdst == tmp->tm_isdst)
                                  continue;
                          tmp->tm_sec += sp->ttis[otheri].tt_gmtoff -
--- 1458,1467 ----
          if (sp == NULL)
                  return WRONG;
  #endif /* defined ALL_STATE */
!         for (samei = sp->typecnt - 1; samei >= 0; --samei) {
                  if (sp->ttis[samei].tt_isdst != tmp->tm_isdst)
                          continue;
!                 for (otheri = sp->typecnt - 1; otheri >= 0; --otheri) {
                          if (sp->ttis[otheri].tt_isdst == tmp->tm_isdst)
                                  continue;
                          tmp->tm_sec += sp->ttis[otheri].tt_gmtoff -
Bradley White
1995-10-10 15:57:01 UTC
I guess I would benefit from a succinct explanation of what
mktime(`TZ=Europe/Paris 1996-01-01 00:00:00 tm_isdst=1') is
supposed to mean. Paul's assertion ...
Post by Paul Eggert
Surely the result should be
`1995-12-31 23:00:00 tm_isdst=0'.
..., while perhaps true, doesn't seem patently obvious.

If I took `TZ=Europe/Paris 1995-06-01 00:00:00 tm_isdst=1',
for example, and added six months, I think I would prefer to
see a `1996-01-01 00:00:00 tm_isdst=0' mktime() result.

In any case, a clear definition will tell us what to do with
the code.

Bradley
Robert Elz
1995-10-10 16:33:02 UTC
I doubt that any patch can reasonably help here: the tm_isdst
flag as input to mktime() is simply absurdly badly broken as
defined. In anything other than the simplest cases, there is
no rational way to make it work as defined. It would probably
be better for mktime() either to simply ignore it (not great),
or to return an error ((time_t)-1) or ((time_t)0) if tm_isdst is
set to the "wrong" value.

The only reason this brain damage exists at all is because of
the silly desire to make arithmetic on times be done by doing
arithmetic on the fields of a struct tm. OK, it is not a good
idea to assume that a time_t is always an arithmetic type, but
a bunch of macro/function definitions for addition/subtraction
of time_t's from each other, and integers to/from time_t's would
have been a much more rational choice, and much easier to use.

It's all great that one should attempt to support the standardised
interfaces, however brain dead they actually are, but having
attempted that, and discovered that there is simply no way to
implement it correctly, there comes a time when the right answer
is simply to abandon the attempt, and not continue to pretend
that it can be done, and rely on people not finding the hard
cases where it can't.

The case where tm_isdst == 1, and summer time is not actually in
effect, is basically asking mktime() to invent policy. The
request to it is "If daylight saving had been in effect at
this time I am telling you, the time would have been this. So,
what is it really?" This is meaningless: it requires knowledge
as to what the nonexistent daylight time shift would have been,
had it existed. One can invent any offset at all for that; for
all anyone knows for sure, the answer Paul originally sent for
the conversion that "went wrong" might be perfectly correct.

Note however that the other way (tm_isdst == 0) can be handled,
as that is saying "this time is standard time for the zone,
correct it for any DST that might have been in effect", and
as long as the time is one in the past, or very near future,
it is possible (times in the far future, where far is anything
beyond about 3 months, are always going to be impossible).

If any patch is needed, simply make it be an error - or if you
want to allow a little more latitude, you may be willing to
assume that when the nearest DST before the time involved, and
the nearest after the time involved are the same, then it is
acceptable to assume that the same offset is what would have
applied in the gap between, had there been DST then, and only
return the error in cases where the offset was not the same in
the two cases.

kre
Paul Eggert
1995-10-10 17:22:55 UTC
Date: Tue, 10 Oct 1995 11:57:01 -0400 (EDT)
From: Bradley White <bww at fore.com>

I would benefit from a succinct explanation of what
mktime(`TZ=Europe/Paris 1996-01-01 00:00:00 tm_isdst=1') is
supposed to mean.

The C Standard is vague about this. mktime is supposed to adjust
out-of-range values, but the standard doesn't say how it should adjust
things when tm_isdst is out of range.

If I took `TZ=Europe/Paris 1995-06-01 00:00:00 tm_isdst=1',
for example, and added six months, I think I would prefer to
see a `1996-01-01 00:00:00 tm_isdst=0' mktime() result.

At least one implementation agrees with you (the GNU C library mktime),
but the tz tradition is to assume that when you say ``six months'' you
mean ``six months, except that if the DST offset changes during those
six months, subtract that change''.

In some cases (``2 hours from now'') the tz method is more natural,
but in others (``2 days from now'') the ignore-bogus-tm_isdst method
is more natural. The ignore-bogus-tm_isdst method deals more
consistently with multiple DST offsets and with changes to the base
UTC offset, but there is more implementation experience with the tz
method.

For (too much) more on this controversy, please see the `mktime murk'
thread in the comp.std.c newsgroup,
e.g. <news:1995Oct5.054100.3591 at sq.com>, <news:DFzFA4.Io0 at root.co.uk>,
<news:45c0ee$3n9 at light.twinsun.com>. Leap seconds come into play too,
of course!